Automated digital data acquisition for landslide inventories
Thomas M. Kreuzer · Bodo Damm
Abstract Landslide research relies on landslide inventories for a multitude of spatial, temporal, or process analyses. Generally, populating a landslide inventory with relevant data takes considerable effort. In this context, the present work investigated an effective way to handle vast amounts of automatically acquired digital data for landslide inventories using machine learning algorithms and information filtering. Between July 2017 and February 2019, a keyword alert system provided 4381 documents that were automatically processed to detect landslide events in Germany. Of these documents, 91% were automatically recognized as irrelevant or as duplicates, which reduced the data volume substantially so that it contained only actual landslide documents. Moreover, it was shown that including the documents' images in the automated process chain for information filtering is recommended, since they contained important information that was otherwise unobtainable. Compared with manual methods, the automated process chain eliminated personal idiosyncrasies and human error and replaced them with a quantifiable machine error. The individual algorithms applied for natural language processing, information retrieval, and classification have been tried and tested in their respective fields. Furthermore, the proposed method is not restricted to a specific language or region. Any language to which these algorithms are applicable can be used with the proposed method, and the training of the process chain can take any geographical restriction into account. Thus, the present work introduced a method with a quantifiable error to automatically classify and filter large amounts of data during automated digital data acquisition for landslide inventories.

Keywords Landslide inventory · Data acquisition · Machine learning · Document classification · Information filtering

Introduction
Landslide research chiefly relies on landslide inventories (here synonymous with databases) for a multitude of spatial, temporal, or process analyses (Van Den Eeckhaut and Hervás 2012; Klose et al. 2015). Generally, populating a landslide inventory with relevant data takes considerable effort. Therefore, researchers have applied different strategies that, following a classification similar to that of Guzzetti et al. (2012), can be differentiated into two main categories: first, data derived from morphological examination by fieldwork, remote sensing products, or cartographic analysis; and second, data derived from textual sources and their accompanying images (usually ground images), which are henceforth always considered when present. Such textual sources can be acquired from scientific publications, reports of various agencies (e.g., civil protection, police, building authorities, road construction offices), newspaper articles, and unpublished documents (e.g., church records or historical archives). Overall, data acquisition from textual sources is an effective method (Wohlers et al. 20