Mapping flood susceptibility in an arid region of southern Iraq using ensemble machine learning classifiers: a comparati



ORIGINAL PAPER

Mapping flood susceptibility in an arid region of southern Iraq using ensemble machine learning classifiers: a comparative study

Alaa M. Al-Abadi 1

Received: 11 October 2017 / Accepted: 1 May 2018
© Saudi Society for Geosciences 2018

Abstract

This study examined the efficacy of three ensemble machine learning classifiers, namely random forest, rotation forest and AdaBoost, in assessing flood susceptibility in an arid region of southern Iraq. A dataset was created from flooded and non-flooded areas to train and validate the ensemble classifiers using a binary classification scheme (1 = flood, 0 = non-flood). The prepared dataset was then partitioned into two sets with a 70/30 ratio: 70% (2478 pixels) for training and 30% (1062 pixels) for testing. A total of 10 influential flood factors were selected and prepared based on data availability and a literature review. The selected factors were surface elevation, slope, plan curvature, topographic wetness index, stream power index, distance to rivers, drainage density, lithology, soil and land use/land cover. The information gain ratio was first used to explore the predictive abilities of the factors. The predictive performances of the three ensemble models were compared using six statistical measures: sensitivity, specificity, accuracy, kappa, root mean square error and area under the receiver operating characteristic (ROC) curve. The results revealed that the AdaBoost classifier performed best on these measures, followed by the random forest and rotation forest models. A flood susceptibility map was prepared from the result of each classifier and classified into five zones: very low, low, moderate, high and very high. For the best-performing model, AdaBoost, these zones covered 6002 km² (44%) for the very low–low zones, 2477 km² (18%) for the moderate zone and 5048 km² (40%) for the high–very high zones. This study demonstrated the strong capability of ensemble machine learning classifiers to delineate flood susceptibility zones in an arid region.

Keywords Information gain ratio · Maysan · ROC · Flood · Binary classifiers
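The six comparison measures named in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function name `evaluate`, the 0.5 decision threshold, the toy labels and scores, and the choice to compute the root mean square error on predicted flood probabilities are all assumptions made for illustration. The sketch also assumes that both classes (1 = flood, 0 = non-flood) are present in the test set.

```python
from math import sqrt


def evaluate(y_true, scores, threshold=0.5):
    """Compute six binary-classification measures.

    y_true : list of 0/1 labels (1 = flood, 0 = non-flood)
    scores : list of predicted flood probabilities in [0, 1]
    threshold : assumed cut-off for turning a probability into a class
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))

    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = len(y_true)

    sensitivity = tp / (tp + fn)  # share of flood pixels detected
    specificity = tn / (tn + fp)  # share of non-flood pixels detected
    accuracy = (tp + tn) / n

    # Cohen's kappa: agreement beyond what chance alone would give.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)

    # RMSE between predicted probabilities and 0/1 labels (assumed convention).
    rmse = sqrt(sum((s - t) ** 2 for s, t in zip(scores, y_true)) / n)

    # AUC as the probability that a random flood pixel outscores a random
    # non-flood pixel (ties count one half).
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    auc = wins / (len(pos) * len(neg))

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
        "rmse": rmse,
        "auc": auc,
    }


# Hypothetical toy test set: four flood and four non-flood pixels.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
probabilities = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
print(evaluate(labels, probabilities))
```

The AUC here uses the pairwise-ranking formulation, which is equivalent to the area under the ROC curve and does not depend on the chosen threshold, whereas the other four classification measures do.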

* Alaa M. Al-Abadi

1 Department of Geology, College of Science, University of Basrah, Basrah, Iraq

Introduction

Floods are among the most destructive threats on Earth (Ohl and Tapsell 2000). Each year, floods affect millions of people around the world, causing deaths, destroying property and infrastructure and sometimes carrying fertile soil away from farmland. Over the last few decades, floods have killed thousands and affected 1.4 billion people around the world (Jonkman 2005). According to Hirabayashi and Kanae (2009), floods are regarded as the greatest threats to human development and are estimated to affect approximately 20 to 300 million people every year. Floods in arid regions can develop within a short period (hours or less), which makes them dangerous to human life and well-being (Ghoneim and Foody 2013). It is accepted that the severity and frequency of flo