The effect of data quality on model performance with application to daily evaporation estimation
ORIGINAL PAPER
Ming-Chang Wu • Gwo-Fong Lin • Hsuan-Yu Lin
Published online: 2 March 2013
© Springer-Verlag Berlin Heidelberg 2013
Abstract Model performance is usually influenced by the quality of the data used in model construction. If model performance is less affected by data quality, the resulting estimates will be more reliable. In this paper, the variation in model performance due to differences in data quality is explored in a field-scale application. Two models, the proposed support vector machine (SVM) based model and the Stephen and Stewart (SS) model, are employed for daily estimation of evaporation at an experiment station. Five scenarios corresponding to different data qualities are designed to evaluate the effect of data quality on model performance. Additionally, the meteorological variables that most strongly influence evaporation are identified by a systematic input determination process and are used as inputs to the SVM-based model. The results show that model performance decreases as data quality decreases (i.e., as the percentage of missing data increases). However, the estimation accuracy of the SVM-based model remains better than that of the SS model. Moreover, the variation in performance of the SVM-based model is smaller than that of the SS model. That is, the negative impact of poorer data quality is effectively reduced by using the SVM-based model instead of the SS model.
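As an illustration of the scenario design summarized in the abstract, the following minimal sketch shows one way to degrade a daily series to a chosen percentage of missing values. It is not the authors' procedure: the function name `make_scenario`, the five fractions, and the synthetic evaporation series are all hypothetical, since the abstract does not state the actual percentages or how the missing data were selected.

```python
import numpy as np


def make_scenario(values, missing_fraction, seed=0):
    """Return a float copy of `values` with a given fraction of entries set to NaN.

    Hypothetical helper for illustration; random removal is an assumption.
    """
    if not 0.0 <= missing_fraction < 1.0:
        raise ValueError("missing_fraction must be in [0, 1)")
    rng = np.random.default_rng(seed)
    degraded = np.asarray(values, dtype=float).copy()
    n_missing = int(round(missing_fraction * degraded.size))
    # Sample distinct positions so the number of missing entries is exact.
    idx = rng.choice(degraded.size, size=n_missing, replace=False)
    degraded.flat[idx] = np.nan
    return degraded


# Assumed missing-data fractions (five scenarios); not taken from the paper.
HYPOTHETICAL_FRACTIONS = [0.0, 0.1, 0.2, 0.3, 0.4]

# Synthetic placeholder for one year of daily pan evaporation (mm/day).
daily_evaporation = np.linspace(1.0, 8.0, 365)

scenarios = {
    fraction: make_scenario(daily_evaporation, fraction)
    for fraction in HYPOTHETICAL_FRACTIONS
}

for fraction, series in scenarios.items():
    share = np.isnan(series).mean()
    print(f"requested {fraction:.0%} missing, obtained {share:.1%}")
```

Each degraded series could then be used to construct a model, with the change in an error measure across scenarios indicating how sensitive that model is to data quality.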
M.-C. Wu · G.-F. Lin (corresponding author) · H.-Y. Lin
Department of Civil Engineering, National Taiwan University, Taipei 10617, Taiwan
e-mail: [email protected]
M.-C. Wu
Taiwan Typhoon and Flood Research Institute, National Applied Research Laboratories, Taipei 10093, Taiwan
Keywords Data quality · Missing data · Optimal input combination · Daily evaporation · Meteorological data · Support vector machine
1 Introduction
Evaporation is the process by which liquid water is converted into water vapor by heat and transferred from the evaporating surface to the atmosphere. Evaporation estimates are an important reference for water resources management and planning. In practice, daily evaporation can be observed using a Class A pan, the most widely used instrument. However, observed evaporation data are sometimes unavailable: daily records may be lost due to measurement or recording failure. Traditionally, these missing data are estimated from available meteorological data. Numerous meteorological variables affect the process of evaporation, such as air temperature, solar radiation (SR), humidity, rainfall and wind speed (Allen et al. 1998). In general, evaporation increases with increasing temperature and wind speed but decreases with increasing humidity. Nevertheless, the meteorological variables interact with each other. For example, humidity decreases with increasing wind speed, and temperature is a