Classifying univariate uncertain data
Ying-Ho Liu 1 & Huei-Yu Fan 1
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract In the literature, univariate uncertain data has a quantitative interval for each attribute in each transaction, accompanied by a probability density function that describes how likely each value in the interval is to occur. To the best of our knowledge, the classification of univariate uncertain data has seldom been addressed. Here, we propose the AssoU2Classifier algorithm to address this research gap. The AssoU2Classifier algorithm retrieves association rules from the univariate uncertain data to serve as a classification model. In addition, the U2Pruning procedure is developed to prune the association rules. The U2Pruning procedure not only reduces the number of association rules, which considerably accelerates the classification process, but also achieves high classification accuracy. In the experiments, the AssoU2Classifier algorithm was compared with 14 existing algorithms on 12 modified UCI datasets. The AssoU2Classifier algorithm obtained better classification accuracy than the compared algorithms on most of the datasets. Statistical tests (the Friedman test and the pairwise Wilcoxon test) also supported the advantage of the AssoU2Classifier algorithm. In addition, the learning time of the AssoU2Classifier algorithm was about average among the compared algorithms.
Keywords Univariate uncertain data · Classification · Associative classification
* Ying-Ho Liu
Huei-Yu Fan
1 Department of Information Management, National Dong Hwa University, No. 1, Sec. 2, Da Hsueh Road, Hualien 97401, Taiwan, Republic of China

1 Introduction Owing to advances in technology, devices that record data automatically (e.g., street cameras and gas sensors) and database techniques have become far more mature than those used decades ago. Huge amounts of data are stored in many kinds of data repositories and can be easily accessed. Because of the limitations of recording devices, entry errors, and other reasons, the collected data may be uncertain. Uncertain data contains one or more attributes whose values are not precise numbers; instead, an attribute value may be a number with a probability of occurrence or a quantitative interval. A special type of uncertain data, univariate uncertain data [1], has a quantitative interval for each attribute in each transaction, accompanied by a probability density function that describes how likely each value in the interval is to occur. A transaction involving univariate uncertain data is hereinafter referred to as a univariate uncertain transaction. Univariate uncertain data exists in many real-world applications, e.g., medical data, market data, and environmental monitoring data. For instance, a weather instrument records temperature, humidity, air pressure, etc. The weather instrument may not be able to record a precise reading fo
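
To make the definition of a univariate uncertain transaction concrete, the following is a minimal Python sketch, not the authors' implementation. The class name `UncertainValue`, the helper `uniform_pdf`, the numeric intervals, and the choice of a uniform density are all illustrative assumptions; only the idea of an interval plus a probability density function per attribute, and the temperature and humidity attributes, come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class UncertainValue:
    """One attribute value: an interval [low, high] plus a density over it.

    Hypothetical structure for illustration only.
    """
    low: float
    high: float
    pdf: Callable[[float], float]

    def prob(self, a: float, b: float, steps: int = 1000) -> float:
        """Approximate P(a <= X <= b) with the midpoint rule.

        The query range is clipped to [low, high] first.
        """
        a, b = max(a, self.low), min(b, self.high)
        if a >= b:
            return 0.0
        h = (b - a) / steps
        return sum(self.pdf(a + (i + 0.5) * h) for i in range(steps)) * h


def uniform_pdf(low: float, high: float) -> Callable[[float], float]:
    """Uniform density on [low, high] (an assumed, simplest-case pdf)."""
    return lambda x: 1.0 / (high - low) if low <= x <= high else 0.0


# A univariate uncertain transaction: every attribute is an UncertainValue.
# The numbers below are made up for the example.
transaction: Dict[str, UncertainValue] = {
    "temperature": UncertainValue(20.0, 24.0, uniform_pdf(20.0, 24.0)),
    "humidity": UncertainValue(55.0, 65.0, uniform_pdf(55.0, 65.0)),
}

# Probability that the true temperature lies between 21 and 23.
p_temp = transaction["temperature"].prob(21.0, 23.0)
print(round(p_temp, 6))
```

With a uniform density on [20, 24], half of the probability mass falls in [21, 23], so the printed value is approximately 0.5. Any other density (for example, a truncated Gaussian) could be passed as `pdf` without changing the rest of the sketch.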