Design and Implementation of System Which Efficiently Retrieve Useful Data for Detection of Dementia Disease
Abstract This work analyzes Hadoop techniques such as MapReduce, which help process data faster and more efficiently in order to detect dementia. For a given voluminous dementia dataset, the current solution uses data partitioning strategies that incur large communication costs and an expensive mining process, because duplicate and unnecessary transactions are transferred among computing nodes. To resolve these issues, the proposed algorithm uses data partitioning techniques such as Min-Hash and Locality-Sensitive Hashing, which reduce processing time and improve the efficiency of the final result. We take advantage of the MapReduce programming model of Hadoop [3] and implement the technique on a Hadoop platform. For pattern matching we use the FP-growth algorithm. Finally, we show that the proposed system requires less time to find frequent itemsets. The idea behind this research is to cope with the special requirements of the health domain related to patients.

Keywords MapReduce · Moving K-means algorithm · FP-growth algorithm
1 Introduction

Classical parallel mining algorithms concentrated on the uniform distribution of data across computing nodes: data were uniformly partitioned and assigned to clusters of computing nodes [2]. The resulting redundancy increases the cost of network traffic and data shuffling, which in turn reduces the effectiveness of data partitioning.

S. Waghere (&) · P. RajaRajeswari · V. Ganesan
Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, A.P., India
e-mail: [email protected]
P. RajaRajeswari
e-mail: [email protected]
V. Ganesan
e-mail: [email protected]

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
A. Kumar and S. Mozar (eds.), ICCCE 2020, Lecture Notes in Electrical Engineering 698, https://doi.org/10.1007/978-981-15-7961-5_144
Improper data partitioning decisions not only increase network and computing overhead but also create load-balancing problems. The key idea of the proposed technique is to place highly correlated transactions in a single partition, so that the number of irrelevant transactions shipped between nodes is gradually reduced. The voluminous dataset is partitioned and distributed across the data nodes of a Hadoop cluster in such a way that the network and computing loads caused by duplicating transactions on remote nodes are reduced, which speeds up the mining process on the cluster. Using this approach, we can classify the dementia dataset into severity-rating-based clusters. This shows which symptoms, such as age, stress, and depression, are responsible for the severity of dementia, and groups the symptoms that are highly related to each severity rating, as sketched below.
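To illustrate the partitioning idea, the following Python sketch groups transactions (here, sets of symptom labels) whose MinHash signatures collide in at least one Locality-Sensitive Hashing band, so that highly correlated transactions tend to land in the same partition. This is a minimal sketch of the general Min-Hash/LSH technique, not the authors' implementation; the salted-MD5 hash construction, the parameter values (num_hashes, bands), and the toy symptom transactions are illustrative assumptions.

    import hashlib
    from collections import defaultdict

    def minhash_signature(transaction, num_hashes=64):
        """Return a MinHash signature for a set of items; similar sets
        yield signatures that agree in many positions."""
        return [
            min(int(hashlib.md5(f"{i}:{item}".encode()).hexdigest(), 16)
                for item in transaction)
            for i in range(num_hashes)  # hash function i is salted with its index
        ]

    def lsh_partitions(transactions, num_hashes=64, bands=16):
        """Bucket transactions whose signatures collide in at least one
        LSH band; each bucket is a candidate partition of highly
        correlated transactions."""
        rows = num_hashes // bands
        buckets = defaultdict(list)
        for tid, txn in enumerate(transactions):
            sig = minhash_signature(txn, num_hashes)
            for b in range(bands):
                band = tuple(sig[b * rows:(b + 1) * rows])
                buckets[(b, band)].append(tid)
        return buckets

    # Toy transactions of hypothetical dementia symptom labels.
    txns = [
        {"age>65", "stress", "memory_loss"},
        {"age>65", "stress", "memory_loss", "depression"},
        {"diabetes", "hypertension"},
    ]
    for key, tids in lsh_partitions(txns).items():
        if len(tids) > 1:
            print(key[0], tids)  # the two correlated transactions very likely co-occur

Within each resulting partition, frequent symptom patterns can then be mined locally (for example, with FP-growth) under the MapReduce model, with far fewer duplicate transactions crossing node boundaries.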
2 Motivations

The following observations give motivation and direction to the research. Worldwide, approximately 5 crore (50 million) individuals have dementia, and there are nearly 2 crore (20 million) new cases each year. Alzheimer's disease is the most common form of dementia.