Effective and efficient classification of gastrointestinal lesions: combining data preprocessing, feature weighting, and
ORIGINAL RESEARCH
Effective and efficient classification of gastrointestinal lesions: combining data preprocessing, feature weighting, and improved ant lion optimization
Dalwinder Singh1 · Birmohan Singh1
Received: 9 January 2019 / Accepted: 24 October 2020
© Springer-Verlag GmbH Germany, part of Springer Nature 2020
Abstract
This paper presents an approach that combines data preprocessing, feature weighting, and an improved antlion optimization algorithm for effective and efficient classification of gastrointestinal lesions. A high-dimensional gastrointestinal lesion dataset, consisting of texture, color, and shape features extracted from colonoscopy videos, is obtained from the UCI repository. The data have imperfections such as zero-valued features, outliers, and dominant features, so they are preprocessed to address these problems. Feature weighting is then used to boost classification performance by assigning weights to the features according to their relevance to classification. The improved antlion optimization algorithm searches for the feature weights and the parameters of the Support Vector Machines simultaneously. Experiments are performed with different combinations of features and endoscopic images to analyse the performance. The outcomes show that the combination of texture and color features from NBI images is the best. Accuracies of 97.37% and 98.68% are attained for the multi-class and binary classification problems, respectively, using only ∼31% of the features. Moreover, feature reduction lowers the runtime of the classifier by approximately 60%. In conclusion, the presented approach to colorectal lesion classification competes with well-experienced colonoscopists and outperforms the existing methods.
Keywords Colorectal cancer · Data preprocessing · High dimensional data · Medical data classification · Support vector machines
Electronic supplementary material The online version of this article (https://doi.org/10.1007/s12652-020-02629-0) contains supplementary material, which is available to authorized users.
* Birmohan Singh [email protected]
Dalwinder Singh [email protected]
1 Sant Longowal Institute of Engineering and Technology, Longowal, Punjab, India

1 Introduction
Colorectal cancer is the third leading cause of death worldwide after lung and liver cancer (Organization 2018). It is a form of cancer that develops in the colon (large intestine) or rectum. The risk factors for developing this cancer include age, inherited genetic risk, personal history of adenomatous polyps and inflammatory bowel disease (IBD), family history of colorectal cancer or adenomatous polyps, smoking, heavy alcohol consumption, and obesity (Haggar and Boushey 2009). Colorectal adenocarcinoma occurs mostly in persons with age >50 years whereas its chances are unlikely for the persons with age […]

\[
\frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{\dots} \dots \tag{3}
\]
where C is the penalty parameter of the error term. Further, the kernel function is represented as follows: K(𝐱i ,
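
The role of the penalty parameter C can be illustrated with a minimal sketch. It assumes the standard linear soft-margin form, in which the error term is the sum of hinge slacks max(0, 1 − y(w·x + b)); the surviving text does not show the full objective or the kernel, so the function name `soft_margin_objective`, the toy samples, and the linear decision function are all illustrative assumptions rather than the authors' formulation.

```python
# Sketch of a soft-margin SVM objective (assumed standard form, linear case):
#   0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * (w . x_i + b))
# All names and data below are hypothetical and for illustration only.

def soft_margin_objective(w, b, C, samples, labels):
    """Return the margin term plus C times the total hinge slack."""
    margin_term = 0.5 * sum(wj * wj for wj in w)
    slack_total = 0.0
    for x, y in zip(samples, labels):
        score = sum(wj * xj for wj, xj in zip(w, x)) + b
        # Slack is zero for points on or beyond the margin.
        slack_total += max(0.0, 1.0 - y * score)
    return margin_term + C * slack_total


# Toy data: two points outside the margin, one lying on the decision boundary.
samples = [(2.0, 0.0), (0.0, 2.0), (0.5, 0.5)]
labels = [1, -1, 1]
w, b = (1.0, -1.0), 0.0

low = soft_margin_objective(w, b, 0.1, samples, labels)
high = soft_margin_objective(w, b, 10.0, samples, labels)
print(low, high)
```

With the same w and b, a larger C makes the single margin violation dominate the objective, which is why C trades margin width against training error.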