NCMHap: a novel method for haplotype reconstruction based on Neutrosophic c-means clustering
METHODOLOGY ARTICLE
Open Access
Fatemeh Zamani1, Mohammad Hossein Olyaee2 and Alireza Khanteymoori1*
*Correspondence: [email protected]
1 Department of Computer Engineering, University of Zanjan, Zanjan, Iran
Full list of author information is available at the end of the article
Abstract
Background: The single individual haplotype problem refers to reconstructing the haplotypes of an individual from a set of input fragments sequenced from a given chromosome. Solving this problem is an important task in computational biology, with applications in the pharmaceutical industry, clinical decision-making, and the study of genetic diseases. The problem is known to be NP-hard. Although several methods have been proposed, most of them perform poorly on noisy input fragments. Developing a method that is both accurate and scalable therefore remains challenging.
Results: In this paper, we introduce a method named NCMHap, which uses the Neutrosophic c-means (NCM) clustering algorithm. NCM can effectively detect noise and outliers in the input data and reduce their effect on the clustering process. The proposed method has been evaluated on several benchmark datasets. Comparison with existing methods indicates that, when NCM is tuned with suitable parameters, the results are encouraging. In particular, as the amount of noise increases, the proposed method outperforms the compared methods.
Conclusion: The proposed method is validated on simulated and real datasets. The results support applying NCMHap to datasets whose fragments contain a large number of gaps and a high level of noise.
Keywords: Bioinformatics, Haplotype assembly, Minimum error correction, Neutrosophic c-means clustering
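To make the problem statement concrete, the following is a minimal sketch of the minimum error correction (MEC) criterion named in the keywords. It is not the NCMHap algorithm. The binary allele encoding, the use of `-` for gaps, and the names `hamming` and `mec_score` are assumptions made for illustration only, and the fragments are toy data rather than data from the article.

```python
def hamming(fragment, haplotype):
    """Count mismatches at the SNP sites the fragment covers ('-' marks a gap)."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def mec_score(fragments, h1, h2):
    """Sum, over all fragments, of the distance to the closer of the two haplotypes.

    Each fragment is implicitly assigned to the haplotype it disagrees with least;
    the MEC score is the number of allele calls that would have to be corrected.
    """
    return sum(min(hamming(f, h1), hamming(f, h2)) for f in fragments)


# Toy fragments over four SNP sites (hypothetical data, assumed 0/1 encoding).
fragments = [
    "01--",
    "011-",
    "-100",  # one allele call here conflicts with both candidate haplotypes
    "10--",
    "-001",
    "1-01",
]

# A candidate haplotype pair; in the diploid case h2 is the complement of h1.
h1, h2 = "0110", "1001"

print(mec_score(fragments, h1, h2))
```

A haplotype assembly method in this setting searches for the pair `(h1, h2)` that minimises this score. Under this criterion, noisy fragments such as the third one contribute directly to the error count, which is why sensitivity to noise is central to the problem.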
Background
The human genome shows some degree of inter-individual and inter-population variation, which makes it an appropriate target for rigorous functional genomic analysis [1, 2]. Recent cost-effective next-generation sequencing (NGS) technologies have provided a huge number of individual human genome sequences [3]. Human genomes have been found to be more than 99% identical. The vast differences among people therefore arise from less than 1% of variation [4, 5]. Single nucleotide polymorphisms (SNPs) refer to the genetic
© The Author(s) 2020. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the materi