Diagnosis of thyroid neoplasm using support vector machine algorithms based on platelet RNA-seq



ORIGINAL ARTICLE

Yuling Shen¹, Yi Lai¹, Dong Xu¹, Le Xu¹, Lin Song¹, Jiaqing Zhou¹, Chengwen Song², Jiadong Wang¹














Received: 18 March 2020 / Accepted: 9 October 2020 © The Author(s) 2020

Abstract

Objective To assess the capacity of support vector machine (SVM) algorithms developed from platelet RNA-seq data to identify thyroid neoplasm patients and to differentiate patients with thyroid adenomas, papillary thyroid cancer and metastasized papillary thyroid cancer.

Methods Platelets were collected and isolated from 109 patients and 63 healthy controls. RNA-seq was performed to find transcripts with differential levels. Genes corresponding to these altered transcripts were identified using R packages. All samples were split into a training set and a validation set. Two SVM algorithms were developed and trained with the training set, using the genes with differential transcript levels (GDTLs) as classifiers, and validated with the validation set. GO and KEGG pathway enrichment analyses were performed using the R package clusterProfiler.

Results We detected 765 GDTLs (442 up-regulated and 323 down-regulated) in platelets of patients and healthy controls. The algorithm identifying thyroid neoplasm patients achieved an accuracy of 97%, with an area under the curve (AUC) of 0.998. The other algorithm, which differentiates patients with multiclass thyroid neoplasms, had an average accuracy of 80.5%. GO analysis showed that GDTLs were strongly involved in biological processes such as neutrophil degranulation, neutrophil activation, autophagy and regulation of multi-organism process. KEGG pathway enrichment analysis revealed that GDTLs were mainly enriched in the NOD-like receptor signaling pathway and in the endocytosis, osteoclast differentiation, human cytomegalovirus infection and tuberculosis pathways.

Conclusion Our results indicate that combining SVM algorithms with platelet RNA-seq data allows thyroid neoplasm diagnosis and multiclass thyroid neoplasm classification.

Keywords Thyroid neoplasm · SVM algorithm · Platelet RNA-seq · Bioinformatics analysis
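The train-then-validate workflow summarized in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: the abstract states only that R packages were used and does not specify the SVM implementation, kernel, or split ratio. The sketch below assumes a linear SVM without an intercept, trained by Pegasos-style stochastic subgradient descent, a 70/30 stratified split, and synthetic Gaussian "expression" values for 20 hypothetical genes. Only the sample counts (109 patients, 63 controls) come from the text; all function names and parameters are illustrative.

```python
import random

def make_samples(n_patients, n_controls, n_genes, shift, rng):
    """Synthetic stand-in for expression data (NOT real platelet RNA-seq).
    Label +1 = patient, -1 = control."""
    samples = []
    for label, count in ((1, n_patients), (-1, n_controls)):
        for _ in range(count):
            # First half of the genes is shifted up in patients, second half down.
            x = [rng.gauss(label * shift if g < n_genes // 2 else -label * shift, 1.0)
                 for g in range(n_genes)]
            samples.append((x, label))
    return samples

def stratified_split(samples, train_fraction, rng):
    """Split each class separately so both sets contain both classes."""
    train, valid = [], []
    for label in (1, -1):
        group = [s for s in samples if s[1] == label]
        rng.shuffle(group)
        cut = int(len(group) * train_fraction)
        train.extend(group[:cut])
        valid.extend(group[cut:])
    return train, valid

def decision(w, x):
    """Signed score of a linear classifier without intercept."""
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(train, lam=0.01, epochs=50, rng=None):
    """Pegasos-style stochastic subgradient descent on the regularised hinge loss."""
    rng = rng or random.Random(0)
    w = [0.0] * len(train[0][0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(train)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)               # decaying step size
            x, y = train[i]
            violated = y * decision(w, x) < 1   # sample lies inside the margin
            w = [(1 - eta * lam) * wj for wj in w]   # shrink (regularisation)
            if violated:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(42)
samples = make_samples(109, 63, n_genes=20, shift=1.0, rng=rng)
train, valid = stratified_split(samples, 0.7, rng)   # 70/30 ratio is an assumption
w = train_linear_svm(train, rng=rng)

scores = [decision(w, x) for x, _ in valid]
labels = [y for _, y in valid]
accuracy = sum((s > 0) == (y == 1) for s, y in zip(scores, labels)) / len(labels)
validation_auc = auc(scores, labels)
print(f"validation accuracy = {accuracy:.3f}, AUC = {validation_auc:.3f}")
```

Because the data are synthetic and well separated by construction, the printed numbers say nothing about the performance reported in the abstract. The sketch only shows how accuracy and AUC are computed on samples held out from training.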



Supplementary information The online version of this article (https://doi.org/10.1007/s12020-020-02523-x) contains supplementary material, which is available to authorized users.

* Jiadong Wang [email protected]

¹ Department of Head and Neck Surgery, Renji Hospital, School of Medicine, Shanghai Jiaotong University, 160 Pujian Road, Pudong District, Shanghai 200127, China

² Fun-med Pharmaceutical Technology (Shanghai) Co., Ltd., RM. A310, 115 Xinjunhuan Road, Minhang District, Shanghai 201100, China

Introduction

Thyroid cancer is the most common endocrine cancer. Its incidence has increased about two-fold in the last 25 years, and it accounts for 2% of all cancers [1]. This overall incidence growth is driven by the widespread use of sensitive imaging techniques [2, 3]. The growth in incidence without c