I-Impute: a self-consistent method to impute single cell RNA sequencing data

METHODOLOGY

Open Access

Xikang Feng1,2†, Lingxi Chen2†, Zishuai Wang2 and Shuai Cheng Li2,3*

From The 18th Asia Pacific Bioinformatics Conference, Seoul, Korea. 18-20 August 2020

Abstract

Background: Single-cell RNA-sequencing (scRNA-seq) is becoming indispensable in the study of cell-specific transcriptomes. However, in scRNA-seq techniques, only a small fraction of the genes are captured due to “dropout” events. These dropout events require intensive treatment when analyzing scRNA-seq data. For example, imputation tools have been proposed to estimate dropout events and de-noise data. The performance of these imputation tools is often evaluated, or fine-tuned, using various clustering criteria based on ground-truth cell subgroup labels. This limits their effectiveness in cases where cell subgroup knowledge is lacking. We consider an alternative strategy which requires the imputation to follow a “self-consistency” principle; that is, the imputation process refines its results until no internal inconsistencies or dropouts remain in the data.

Results: We propose the use of “self-consistency” as the main criterion in performing imputation. To demonstrate this principle we devised I-Impute, a “self-consistent” method, to impute scRNA-seq data. I-Impute optimizes continuous similarities and dropout probabilities through iterative refinement until a self-consistent imputation is reached. On the in silico data sets, I-Impute consistently exhibited the highest Pearson correlations across different dropout rates, compared with the state-of-the-art methods SAVER and scImpute. Furthermore, we collected three wet-lab datasets (a mouse bladder cells dataset, an embryonic stem cells dataset, and an aortic leukocyte cells dataset) to evaluate the tools. I-Impute discovered cell subpopulations effectively on all three datasets and achieved the highest clustering accuracy compared with SAVER and scImpute.

Conclusions: A strategy based on “self-consistency”, captured through our method, I-Impute, gave better imputation results than the state-of-the-art tools. Source code of I-Impute can be accessed at https://github.com/xikanfeng2/I-Impute.
Keywords: scRNA-seq, Imputation, Self-consistency, Cell subpopulation identification
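
The self-consistency idea in the abstract can be illustrated with a small fixed-point loop: apply an imputation step repeatedly and stop once another pass no longer changes the result. The sketch below is not the I-Impute algorithm. The function names (pearson, impute_step, self_consistent_impute), the distance-based cell weights, the tolerance, and the toy data are all assumptions chosen for illustration; the actual method optimizes continuous similarities and dropout probabilities as described in the paper. The Pearson correlation at the end mirrors how the in silico evaluation compares imputed values with a known ground truth.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def impute_step(current, observed):
    """Hypothetical stand-in step (NOT I-Impute): refill every zero of
    `observed` with a similarity-weighted average of the other cells."""
    n_cells, n_genes = len(current), len(current[0])
    result = [row[:] for row in current]
    for i in range(n_cells):
        # Cell-cell weights derived from distances on the current estimate,
        # so the weights change as the imputation is refined.
        weights = [
            0.0 if j == i else 1.0 / (1.0 + math.dist(current[i], current[j]))
            for j in range(n_cells)
        ]
        total = sum(weights)
        for g in range(n_genes):
            if observed[i][g] == 0:  # only suspected dropouts are modified
                result[i][g] = sum(
                    w * current[j][g] for j, w in enumerate(weights)
                ) / total
    return result


def self_consistent_impute(observed, tol=1e-6, max_iter=100):
    """Iterate impute_step until the largest entry-wise change is below tol."""
    current = [row[:] for row in observed]
    for iteration in range(1, max_iter + 1):
        updated = impute_step(current, observed)
        change = max(
            abs(u - c)
            for u_row, c_row in zip(updated, current)
            for u, c in zip(u_row, c_row)
        )
        current = updated
        if change < tol:
            break
    return current, iteration


# Toy in silico check: two cell groups, strictly positive "true" expression,
# and simulated dropout that zeroes entries at random (rate is an assumption).
random.seed(0)
truth = [
    [(5.0 if (g < 4) == (c < 3) else 1.0) + random.random() for g in range(8)]
    for c in range(6)
]
dropout_rate = 0.3
observed = [
    [0.0 if random.random() < dropout_rate else v for v in row] for row in truth
]
imputed, n_iter = self_consistent_impute(observed)

flat = lambda m: [v for row in m for v in row]
print("iterations:", n_iter)
print("Pearson(truth, observed):", pearson(flat(truth), flat(observed)))
print("Pearson(truth, imputed): ", pearson(flat(truth), flat(imputed)))
```

The stopping rule is the point of the sketch: the loop ends when a further pass leaves the matrix unchanged within the tolerance, so no ground-truth cell labels are needed to decide when to stop. Observed non-zero entries are never overwritten in this stand-in step.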

*Correspondence: [email protected]
† Xikang Feng and Lingxi Chen contributed equally to this work.
2 Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China
3 Department of Biomedical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong, China
Full list of author information is available at the end of the article

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licen
