In:
Bioinformatics, Oxford University Press (OUP), Vol. 36, No. 16 (2020-08-15), p. 4415-4422
Abstract:
Motivation: Models for analysing and making relevant biological inferences from massive amounts of complex single-cell transcriptomic data typically require several individual data-processing steps, each with its own set of hyperparameter choices. With deep generative models one can work directly with count data, perform likelihood-based model comparison, learn a latent representation of the cells and capture more of the variability in different cell populations. Results: We propose a novel method based on variational auto-encoders (VAEs) for analysis of single-cell RNA sequencing (scRNA-seq) data. It avoids data preprocessing by using raw count data as input and can robustly estimate the expected gene expression levels and a latent representation for each cell. We tested several count likelihood functions and a variant of the VAE that has a priori clustering in the latent space. We show for several scRNA-seq datasets that our method outperforms recently proposed scRNA-seq methods in clustering cells and that the resulting clusters reflect cell types. Availability and implementation: Our method, called scVAE, is implemented in Python using the TensorFlow machine-learning library, and it is freely available at https://github.com/scvae/scvae. Supplementary information: Supplementary data are available at Bioinformatics online.
Type of Medium:
Online Resource
ISSN:
1367-4803, 1367-4811
DOI:
10.1093/bioinformatics/btaa293
Language:
English
Publisher:
Oxford University Press (OUP)
Publication Date:
2020
ZDB-ID:
1468345-3
SSG:
12