Hanbitsa paper
Shibiao Wan1,2,3, Junil Kim1,2,4,5 and Kyoung Jae Won1,2,4,5*
1Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, 19104, Pennsylvania, USA
2Department of Genetics, University of Pennsylvania, Philadelphia, 19104, Pennsylvania, USA
3Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN 38105
4Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Ole Maaløes Vej 5. 2200 Copenhagen N, Denmark
5Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, Ole Maaløes Vej 5. 2200 Copenhagen N, Denmark
*Corresponding author: Kyoung Jae Won
Abstract
To process large-scale single-cell RNA-sequencing (scRNA-seq) data effectively without excessive distortion during dimension reduction, we present SHARP, an ensemble random projection-based algorithm which is scalable to clustering 10 million cells. Comprehensive benchmarking tests on 17 public scRNA-seq datasets demonstrate that SHARP outperforms existing methods in terms of speed and accuracy. Particularly, for large-size datasets (>40,000 cells), SHARP runs faster than other competitors while maintaining high clustering accuracy and robustness. To the best of our knowledge, SHARP is the only R-based tool that is scalable to clustering scRNA-seq data with 10 million cells.
Keywords: Single cell analysis; 10 million single cells; ensemble clustering; random projection; dimension reduction; scRNA-seq
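To make the idea of ensemble random projection-based clustering concrete, the following is a minimal Python sketch of the general technique, not SHARP's actual implementation (SHARP itself is an R tool, and its internals are not described in the abstract). Every function name, parameter, and default here is hypothetical. The sketch projects the data to a low dimension with several independent Gaussian random matrices, clusters each projection with a tiny k-means (a stand-in chosen for brevity), and then merges the runs through a co-association vote.

```python
import math
import random


def random_projection(data, k, rng):
    """Project rows of `data` (n x d) onto k dimensions with a Gaussian random matrix."""
    d = len(data[0])
    scale = 1.0 / math.sqrt(k)
    proj = [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]
    return [[sum(row[i] * proj[i][j] for i in range(d)) for j in range(k)]
            for row in data]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=10):
    """Tiny k-means with farthest-point initialisation (illustrative stand-in)."""
    centers = [points[0]]
    while len(centers) < n_clusters:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest centre, then recompute the centres.
        labels = [min(range(n_clusters), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def ensemble_cluster(data, n_clusters, k=5, n_runs=7, seed=0):
    """Cluster several random projections and merge them by majority co-association."""
    rng = random.Random(seed)
    n = len(data)
    co = [[0] * n for _ in range(n)]  # co[i][j]: runs in which i and j share a cluster
    for _ in range(n_runs):
        labels = kmeans(random_projection(data, k, rng), n_clusters)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += 1
    # Consensus: connected components of pairs co-clustered in a majority of runs.
    final = [-1] * n
    next_label = 0
    for start in range(n):
        if final[start] != -1:
            continue
        final[start] = next_label
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if final[v] == -1 and co[u][v] * 2 > n_runs:
                    final[v] = next_label
                    stack.append(v)
        next_label += 1
    return final


# Toy stand-in for an expression matrix: two well-separated groups of 20 "cells"
# measured over 50 "genes" (synthetic values, not real scRNA-seq data).
data_rng = random.Random(1)
cells = [[data_rng.gauss(mu, 1.0) for _ in range(50)]
         for mu in (0.0, 20.0) for _ in range(20)]
labels = ensemble_cluster(cells, n_clusters=2)
print(labels)
```

Each run only ever works with k-dimensional points, which is where the speed-up of random projection comes from, while voting across several independent projections dampens the distortion any single projection may introduce. The pairwise co-association matrix used here is quadratic in the number of cells, so this sketch is only meant for small illustrative inputs.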