Names of group members are in bold.
2025
M.H. Tan*, K.E. Tiedje*, Q. Feng, Q. Zhan, M. Pascual, H. Shim, Y. Chan, and K.P. Day, A paradoxical population structure of var DBL types in Africa (*: co-first authors), PLOS Pathogens, 2025 [link]
A. Vannan*, R. Lyu*, A.L. Williams, N.M. Negretti, E.D. Mee, J. Hirsh, S. Hirsh, D.S. Nichols, C.L. Calvi, C.J. Taylor, V.V. Polosukhin, A.P.M. Serezani, A.S. McCall, J.J. Gokey, H. Shim, L.B. Ware, M.J. Bacchetta, C.M. Shaver, T.S. Blackwell, J.A. Kropski, R. Walia, J.M.S. Sucre, D.J. McCarthy, N.E. Banovich, Spatial transcriptomics identifies molecular niche dysregulation associated with distal lung remodeling in pulmonary fibrosis (*: equal contribution), Nature Genetics, 2025 [link]
2024
A.W.C. Kwok, H. Shim#, D.J. McCarthy#, Going beyond cell clustering and feature aggregation: Is there single cell level information in single-cell ATAC-seq data? (#: co-supervision), bioRxiv, 2024 [link]
H. Shim, Z. Xing, E. Pantaleo, F. Luca, R. Pique-Regi, M. Stephens, Multi-scale Poisson process approaches for differential expression analysis of high-throughput sequencing data, AOAS, 2024 [link][supplementary material][software multiseq][code for analysis]
A. Moore, H. Shim, J. Zhu, M. Gong, Semi-Supervised Learning Under General Causal Models, IEEE Transactions on Neural Networks and Learning Systems, 2024 [link]
2023
S. Mangiola, A. Schulze*, M. Trussart*, E. Zozaya*, M. Ma, Z. Gao, A. F. Rubin, T. P. Speed#, H. Shim#, A. T. Papenfuss#, Robust differential composition and variability analysis for multisample cell omics (*: These authors contributed equally; #: These authors contributed equally), Proceedings of the National Academy of Sciences (PNAS), 2023 [link][R package sccomp]
Y. You*, Y.D.J. Prawer*, R. De Paoli-Iseppi, C.P.J. Hunt, C.L. Parish, H. Shim#, M.B. Clark#, Identification of cell barcodes from long-read single-cell RNA-seq with BLAZE (*: equal contribution; #: joint corresponding authors), Genome Biology, 2023 [link][software BLAZE]
H. Sun, H. Shim#, V. Rao#, Detecting Jumps on a Tree: a Hierarchical Pitman-Yor Model for Evolution of Phenotypic Distributions (#: co-supervision), arXiv, 2023 [link][software treeHPYP]
M.H. Tan, H. Shim, Y. Chan, K.P. Day, Unravelling var complexity: Relationship between DBL types and var genes in Plasmodium falciparum, Frontiers in Parasitology, 2023 [link]
2022
R. Lyu, V. Tsui, W. Crismani, R. Liu, H. Shim#, D.J. McCarthy#, sgcocaller and comapr: personalised haplotype assembly and comparative crossover map analysis using single-gamete sequencing data (#: co-supervision), Nucleic Acids Research, 2022 [link][software sgcocaller][software comapr in github][software comapr in Bioconductor][code for analysis 1][code for analysis 2]
L. Hui, D.J. McCarthy#, H. Shim#, S. Wei#, Trade-off between conservation of biological variation and batch effect removal in deep generative modeling for single-cell transcriptomics (#: co-supervision), BMC Bioinformatics, 2022 [link]
Y. You, M. B. Clark#, H. Shim#, NanoSplicer: Accurate identification of splice junctions using Oxford Nanopore sequencing (#: joint corresponding authors), Bioinformatics, 2022 [link][supplementary material][software NanoSplicer]
Q. Feng, K. Tiedje, S. Ruybal-Pesántez, G. Tonkin-Hill, M. Duffy, K. Day, H. Shim#, Y. Chan#, An accurate method for identifying recent recombinants from unaligned sequences (#: co-supervision), Bioinformatics, 2022 [link][software]
2021
Y. S. Foo and H. Shim, A Comparison of Bayesian Inference Techniques for Sparse Factor Analysis. [link][Implementation of proposed algorithms]
I. Alqassem, Y. Sonthalia, E. Klitzke, H. Shim#, S. Canzar#, McSplicer: a probabilistic model for estimating splice site usage from RNA-seq data (#: joint corresponding authors), Bioinformatics, 2021 [link][software McSplicer]
2018
A. G. Shanku, A. Findley, C. A. Kalita, H. Shim, F. Luca, R. Pique-Regi, circuitSNPs: Predicting genetic effects using a Neural Network to model regulatory modules of DNase-seq footprints [link]
2017
I. E. Schor, J. F. Degner, D. Harnett, E. Cannavo, F. P. Casale, H. Shim, D. Garfield, E. Birney, M. Stephens, O. Stegle, E. E. Furlong, Promoter shape varies across populations and affects promoter evolution and expression noise, Nature Genetics, 2017, February 13, doi:10.1038/ng.3791 [link]
H. Shim and B. Larget, BayesCAT: Bayesian Co-estimation of Alignment and Tree, Biometrics, 2017 Jan 18, doi:10.1111/biom.12640 [link][supplementary materials][software BayesCAT]
2016
A. Raj*, S. Wang*, H. Shim*, A. Harpak, Y. I. Li, B. Engelmann, M. Stephens, Y. Gilad, J. K. Pritchard, Thousands of novel translated open reading frames in humans inferred by ribosome footprint profiling, eLife 2016;10.7554/eLife.13328 (*: co-first authors) [link][software riboHMM]
2015
A. Raj*, H. Shim*, Y. Gilad, J. K. Pritchard and M. Stephens, msCentipede: Modeling heterogeneity across genomic sites improves accuracy in the inference of transcription factor binding, PLoS ONE 10(9): e0138030, 2015 (*: co-first authors) [link][software msCentipede]
H. Shim and M. Stephens, Wavelet-based genetic association analysis of functional phenotypes arising from high-throughput sequencing assays, Ann. Appl. Stat. 9 (2015), no. 2, 665–686. [pdf][link][software WaveQTL][supplementary materials][supplementary figures]
H. Shim, D. I. Chasman, J. D. Smith, S. Mora, P. M. Ridker, D. A. Nickerson, R. M. Krauss, M. Stephens, A multivariate genome-wide association analysis of 10 LDL subfractions, and their response to statin treatment, in 1868 Caucasians, PLoS ONE 10(4): e0120758, 2015 [link][software mvBIMBAM]
Previous
L. M. Mangravite, B. E. Engelhardt, M. W. Medina, J. D. Smith, C. D. Brown, D. I. Chasman, B. H. Mecham, B. Howie, H. Shim, D. Naidoo, Q. Feng, M. J. Rieder, Y. D. Chen, J. I. Rotter, P. M. Ridker, J. C. Hopewell, S. Parish, J. Armitage, R. Collins, R. A. Wilke, D. A. Nickerson, M. Stephens, R. M. Krauss, A statin-dependent QTL for GATM expression is associated with statin-induced myopathy, Nature, 502:377–380, 2013 [link]
H. Shim*, H. Chun*, C. D. Engelman, and B. A. Payseur, Genome-wide association studies using SNPs vs. haplotypes: An empirical comparison with data from the North American Rheumatoid Arthritis Consortium, BMC Proceedings, 3 (Suppl 7):S35, 2009 (*: co-first authors) [link]
H. Shim and S. Keles, Integrating quantitative information from ChIP-chip experiments into motif finding, Biostatistics, 9(1):51-65, 2007 [link][software SUCcESS]