Title | Utility of long-read sequencing for All of Us. |
Publication Type | Journal Article |
Year of Publication | 2024 |
Authors | Mahmoud, M, Huang, Y, Garimella, K, Audano, PA, Wan, W, Prasad, N, Handsaker, RE, Hall, S, Pionzio, A, Schatz, MC, Talkowski, ME, Eichler, EE, Levy, SE, Sedlazeck, FJ |
Journal | Nat Commun |
Volume | 15 |
Issue | 1 |
Pagination | 837 |
Date Published | 2024 Jan 29 |
ISSN | 2041-1723 |
Keywords | Genome, Human, High-Throughput Nucleotide Sequencing, Humans, INDEL Mutation, Population Health, Sequence Analysis, DNA |
Abstract | The All of Us (AoU) initiative aims to sequence the genomes of over one million Americans from diverse ethnic backgrounds to improve personalized medical care. In a recent technical pilot, we compare the performance of traditional short-read sequencing with long-read sequencing in a small cohort of samples from the HapMap project and two AoU control samples representing eight datasets. Our analysis reveals substantial differences in the ability of these technologies to accurately sequence complex medically relevant genes, particularly in terms of gene coverage and pathogenic variant identification. We also consider the advantages and challenges of using low coverage sequencing to increase sample numbers in large cohort analysis. Our results show that HiFi reads produce the most accurate results for both small and large variants. Further, we present a cloud-based pipeline to optimize SNV, indel and SV calling at scale for long-reads analysis. These results lead to widespread improvements across AoU. |
DOI | 10.1038/s41467-024-44804-3 |
Alternate Journal | Nat Commun |
PubMed ID | 38281971 |
PubMed Central ID | PMC10822842 |
Grant List | OT2 OD026556 / OD / NIH HHS / United States; U2C OD023196 / OD / NIH HHS / United States; OT2 OD025315 / OD / NIH HHS / United States; OT2 OD026551 / OD / NIH HHS / United States; U24 OD023121 / OD / NIH HHS / United States; OT2 OD026552 / OD / NIH HHS / United States; OT2 OD026549 / OD / NIH HHS / United States; OT2 OD025337 / OD / NIH HHS / United States; OT2 OD025277 / OD / NIH HHS / United States; OT2 OD026555 / OD / NIH HHS / United States; OT2 OD026550 / OD / NIH HHS / United States; OT2 OD026553 / OD / NIH HHS / United States; OT2 OD023205 / OD / NIH HHS / United States; OT2 OD025276 / OD / NIH HHS / United States; OT2 OD026554 / OD / NIH HHS / United States; U24 OD023163 / OD / NIH HHS / United States; OT2 OD023206 / OD / NIH HHS / United States; OT2 OD002748 / OD / NIH HHS / United States; U24 OD023176 / OD / NIH HHS / United States; OT2 OD026548 / OD / NIH HHS / United States; OT2 OD026557 / OD / NIH HHS / United States; OT2 OD002751 / OD / NIH HHS / United States |
Similar Publications
A comparative study of structural variant calling in WGS from Alzheimer's disease families. Life Sci Alliance. 2024;7(5).
Genetic sex validation for sample tracking in next-generation sequencing clinical testing. BMC Res Notes. 2024;17(1):62.
The CARD8 inflammasome dictates HIV/SIV pathogenesis and disease progression. Cell. 2024;187(5):1223-1237.e16.