Journal of Genomes and Exomes 2013:2 1-18
Original Research
Published on 18 Feb 2013
DOI: 10.4137/JGE.S10089
The ability to target and capture known exons in the human genome, and to characterize them by massively parallel sequencing, has led to the identification of the genetic causes of many Mendelian disorders. Several factors suggest that exome sequencing will remain the preferred clinical next-generation sequencing technology for some time to come. Its advantages include high sequencing depth at a lower cost per unit of coverage than genome sequencing, and the fact that interpretation of non-coding sequence is still at an early stage of development. In this study of data from the NIH Undiagnosed Diseases Program (UDP), we investigated a novel approach to quantifying the quality of exome sequencing data. We systematically evaluated the genotypable fraction across well-characterized protein-coding exons and found that >88% of exons were genotyped to completion and that, on average, >93% of all coding bases were genotyped (with a target sequencing efficiency of 96%). We also demonstrate a methodology for robust identification of consistently genotyped exons using a new statistical metric, the index of dispersion. This methodology allowed us to define the overall genotypability of all 167,717 autosomal exons; 95.5% of these had a reproducible pattern of sequencing. Finally, we developed a computational application that uses this reproducible and predictable pattern to confidently detect homozygous deletions of protein-coding exons; the pattern information also reduces the complexity of the search. Of our 11 predicted homozygous exon-deletion events, we tested 3 in wet-lab experiments, and all 3 were confirmed. We conclude that our systematic approach to analyzing exome sequence data across our patient cohort provides a powerful computational methodology to evaluate, interpret and predict patterns that are relevant to the pathophysiology of the sequenced individuals.
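As an illustration of the idea, the index of dispersion is commonly defined as the variance-to-mean ratio, D = variance / mean. The sketch below applies that definition to per-exon genotyped fractions across a cohort and flags a possible homozygous deletion when one sample has no genotyped bases in an exon that the remaining samples genotype consistently. It is a minimal sketch under stated assumptions, not the authors' application: the abstract does not give the exact quantity the metric is computed on, and the function names, the thresholds `max_dispersion` and `min_mean`, and the toy data are all hypothetical.

```python
from statistics import mean, pvariance


def index_of_dispersion(values):
    """Variance-to-mean ratio; returns None when the mean is zero."""
    m = mean(values)
    if m == 0:
        return None
    return pvariance(values) / m


def candidate_deletions(fractions, max_dispersion=0.01, min_mean=0.9):
    """Flag (exon, sample) pairs that look like homozygous deletions.

    fractions: {exon_id: {sample_id: genotyped fraction in [0, 1]}}
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    calls = []
    for exon, by_sample in fractions.items():
        for sample, frac in by_sample.items():
            if frac > 0:
                continue  # the sample has genotyped bases in this exon
            # Judge reproducibility from the other samples only, so the
            # suspected deletion does not inflate the dispersion.
            others = [f for s, f in by_sample.items() if s != sample]
            if len(others) < 2:
                continue
            d = index_of_dispersion(others)
            if d is not None and d <= max_dispersion and mean(others) >= min_mean:
                calls.append((exon, sample))
    return calls


# Toy cohort (made-up numbers, for illustration only).
toy = {
    "exon_A": {"s1": 1.0, "s2": 0.98, "s3": 1.0, "s4": 0.0},
    "exon_B": {"s1": 0.2, "s2": 0.9, "s3": 0.0, "s4": 0.6},
    "exon_C": {"s1": 1.0, "s2": 1.0, "s3": 1.0, "s4": 1.0},
}
print(candidate_deletions(toy))
```

In the toy data only `exon_A` in sample `s4` is flagged. `exon_B` also has a sample with zero genotyped bases, but the other samples cover it poorly and inconsistently, so it is skipped. Restricting the search to reproducibly genotyped exons in this way is one reading of how pattern information can narrow the search space.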