Mechanics of Long-Read Sequencing Failure
Next-generation sequencing (NGS) technology has revolutionized the field of genomics. Long-read sequencing in particular has become an invaluable tool for producing more complete genome assemblies and for identifying genomic features that go undetected in short-read assemblies. Although the cost of long-read sequencing has dropped substantially in recent years, it remains expensive, and failed sequencing runs represent a significant expense to genomics facilities, resulting not only in wasted resources but also in irreversible loss of sample. This underscores the need to identify, quantify, and rectify the causes of failure. There are several potential causes of poor performance in PacBio sequencing, including ZMW loading efficiency. We hypothesize that nicks, which are single-stranded breaks in the sugar-phosphate backbone of DNA, disrupt the circular template structure (SMRTbell) employed in PacBio sequencing and present a significant challenge to SMRT sequencing, resulting in incomplete sequence capture and failure to generate a circular consensus sequence.
This work was presented in two posters at AGBT 2016 (link forthcoming).