Supporting data for "Accurate gene consensus at low nanopore coverage"
In this work, we develop SINGLe (SNPs In Nanopore reads of Gene Libraries), an error-correction method that reduces the noise in nanopore reads of amplicons containing point variations. SINGLe exploits the fact that, in an amplicon library, all reads are very similar to a wild-type sequence, from which the position-specific systematic sequencing error pattern can be characterized experimentally. It then uses this information to reweight the confidence assigned to nucleotides that do not match the wild type in individual variant reads, and incorporates the reweighted confidence into the consensus calculation.
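The reweighting idea can be illustrated with a minimal Python sketch. This is not the SINGLe implementation: the function names, the `corrected_conf` lookup table, and the simple weighted vote are all assumptions made for illustration. The sketch assumes the reads are already aligned to the wild type with no insertions or deletions, and that a table of corrected confidences for non-wild-type bases has been fitted beforehand from wild-type reads.

```python
from collections import defaultdict


def phred_to_prob_correct(q):
    """Convert a Phred quality score into the probability that the base is correct."""
    return 1.0 - 10 ** (-q / 10.0)


def weighted_consensus(reads, wild_type, corrected_conf):
    """Toy consensus by confidence-weighted voting (illustrative only).

    reads: list of (sequence, qualities) pairs, each the same length as wild_type.
    corrected_conf: hypothetical table mapping (position, base, quality) to the
        probability that a non-wild-type base is real; assumed to have been
        fitted from wild-type reads.
    """
    consensus = []
    for pos, wt_base in enumerate(wild_type):
        votes = defaultdict(float)
        for seq, quals in reads:
            base, q = seq[pos], quals[pos]
            raw = phred_to_prob_correct(q)
            if base == wt_base:
                weight = raw
            else:
                # Reweight mismatches; fall back to the raw confidence
                # when the table has no entry for this case.
                weight = corrected_conf.get((pos, base, q), raw)
            votes[base] += weight
        consensus.append(max(votes, key=votes.get))
    return "".join(consensus)


# Made-up example: position 2 suffers a systematic G->A error,
# while position 0 carries a true A->T mutation.
wild_type = "ACGT"
reads = [
    ("TCAT", [30, 30, 30, 30]),
    ("TCAT", [30, 30, 30, 30]),
    ("TCGT", [30, 30, 30, 30]),
]
naive = weighted_consensus(reads, wild_type, {})
corrected = weighted_consensus(reads, wild_type, {(2, "A", 30): 0.1})
print(naive, corrected)
```

In this made-up example, the uncorrected vote keeps the systematic A at position 2, whereas the reweighted vote recovers G there and still retains the mutation at position 0.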
We tested SINGLe on a mutagenic library of the KlenTaq polymerase gene, in which the true mutation rate was below the sequencing noise. We observed that, unlike other methods, SINGLe compensates for the systematic errors made by the basecallers. Consequently, SINGLe converges to the true sequence using as few as 5 reads per variant, fewer than other available methods require.