GLORIA — GEOMAR Library Ocean Research Information Access

Online resource

Cooperative sequence clustering and decoding for DNA storage system with fountain codes

Jeong, Jaeho ; Park, Seong-Joon ; Kim, Jae-Won ; [et al.]

Oxford University Press (OUP) ; 2021

In: Bioinformatics Vol. 37, No. 19 ( 2021-10-11), p. 3136-3143

Details

In: Bioinformatics, Oxford University Press (OUP), Vol. 37, No. 19 ( 2021-10-11), p. 3136-3143

Abstract: In DNA storage systems, there are tradeoffs between writing and reading costs. Increasing the code rate of error-correcting codes may save writing cost, but it requires more sequence reads for data retrieval. Sequencing and decoding processes can potentially be improved so that the reading cost induced by this tradeoff is reduced without increasing the writing cost. In previous research, clustering, alignment and decoding were treated as separate stages, but we believe that using the information from all these processes together may improve decoding performance. Actual experiments of DNA synthesis and sequencing should be performed because simulations cannot be relied on to cover all error possibilities in practical circumstances. Results: For DNA storage systems using fountain code and Reed-Solomon (RS) code, we introduce several techniques to improve the decoding performance. We designed the decoding process focusing on the cooperation of key components: Hamming-distance-based clustering, discarding of abnormal sequence reads, RS error correction as well as detection, and quality-score-based ordering of sequences. We synthesized 513.6 KB of data into DNA oligo pools and sequenced this data successfully with an Illumina MiSeq instrument. Compared to Erlich’s research, the proposed decoding method additionally incorporates sequence reads with minor errors, which had previously been discarded, and was thus able to make use of 10.6–11.9% more sequence reads from the same sequencing environment. This resulted in a 6.5–8.9% reduction in the reading cost. Channel characteristics including sequence coverage and read-length distributions are provided as well. Availability and implementation: The raw data files and the source codes of our experiments are available at: https://github.com/jhjeong0702/dna-storage.
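
Note: The abstract names Hamming-distance-based clustering and the discarding of abnormal sequence reads as decoding components but does not describe them further. The following is a minimal sketch of how such a step might look, not the authors' implementation. It assumes that "abnormal" means a read of unexpected length, and it uses a greedy strategy with a distance threshold. The function names (hamming, cluster_reads, consensus), the threshold and the example reads are all hypothetical.

```python
from collections import Counter
from typing import List


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def cluster_reads(reads: List[str], length: int, max_dist: int) -> List[List[str]]:
    """Greedy clustering (an assumption, not the paper's algorithm).

    Reads whose length differs from `length` are discarded. Each remaining
    read joins the first cluster whose first member is within `max_dist`;
    otherwise it starts a new cluster.
    """
    clusters: List[List[str]] = []
    for read in reads:
        if len(read) != length:
            continue  # treat unexpected length as an abnormal read
        for cluster in clusters:
            if hamming(read, cluster[0]) <= max_dist:
                cluster.append(read)
                break
        else:
            clusters.append([read])
    return clusters


def consensus(cluster: List[str]) -> str:
    """Per-position majority vote over the reads of one cluster."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*cluster))


# Hypothetical 8-nt reads; real oligos are far longer.
reads = ["ACGTACGT", "ACGTACGA", "TTTTCCCC", "ACGTAC", "TTTTCCCG", "ACGTACGT"]
clusters = cluster_reads(reads, length=8, max_dist=2)
```

In this toy input, the six-base read is dropped and the remaining reads fall into two clusters. In a pipeline like the one the abstract describes, a consensus sequence per cluster would then presumably be passed on to RS error correction and fountain decoding.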

Material type: Online resource

ISSN: 1367-4803 , 1367-4811

URL: Article

DOI: 10.1093/bioinformatics/btab246

Language: English

Publisher: Oxford University Press (OUP)

Publication date: 2021

ZDB Id: 1468345-3

SSG: 12
