On conflict free DNA codes

PDF / 810,842 Bytes
29 Pages / 439.642 x 666.49 pts Page_size
24 Downloads / 361 Views

On conﬂict free DNA codes Krishna Gopal Benerjee1 · Sourav Deb1 · Manish K. Gupta1 Received: 28 October 2019 / Accepted: 24 September 2020 / © Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract DNA storage has emerged as an important area of research. The reliability of a DNA storage system depends on designing those DNA strings (called DNA codes) that are sufficiently dissimilar. In this work, we introduce DNA codes that satisfy the newly introduced constraint, a generalization of the non-homopolymers constraint. In particular, each codeword of the DNA code has the specific property that any two consecutive sub-strings of the DNA codeword will not be the same. This is apart from the usual constraints such as Hamming, reverse, reverse-complement and GC-content. We believe that the new constraints proposed in this paper will provide significant achievements in reducing the errors, during reading and writing data into the synthetic DNA strings. We also present a construction (based on a variant of stochastic local search algorithm) to determine the size of the DNA codes with a constraint that each DNA codeword is free from secondary structures in addition to the usual constraint. This further improves the lower bounds from the existing literature, in some specific cases. A recursive isometric map between binary vectors and DNA strings is also proposed. By applying this map over the well known binary codes, we obtain classes of DNA codes with all of the above constraints, including the property that the constructed DNA codewords are free from the hairpin like secondary structures. Keywords DNA codes · Homopolymers · Conflict free DNA strings · Hamming constraint · Reverse constraint · Reverse-complement constraint · GC-content constraint · Hairpin like secondary structures Mathematics Subject Classiﬁcation (2010) 68P30 · 94B60 · 94B65

The preliminary version of the paper is available at https://arxiv.org/abs/1902.04419 Manish K. Gupta

[email protected] Krishna Gopal Benerjee [email protected] Sourav Deb sourav [email protected] 1

Laboratory of Natural Information Processing, Dhirubhai Ambani Institute of Information and Communication Technology, Gandhinagar, Gujarat, India

Cryptography and Communications

1 Introduction The exponentially increasing demand in data storage forces to look into every possible option and DNA (DeoxyriboNucleic Acid) data storage has come out to be one of the most promising natural data storage for this purpose [11]. After the first striking implementation of large-scale archival DNA-based storage architecture by Church et al. [5] in 2012, followed by encoding scheme to DNA proposed by Goldman et al. [8] in 2013, researchers have taken great interests on the construction of DNA-based information storage systems [18, 20] because of its high storage density and longevity [5, 8, 36]. DNA consists of four types of bases or nucleotides (nts) called adenine (A), cytosine (C), guanine (G) and thymine (T ) where, the Watson-Cricks complementary bases for A and C

Data Loading...

On conflict free DNA codes

Recommend Documents

Combinatorial Bounds for Conflict-Free Coloring on Open Neighborhoods

Conflict-Free Learning for Mixed Integer Programming

High-performance biosensing based on autonomous enzyme-free DNA circuits

DNA Codes and Information: Formal Structures and Relational Causes

Towards systematic nomenclature for cell-free DNA

Nanogap Capacitors for Label Free DNA Analysis

Codes and Turbo Codes

Conflict-Free Coloring of Intersection Graphs of Geometric Objects

Notes on Cyclic Convolutional Codes

New EAQMDS codes constructed from negacyclic codes

DNA Computing 9th International Workshop on DNA Based Computers, DNA

GC-Rich DNA Fragments and Oxidized Cell-Free DNA Have Different Effects on NF-kB and NRF2 Signaling in MSC