
Transformer-Based Decoding in Concatenated Coding Schemes Under Synchronization Errors


Abstract

We consider the reconstruction of a codeword from multiple noisy copies, each independently corrupted by insertion, deletion, and substitution errors. This problem arises, for example, in DNA data storage. A common code construction uses a concatenated code with an outer linear block code and an inner marker code, decoded via Belief Propagation (BP) and the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm, respectively. However, the BCJR algorithm scales exponentially with the number of noisy copies, making reconstruction from more than about four copies infeasible. In this paper, we introduce BCJRFormer, a transformer-based neural inner decoder for marker codes. BCJRFormer achieves error rates comparable to the BCJR algorithm for single-message transmissions while scaling only quadratically with the number of noisy copies. This makes BCJRFormer well suited for DNA data storage, where multiple reads of the same DNA sequence are common. To further reduce the bit error rate, we replace the BP outer decoder with a transformer-based decoder. Combined, this results in a performant and efficient end-to-end transformer-based pipeline for decoding multiple noisy copies corrupted by insertion, deletion, and substitution errors.
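To make the scaling claim concrete, here is a minimal back-of-the-envelope sketch (with hypothetical parameter values, not code or numbers from the paper): a joint BCJR trellis over N copies multiplies the per-copy drift-state count, so its state space grows exponentially in N, while transformer self-attention over the N concatenated reads costs on the order of the squared total input length, i.e. quadratically in N.

```python
# Illustrative complexity comparison only; drift_states and read_length
# are hypothetical placeholders, not values from the paper.

def bcjr_joint_states(n_copies: int, drift_states: int = 21) -> int:
    """Joint trellis state count: exponential in the number of copies."""
    return drift_states ** n_copies

def attention_cost(n_copies: int, read_length: int = 128) -> int:
    """Self-attention cost over all concatenated reads: quadratic in n_copies."""
    return (n_copies * read_length) ** 2

for n in (1, 2, 4, 8):
    print(f"N={n}: joint BCJR states ~ {bcjr_joint_states(n):.2e}, "
          f"attention ops ~ {attention_cost(n):.2e}")
```

Already at N = 8 the joint trellis is astronomically large under these assumptions, while the attention cost remains modest, which is the gap between the two decoders that the abstract describes.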

ISIT 2025

IEEE International Symposium on Information Theory. Ann Arbor, MI, USA, Jun 22-27, 2025.

Authors

J. Streit • F. Weindel • R. Heckel

Links

DOI

Research Area

A2 | Mathematical Foundations

BibTeX Key: SWH25
