Flashzoi: An Enhanced Borzoi Model for Accelerated Genomic Analysis

Abstract

Accurately predicting how DNA sequence drives gene regulation and how genetic variants alter gene expression is a central challenge in genomics. Borzoi, which models over ten thousand genomic assays, including RNA-seq coverage, from over half a megabase of sequence context alone, promises to become an important foundation model in regulatory genomics, both for massively annotating variants and for further model development. However, its reliance on handcrafted, relative positional encodings within the transformer architecture limits its computational efficiency. Here we present Flashzoi, an enhanced Borzoi model that leverages rotary positional encodings and FlashAttention-2. This achieves over 3-fold faster training and inference and up to 2.4-fold reduced memory usage, while maintaining or improving accuracy in modeling various genomic assays including RNA-seq coverage, predicting variant effects, and enhancer-promoter linking. Flashzoi's improved efficiency facilitates large-scale genomic analyses and opens avenues for exploring more complex regulatory mechanisms and modeling.

Competing Interest Statement: The authors have declared no competing interest.

Preprint

Dec. 2024

Authors

J. Hingerl, A. Karollus, J. Gagneur

Links

DOI

Research Area

 C2 | Biology

BibTeXKey: HKG+24a
