
Sfaira Accelerates Data and Model Reuse in Single Cell Genomics

MCML Authors


Fabian Theis

Prof. Dr.

Principal Investigator

Abstract

Single-cell RNA-seq datasets are often first analyzed independently, without harnessing model fits from previous studies, and are then contextualized with public datasets, which requires time-consuming data wrangling. We address these issues with sfaira, a single-cell data zoo for public datasets paired with a model zoo for executable pre-trained models. The data zoo is designed to facilitate the contribution of datasets by using ontologies for metadata. We propose an adaptation of the cross-entropy loss for cell type classification, tailored to datasets annotated at different levels of coarseness. We demonstrate the utility of sfaira by training models across anatomic data partitions on 8 million cells.
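The abstract does not spell out the adapted loss, so the following is only a minimal sketch of one plausible reading: when a cell is annotated with a coarse label, every fine-grained (leaf) cell type beneath that label is treated as compatible, and the loss is the negative log of the total predicted probability assigned to the compatible leaves. The function name `coarse_label_cross_entropy`, the mask encoding, and the toy cell types are assumptions made for illustration and are not taken from sfaira.

```python
import math


def coarse_label_cross_entropy(probs, masks, eps=1e-12):
    """Mean negative log of the probability mass on compatible leaf types.

    probs: one row per cell; predicted probabilities over leaf cell types,
           each row summing to 1.
    masks: one row per cell; 1 for each leaf type compatible with the cell's
           annotation, 0 otherwise (hypothetical encoding).
    """
    losses = []
    for p_row, m_row in zip(probs, masks):
        # Sum the probability of all leaves that fall under the annotation.
        compatible_mass = sum(p * m for p, m in zip(p_row, m_row))
        # Clip to avoid log(0) when no mass lands on a compatible leaf.
        losses.append(-math.log(max(compatible_mass, eps)))
    return sum(losses) / len(losses)


# Hypothetical leaf types: [CD4 T cell, CD8 T cell, B cell]
probs = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
]
masks = [
    [1, 0, 0],  # cell 0 is annotated at leaf level as "CD4 T cell"
    [1, 1, 0],  # cell 1 is annotated only as "T cell" (coarse label)
]
loss = coarse_label_cross_entropy(probs, masks)
```

With a one-hot mask, as for cell 0, this reduces to the standard cross-entropy term, so finely and coarsely annotated datasets can share one objective under this assumption.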

article


Genome Biology

Vol. 22, Art. 248. Aug. 2021.
Top Journal

Authors

D. S. Fischer • L. Dony • M. König • A. Moeed • L. Zappia • L. Heumos • S. Tritschler • O. Holmberg • H. Aliee • F. J. Theis


Research Area

 C2 | Biology

BibTeXKey: FDK+21
