DeepZebraFish

The DeepZebraFish model is a peak regression model trained on a scATAC-seq dataset of the developing zebrafish embryo. The dataset spans 20 developmental stages and 639 cell type-timepoint combinations, each of which is treated as a separate class by the model.

The model was trained on a set of 793K consensus peaks and fine-tuned on 89K cell type-timepoint-specific peaks.

The model is a CNN multiclass regression model using the dilated_cnn() architecture.
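CNN models of this kind typically consume DNA as a one-hot encoded array rather than as text. The following is a minimal illustrative sketch of that encoding, not the library's own implementation: the helper name `one_hot_encode`, the A, C, G, T channel order, and the all-zero treatment of unknown bases are assumptions made for this example.

```python
import numpy as np

# Assumed channel order; the order used by the actual model is not stated here.
ALPHABET = "ACGT"


def one_hot_encode(sequence: str) -> np.ndarray:
    """Return a (len(sequence), 4) float32 array.

    Bases outside ALPHABET (for example N) are left as all-zero rows,
    which is an assumption of this sketch.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(sequence), len(ALPHABET)), dtype=np.float32)
    for position, base in enumerate(sequence.upper()):
        channel = index.get(base)
        if channel is not None:
            encoded[position, channel] = 1.0
    return encoded


x = one_hot_encode("ACGTN")
print(x.shape)  # (5, 4)
```

A model input for a single sequence would then carry an extra leading batch dimension, for example `x[np.newaxis, ...]`.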

Details of the data and the model can be found in the original publication.


Citation

Kempynck, N., De Winter, S., et al. CREsted: modeling genomic and synthetic cell type-specific enhancers across tissues and species. bioRxiv (2025). https://doi.org/10.1101/2025.04.02.646812

Data source

Sun, K., Liu, X., et al. Mapping the chromatin accessibility landscape of zebrafish embryogenesis at single-cell resolution by SPATAC-seq. Nature Cell Biology (2024). https://doi.org/10.1038/s41556-024-01449-0

Usage

import crested
import keras

# download model
model_path, output_names = crested.get_model("DeepZebraFish")

# load model (compile=False: the model is only used for inference here)
model = keras.models.load_model(model_path, compile=False)

# make predictions on a placeholder sequence of 2114 bases
sequence = "A" * 2114
predictions = crested.tl.predict(sequence, model)
print(predictions.shape)
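To interpret the result, the predicted values can be paired with `output_names` and ranked. The sketch below is one possible way to do this under stated assumptions: `top_classes` is a hypothetical helper (not part of crested), and it assumes `predictions` has shape `(n_sequences, n_classes)` with columns in the same order as `output_names`. Stand-in values are used so the snippet runs without downloading the model.

```python
import numpy as np


def top_classes(predictions, output_names, k=5):
    """Return the k (name, score) pairs with the highest predicted
    values for the first sequence in `predictions`.

    Assumes predictions has shape (n_sequences, n_classes) and that
    columns follow the order of output_names.
    """
    scores = np.asarray(predictions).reshape(-1, len(output_names))[0]
    order = np.argsort(scores)[::-1][:k]  # indices sorted by descending score
    return [(output_names[i], float(scores[i])) for i in order]


# Stand-in values; replace with the real `predictions` and `output_names`.
demo_names = ["class_a", "class_b", "class_c"]
demo_predictions = np.array([[0.1, 0.7, 0.3]])
print(top_classes(demo_predictions, demo_names, k=2))
```

With the real model, calling `top_classes(predictions, output_names)` would list the five cell type-timepoint classes with the highest predicted accessibility for the input sequence, provided the shape assumption above holds.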