Enhancer code analysis#

In this notebook we will go over how to obtain cell type characteristic sequence patterns from CREsted models, or any other model, using CREsted functionality and the extra packages from crested[motif]: tfmodisco-lite and memesuite-lite.

Obtaining contribution scores per model class and running tfmodisco-lite#

Before we can do any analysis, we need to calculate the contribution scores for cell type-specific regions. These regions hold the most sequence information relevant for cell type identity. From those, we can run tfmodisco-lite.

import anndata as ad
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import crested
import keras
import pickle

Load data and CREsted model#

We’ll load the dataset and model from the introduction notebook:

# Set the genome
genome = crested.Genome("mm10/genome.fa", "mm10/genome.chrom.sizes")
crested.register_genome(genome)  # Register the genome so that it's automatically used in every function

2026-02-20T13:22:01.718526+0100 INFO Genome genome registered.

# Read in the anndata
adata = ad.read_h5ad("crested/mouse_cortex.h5ad")

# Load a trained model
model_path = "mouse_biccn/finetuned_model/checkpoints/02.keras" # change to your model path
model = keras.models.load_model(model_path, compile=False)

Select the most informative regions per cell type#

To obtain cell type-characteristic patterns, we need to calculate contribution scores on highly specific regions. For this purpose, we’ve included a preprocessing function crested.pp.sort_and_filter_regions_on_specificity() to keep the top k most specific regions per cell type that you can use to filter your data before running modisco.

There are three options to select the top regions: either purely based on peak height, purely based on predictions, or on their combination. Here we show how to use the combination of both (which we recommend, see Johansen & Kempynck et al., 2025).

# Store predictions for all our regions in the anndata object
predictions = crested.tl.predict(adata, model)
adata.layers["biccn_model"] = predictions.T

2026-02-18T22:04:41.666517+0100 INFO Lazily importing module crested.tl. This could take a second...
   7/4274 ━━━━━━━━━━━━━━━━━━━━ 1:55 27ms/step
4272/4274 ━━━━━━━━━━━━━━━━━━━━ 0s 28ms/step
4274/4274 ━━━━━━━━━━━━━━━━━━━━ 145s 29ms/step

# Calculate the average of the ground truth and predictions
adata.layers['combined'] = (adata.X + adata.layers["biccn_model"])/2
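
To make the 'combined' layer concrete, here is a minimal NumPy sketch with made-up numbers. The arrays `truth` and `pred` are hypothetical stand-ins for `adata.X` and the prediction layer, both laid out as (classes, regions).

```python
import numpy as np

# Hypothetical toy data: 2 classes x 3 regions (same orientation as adata.X)
truth = np.array([[0.9, 0.1, 0.4],
                  [0.0, 0.8, 0.4]])
pred = np.array([[0.7, 0.3, 0.2],
                 [0.2, 0.6, 0.6]])

# Element-wise mean of measured and predicted signal
combined = (truth + pred) / 2
print(combined)
```

A region only keeps a high combined value for a class when both the measurement and the model agree that it is high there.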

We can use crested.pl.qc.sort_and_filter_cutoff() to look at the gini score distributions for the different regions per class.

Here, we take the top 2000 most specific regions, but top 500 or 1000 regions would give similar results.

%matplotlib inline
crested.pl.qc.sort_and_filter_cutoff(adata, model_name="combined", cutoffs=[500, 1000, 2000], max_k=5000)

2026-02-18T22:07:53.298189+0100 INFO Lazily importing module crested.pl. This could take a second...

../_images/cd95f5b3d5b788d4855387cc21af73b6fb564a1e7649742e26b8bd2f16e3c812.png
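
As an illustration of what a Gini score measures, the sketch below computes the standard Gini coefficient for one region's values across classes. It is a generic textbook formula written for this example; CREsted's internal computation may differ in its details, and the helper name `gini` is hypothetical.

```python
import numpy as np

def gini(values):
    """Gini coefficient of a non-negative 1D array (0 = uniform, higher = more concentrated)."""
    x = np.sort(np.asarray(values, dtype=float))  # ascending order
    n = x.size
    total = x.sum()
    if total == 0:
        return 0.0  # no signal at all: treat as perfectly uniform
    ranks = np.arange(1, n + 1)
    return float(np.sum((2 * ranks - n - 1) * x) / (n * total))

specific = gini([0.0, 0.0, 0.0, 1.0])  # signal in a single class
uniform = gini([1.0, 1.0, 1.0, 1.0])   # identical signal in every class
print(specific, uniform)
```

Regions whose signal is concentrated in one class score higher than regions that are accessible everywhere, which is why ranking on this score favors cell type-specific regions.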

# Filter most informative regions per class
top_k = 2000
adata_filtered = crested.pp.sort_and_filter_regions_on_specificity(adata, model_name="combined", top_k=top_k, method="gini", inplace=False)
adata_filtered

2026-02-18T22:09:56.457403+0100 INFO After sorting and filtering, kept 38000 regions.

AnnData object with n_obs × n_vars = 19 × 38000
    obs: 'file_path'
    var: 'chr', 'start', 'end', 'target_start', 'target_end', 'split', 'Class name', 'rank', 'gini_score'
    obsm: 'weights'
    layers: 'biccn_model', 'combined'

adata_filtered.var

	chr	start	end	target_start	target_end	split	Class name	rank	gini_score
region
chr10:45499432-45501546	chr10	45499432	45501546	45499989	45500989	val	Astro	1	0.830273
chr5:9868290-9870404	chr5	9868290	9870404	9868847	9869847	train	Astro	2	0.819487
chr2:65274604-65276718	chr2	65274604	65276718	65275161	65276161	train	Astro	3	0.818493
chrX:23135863-23137977	chrX	23135863	23137977	23136420	23137420	train	Astro	4	0.816922
chr3:115410072-115412186	chr3	115410072	115412186	115410629	115411629	train	Astro	5	0.815808
...	...	...	...	...	...	...	...	...	...
chr3:49982863-49984977	chr3	49982863	49984977	49983420	49984420	train	Vip	1996	0.547537
chr5:97165626-97167740	chr5	97165626	97167740	97166183	97167183	train	Vip	1997	0.547428
chr1:19158560-19160674	chr1	19158560	19160674	19159117	19160117	train	Vip	1998	0.547314
chr19:12142425-12144539	chr19	12142425	12144539	12142982	12143982	train	Vip	1999	0.547256
chr17:49808507-49810621	chr17	49808507	49810621	49809064	49810064	train	Vip	2000	0.547255

38000 rows × 9 columns

Calculating contribution scores per class#

Now you can calculate the contribution scores for all the regions in your filtered AnnData.
By default, the contribution scores are calculated using the expected integrated gradients method, but you can change this to simple integrated gradients to speed up the calculation. We’ve found anecdotally that this has a very minor effect on the quality of the contribution scores, while speeding up the calculation significantly.

crested.tl.contribution_scores_specific(
    input=adata_filtered,
    target_idx=None,  # We calculate for all classes
    model=model,
    output_dir="modisco_results_ft_2000",
    method="integrated_grad"
)

2026-02-18T22:09:57.090369+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:12:03.960105+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:14:02.281014+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:16:00.581708+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:17:59.153553+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:19:57.538069+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:21:55.870284+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:25:52.376153+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:27:50.404973+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:29:48.763463+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:31:46.930498+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:33:46.036175+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:35:44.168858+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:37:42.562841+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:39:40.721231+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:41:39.174031+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:43:37.742379+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).
2026-02-18T22:45:35.769823+0100 INFO Calculating contribution scores for 1 class(es) and 2000 region(s).

(array([[[[-1.48615626e-07, -5.54927112e-07,  4.07066267e-07, ...,
           -1.79133679e-07,  7.91571267e-07, -9.54171497e-08],
          [ 2.84154055e-07,  3.62958559e-07,  6.59639511e-07, ...,
            1.65198423e-06,  6.93648587e-07,  6.85019188e-07],
          [ 1.58720212e-07,  1.27419142e-07,  2.07420385e-07, ...,
           -7.57875739e-07, -7.95578956e-07, -2.73807672e-07],
          [-2.04508495e-07,  3.26645903e-07, -9.35306844e-07, ...,
           -9.60897182e-07,  3.23470566e-08, -3.14376678e-07]]],
 
 
        [[[ 1.38958785e-06, -1.30966725e-06, -5.06607375e-06, ...,
           -1.16397632e-05, -5.93204959e-06, -3.62574792e-06],
          [-3.17109595e-07,  2.51010647e-07, -3.71272307e-07, ...,
            1.86718694e-06,  2.69274597e-06,  4.92592653e-06],
          [ 2.27687622e-07, -7.58866179e-07,  1.47926028e-06, ...,
            3.09061252e-05,  1.20547866e-05, -5.71382179e-06],
          [-1.85486203e-06,  3.86895226e-06,  4.91772153e-06, ...,
           -1.04782321e-05, -5.92017886e-06,  3.52403322e-06]]],
 
 
        [[[-7.97643736e-07, -3.85100606e-07, -9.47989420e-08, ...,
            4.23821547e-07,  5.40903329e-07, -9.31905106e-07],
          [ 9.30548481e-07,  2.43477984e-06, -4.68601691e-08, ...,
           -1.04987203e-06, -1.52894552e-06, -8.91138612e-08],
          [-4.05994570e-07, -1.15031787e-06, -4.41645710e-07, ...,
            1.60890374e-06,  5.07899529e-07,  1.45172123e-07],
          [ 6.96884001e-07, -8.35631113e-07,  1.02211168e-07, ...,
           -4.88856870e-07,  3.50123827e-07,  7.61683054e-07]]],
 
 
        ...,
 
 
        [[[-1.57228055e-06,  3.17765525e-09,  2.14368313e-07, ...,
            1.71230147e-06,  4.87835405e-06,  1.38985786e-06],
          [ 2.29700549e-06, -2.85101783e-06,  1.34790946e-06, ...,
           -4.60111278e-06, -2.89423065e-06, -2.40853046e-06],
          [-1.60753962e-06,  7.44850468e-07, -2.92837126e-06, ...,
            1.70980388e-06, -6.49964363e-07,  1.65252720e-06],
          [ 6.95323479e-07,  2.22642188e-06,  1.17253705e-06, ...,
            2.39325936e-06, -2.82814631e-06, -4.49548168e-07]]],
 
 
        [[[ 1.07652374e-06, -2.11098222e-06, -4.70268617e-07, ...,
           -8.07893775e-06, -5.11730605e-06, -3.85602380e-06],
          [-1.14714965e-06,  6.61964975e-07, -1.61520995e-06, ...,
            3.53050041e-06,  2.91305514e-06,  2.93008748e-06],
          [ 3.23378771e-07,  2.05736205e-06,  2.07632320e-06, ...,
            6.25897792e-06, -7.90520403e-07,  3.05838034e-06],
          [-1.81226682e-07, -1.31726586e-06, -1.25529581e-07, ...,
           -8.64686444e-06,  2.35491757e-06, -2.12834334e-06]]],
 
 
        [[[-9.90992021e-07,  3.45539547e-06, -7.99445843e-06, ...,
           -1.64329231e-06,  1.53056931e-06,  3.08550898e-06],
          [-5.81575932e-07, -4.70697023e-06,  6.51809933e-06, ...,
            3.41147097e-06, -6.72604438e-06,  1.11118268e-07],
          [-5.77494689e-07,  3.27210250e-06, -2.31046465e-06, ...,
            2.65483209e-06,  3.99918690e-06,  4.32558656e-09],
          [ 1.26465181e-07, -2.15214413e-06,  1.48661877e-06, ...,
           -8.24616927e-06, -1.60217598e-06, -4.17409910e-06]]]]),
 array([[[0., 0., 1., ..., 0., 0., 0.],
         [0., 1., 0., ..., 1., 0., 1.],
         [0., 0., 0., ..., 0., 0., 0.],
         [1., 0., 0., ..., 0., 1., 0.]],
 
        [[1., 0., 0., ..., 0., 1., 1.],
         [0., 1., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 1., 0., 0.],
         [0., 0., 1., ..., 0., 0., 0.]],
 
        [[1., 0., 0., ..., 0., 0., 1.],
         [0., 0., 0., ..., 1., 1., 0.],
         [0., 0., 1., ..., 0., 0., 0.],
         [0., 1., 0., ..., 0., 0., 0.]],
 
        ...,
 
        [[1., 0., 0., ..., 1., 0., 1.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 1., 0., ..., 0., 1., 0.],
         [0., 0., 1., ..., 0., 0., 0.]],
 
        [[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 1., 0.],
         [1., 1., 0., ..., 0., 0., 0.],
         [0., 0., 1., ..., 1., 0., 1.]],
 
        [[1., 0., 1., ..., 1., 0., 0.],
         [0., 0., 0., ..., 0., 1., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 1., 0., ..., 0., 0., 1.]]], dtype=float32))
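
For readers unfamiliar with the integrated gradients method selected above, here is a minimal NumPy sketch on a toy function with a single baseline. The function `f`, its analytic gradient, and all numbers are assumptions chosen for illustration; this is not how CREsted computes scores on a Keras model.

```python
import numpy as np

def f(x, w):
    """Toy differentiable 'model': a single tanh unit."""
    return np.tanh(np.dot(w, x))

def grad_f(x, w):
    """Analytic gradient of f with respect to x."""
    return (1.0 - np.tanh(np.dot(w, x)) ** 2) * w

def integrated_gradients(x, baseline, w, steps=200):
    # Midpoint Riemann sum of the gradient along the straight path baseline -> x
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([grad_f(baseline + a * (x - baseline), w) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

w = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 0.5, 0.25])
baseline = np.zeros_like(x)

attributions = integrated_gradients(x, baseline, w)
# Completeness check: attributions should sum to f(x) - f(baseline)
gap = attributions.sum() - (f(x, w) - f(baseline, w))
print(attributions, gap)
```

Expected integrated gradients, as the term is commonly used, repeats this over several baselines and averages the results, which is consistent with the single-baseline variant being cheaper to compute.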

Running tfmodisco-lite#

When this is done, you can run TF-MoDISco-lite on the saved contribution scores to find motifs that are important for the classification/regression task.

You could use the tfmodisco package directly to do this, or you could use the crested.tl.modisco.tfmodisco() function which is essentially a wrapper around the tfmodisco package.

Note that from here on, you can use contribution scores from any model trained in any framework, as this analysis just requires a set of one hot encoded sequences and contribution scores per cell type stored in the same directory.
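
If your scores come from another framework, the sequences need to be one-hot encoded. The sketch below shows one common way to do that, producing a (4, length) array per sequence like the arrays printed above. The A, C, G, T row order and the helper name `one_hot_encode` are assumptions; the file names and exact layout that CREsted expects in the directory are not covered here, so check the function documentation.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed row order; verify against your own model

def one_hot_encode(sequence):
    """Encode a DNA string as a (4, length) float32 array; unknown bases such as N stay all-zero."""
    encoded = np.zeros((len(ALPHABET), len(sequence)), dtype=np.float32)
    for position, base in enumerate(sequence.upper()):
        row = ALPHABET.find(base)
        if row >= 0:
            encoded[row, position] = 1.0
    return encoded

one_hot = one_hot_encode("ACGTN")
print(one_hot)
```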

If you don’t have tomtom available on the command line, set report=False in crested.tl.modisco.tfmodisco(). You won’t get an HTML report linking each pattern to its closest match in the motif database, but you’ll still get a collection of clustered seqlets for use in the rest of the notebook.

meme_db, motif_to_tf_file = crested.get_motif_db()

# run tfmodisco on the contribution scores
crested.tl.modisco.tfmodisco(
    window=1000,
    output_dir="modisco_results_ft_2000",
    contrib_dir="modisco_results_ft_2000",
    report=False,  # Optional: if True, matches patterns to the MEME motif database; keep False if TOMTOM is not available on the command line
    meme_db=meme_db,  # Path to the MEME database file; not needed when no report is generated
    max_seqlets=20000,
)

2026-02-18T22:48:10.031481+0100 INFO No class names provided, using all found in the contribution directory: ['Pvalb', 'Lamp5', 'L5ET', 'Oligo', 'L6IT', 'L2_3IT', 'Endo', 'Sst', 'L5IT', 'Sncg', 'Astro', 'VLMC', 'OPC', 'L6CT', 'Vip', 'L5_6NP', 'L6b', 'Micro_PVM', 'SstChodl']
2026-02-18T22:48:10.527970+0100 INFO Running modisco for class: Pvalb
Using 11070 positive seqlets
Extracted 1602 negative seqlets
2026-02-18T22:55:24.463964+0100 INFO Running modisco for class: Lamp5
Using 11127 positive seqlets
Extracted 1617 negative seqlets
2026-02-18T23:02:18.775590+0100 INFO Running modisco for class: L5ET
Using 12954 positive seqlets
Extracted 1638 negative seqlets
2026-02-18T23:10:50.207266+0100 INFO Running modisco for class: Oligo
Using 13245 positive seqlets
Extracted 3037 negative seqlets
2026-02-18T23:20:09.298092+0100 INFO Running modisco for class: L6IT
Using 15015 positive seqlets
2026-02-18T23:29:28.185590+0100 INFO Running modisco for class: L2_3IT
Using 15115 positive seqlets
Extracted 2734 negative seqlets
2026-02-18T23:39:27.206960+0100 INFO Running modisco for class: Endo
Using 7876 positive seqlets
Extracted 1529 negative seqlets
2026-02-18T23:44:16.357842+0100 INFO Running modisco for class: Sst
Using 13196 positive seqlets
Extracted 2422 negative seqlets
2026-02-18T23:53:55.514966+0100 INFO Running modisco for class: L5IT
Using 14035 positive seqlets
Extracted 2655 negative seqlets
2026-02-19T00:04:25.113868+0100 INFO Running modisco for class: Sncg
Using 11203 positive seqlets

Analysis of cell-type specific sequence patterns#

Once you have obtained your modisco results, you can plot the patterns that were found using the crested.pl.modisco.modisco_results() function.

%matplotlib inline
top_k = 1000
crested.pl.modisco.modisco_results(
    classes=["Astro", "L5ET", "Vip", "Oligo"],
    contribution="positive",
    contribution_dir="modisco_results_ft_2000",
    num_seq=top_k,
    y_max=0.15,
    viz="pwm",
)  # Set viz="contrib" to plot contribution scores instead of PWMs

2026-02-19T10:17:00.047868+0100 INFO Starting genomic contributions plot for classes: ['Astro', 'L5ET', 'Vip', 'Oligo']

../_images/26a75899b7fce3ec8cab984f287cc4723b1be47d056dfb5d1b76d2c24b32a522.png

Matching patterns across cell types#

Because the patterns were calculated independently for each cell type, we do not yet know whether, or how strongly, they overlap. It is useful to get an overview of which patterns occur across multiple cell types, how important they are, and whether some patterns are unique to a small selection of classes. For this purpose, we provide a pattern clustering algorithm that starts from the tfmodisco-lite results and returns two objects: a pattern matrix, which contains the importance of each clustered pattern per cell type, and a pattern dictionary, which describes all clustered patterns.

First, we’ll collect the modisco result files for our selected classes with crested.tl.modisco.match_h5_files_to_classes().

# First we obtain the resulting modisco files per class
matched_files = crested.tl.modisco.match_h5_files_to_classes(
    contribution_dir="modisco_results_ft_2000", classes=list(adata.obs_names)
)

2026-02-19T11:48:01.680547+0100 INFO Lazily importing module crested.tl. This could take a second...

A quick pairwise comparison#

Before we do any pattern clustering, we can check for each independent pattern how similar it is to all the other patterns using memesuite-lite. We use crested.tl.modisco.calculate_tomtom_similarity_per_pattern(). This function returns a pairwise similarity matrix of every unique pattern, together with a list of ids and a dictionary containing additional information per pattern.

sim_matrix, pattern_ids, pattern_dict = crested.tl.modisco.calculate_tomtom_similarity_per_pattern(
    matched_files=matched_files, trim_ic_threshold=0.025, verbose=True
)

Reading file modisco_results_ft_2000/Astro_modisco_results.h5
Reading file modisco_results_ft_2000/Endo_modisco_results.h5
Reading file modisco_results_ft_2000/L2_3IT_modisco_results.h5
Reading file modisco_results_ft_2000/L5ET_modisco_results.h5
Reading file modisco_results_ft_2000/L5IT_modisco_results.h5
Reading file modisco_results_ft_2000/L5_6NP_modisco_results.h5
Reading file modisco_results_ft_2000/L6CT_modisco_results.h5
Reading file modisco_results_ft_2000/L6IT_modisco_results.h5
Reading file modisco_results_ft_2000/L6b_modisco_results.h5
Reading file modisco_results_ft_2000/Lamp5_modisco_results.h5
Reading file modisco_results_ft_2000/Micro_PVM_modisco_results.h5
Reading file modisco_results_ft_2000/OPC_modisco_results.h5
Reading file modisco_results_ft_2000/Oligo_modisco_results.h5
Reading file modisco_results_ft_2000/Pvalb_modisco_results.h5
Reading file modisco_results_ft_2000/Sncg_modisco_results.h5
Reading file modisco_results_ft_2000/Sst_modisco_results.h5
Reading file modisco_results_ft_2000/SstChodl_modisco_results.h5
Reading file modisco_results_ft_2000/VLMC_modisco_results.h5
Reading file modisco_results_ft_2000/Vip_modisco_results.h5
Total patterns: 548

We can now visualize these similarities using crested.pl.modisco.clustermap_tomtom_similarities(). Because the full similarity matrix is often large, we subset it to a relevant selection. First, we will pick one pattern and find the patterns in other cell types that are similar to it. We add some additional group labels for visualization purposes.

nn = {"Astro", "Endo", "Micro_PVM", "OPC", "Oligo", "VLMC"}
exc = {"L2_3IT", "L5ET", "L5IT", "L5_6NP", "L6CT", "L6IT", "L6b"}
inh = {"Lamp5", "Pvalb", "Sncg", "Sst", "SstChodl", "Vip"}

groups, groups_2 = [], []
for pattern_id in pattern_ids:
    # Drop the last three "_"-separated fields (e.g. "pos_patterns_1") to recover the class name
    ct = "_".join(pattern_id.split("_")[:-3])
    if ct in nn:
        groups.append("Non-neuronal")
    elif ct in exc:
        groups.append("Excitatory")
    elif ct in inh:
        groups.append("Inhibitory")
    else:
        raise ValueError(f"Unknown class: {ct}")
    groups_2.append(ct)

unique_cats = pd.unique(groups_2)
group_colors_2 = {cat: mcolors.to_hex(plt.get_cmap("tab20", len(unique_cats))(i)) for i, cat in enumerate(unique_cats)}
group_colors = {"Non-neuronal": "skyblue", "Excitatory": "salmon", "Inhibitory": "green"}

%matplotlib inline
crested.pl.modisco.clustermap_tomtom_similarities(
    sim_matrix=sim_matrix,
    ids=pattern_ids,
    pattern_dict=pattern_dict,
    group_info=[(groups, group_colors), (groups_2, group_colors_2)],  # Grouping labels
    query_id="Vip_pos_patterns_1",  # Find patterns similar to this one
    threshold=3,  # TOMTOM similarity threshold, we take the -log10(pval)
    min_seqlets=100,  # Add a minimum amount of seqlets to take the most relevant patterns
)

2026-02-19T11:49:10.404031+0100 INFO Lazily importing module crested.pl. This could take a second...

../_images/7d348dd34ef5b92dc6ba12aa242b98c3ea2aad4e5d3a7af3e70b7119189818d2.png
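
The query-based subsetting can be pictured with a small NumPy sketch: take the query pattern's row of the similarity matrix and keep every pattern whose score reaches the threshold. The matrix values are invented, and whether CREsted uses `>=` or `>` and how it treats the diagonal are assumptions here, not documented behavior.

```python
import numpy as np

# Hypothetical 4x4 matrix of -log10(p-value) similarities between patterns
ids = ["Vip_pos_patterns_1", "Sst_pos_patterns_0", "Astro_pos_patterns_2", "Lamp5_pos_patterns_3"]
sim = np.array([
    [20.0, 5.1, 0.4, 3.2],
    [5.1, 20.0, 0.2, 2.0],
    [0.4, 0.2, 20.0, 0.1],
    [3.2, 2.0, 0.1, 20.0],
])

query_id = "Vip_pos_patterns_1"
threshold = 3  # -log10(p) >= 3 corresponds to p <= 1e-3

# Keep every pattern whose similarity to the query reaches the threshold
query_row = sim[ids.index(query_id)]
keep = np.flatnonzero(query_row >= threshold)
similar_ids = [ids[i] for i in keep]
sub_matrix = sim[np.ix_(keep, keep)]  # square sub-matrix for plotting
print(similar_ids)
```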

We can also use this to look at all the different patterns in a subset of cell types, and see if we can find interesting groups of similar motifs.

crested.pl.modisco.clustermap_tomtom_similarities(
    sim_matrix=sim_matrix,
    ids=pattern_ids,
    pattern_dict=pattern_dict,
    group_info=[(groups, group_colors), (groups_2, group_colors_2)],  # Grouping labels
    class_names=["Lamp5", "Pvalb", "Sncg", "Sst", "SstChodl", "Vip"],  # Subset of classes to use patterns from
    min_seqlets=300,  # Add a minimum amount of seqlets to take the most relevant patterns
)

<seaborn.matrix.ClusterGrid at 0x145cc3bbcd90>

../_images/1767dfb3500d4574cea1fb017601edcddb15348c3d55450974516f65fdf12894.png

Pattern clustering across cell types#

Since many patterns are similar and can be matched across cell types, we can also cluster them into groups and compare the importance of that matched motif over all cell types. We cluster all separate patterns using crested.tl.modisco.process_patterns() and create a pattern matrix with crested.tl.modisco.create_pattern_matrix().
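
Similarity thresholds in this section are expressed as -log10 of the TOMTOM p-value, so it can help to translate them back to p-values when choosing a cutoff. The two helper functions below are hypothetical and only perform that conversion.

```python
import math

def neg_log10(p_value):
    """Convert a p-value to the -log10 scale used for the similarity scores."""
    return -math.log10(p_value)

def to_p_value(score):
    """Invert the transform: a -log10 score back to a p-value."""
    return 10.0 ** (-score)

p_cutoff = to_p_value(4.25)  # p-value corresponding to a threshold of 4.25
score = neg_log10(1e-3)      # a p-value of 0.001 on the score scale
print(p_cutoff, score)
```

A higher threshold is therefore stricter: raising it by 1 requires a tenfold smaller p-value before two patterns are considered a match.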

# Then we cluster matching patterns, and define a pattern matrix [#classes, #patterns] describing their importance
all_patterns = crested.tl.modisco.process_patterns(
    matched_files,
    sim_threshold=4.25,  # The similarity threshold used for matching patterns. We take the -log10(pval), pval obtained through TOMTOM matching from memesuite-lite
    trim_ic_threshold=0.05,  # Information content (IC) threshold on which to trim patterns
    discard_ic_threshold=0.2,  # IC threshold used for discarding single instance patterns
    verbose=True,  # Useful for doing sanity checks on matching patterns
)
pattern_matrix = crested.tl.modisco.create_pattern_matrix(
    classes=list(adata.obs_names),
    all_patterns=all_patterns,
    normalize=False,
    pattern_parameter="seqlet_count_log",
)
pattern_matrix.shape

Reading file modisco_results_ft_2000/Astro_modisco_results.h5
Match between Astro_pos_patterns_9 and Astro_pos_patterns_4 with similarity score 5.68
Match between Astro_pos_patterns_16 and Astro_pos_patterns_2 with similarity score 8.06
Match between Astro_pos_patterns_18 and Astro_pos_patterns_12 with similarity score 6.32
Reading file modisco_results_ft_2000/Endo_modisco_results.h5
Match between Endo_neg_patterns_0 and Astro_neg_patterns_2 with similarity score 5.24
Match between Endo_neg_patterns_7 and Endo_neg_patterns_6 with similarity score 4.81
Match between Endo_neg_patterns_8 and Astro_pos_patterns_8 with similarity score 4.90
Match between Endo_neg_patterns_9 and Astro_pos_patterns_13 with similarity score 4.31
Match between Endo_neg_patterns_14 and Endo_neg_patterns_7 with similarity score 5.25
Match between Endo_pos_patterns_1 and Endo_neg_patterns_10 with similarity score 4.98
Match between Endo_pos_patterns_6 and Endo_pos_patterns_3 with similarity score 5.09
Match between Endo_pos_patterns_8 and Astro_pos_patterns_0 with similarity score 4.62
Match between Endo_pos_patterns_17 and Astro_neg_patterns_1 with similarity score 4.61
Match between Endo_pos_patterns_19 and Endo_pos_patterns_12 with similarity score 4.40
Match between Endo_pos_patterns_21 and Endo_pos_patterns_1 with similarity score 5.15
Reading file modisco_results_ft_2000/L2_3IT_modisco_results.h5
Match between L2_3IT_neg_patterns_4 and Endo_pos_patterns_9 with similarity score 5.00
Match between L2_3IT_neg_patterns_5 and Endo_neg_patterns_3 with similarity score 4.29
Match between L2_3IT_neg_patterns_11 and Endo_pos_patterns_22 with similarity score 6.01
Match between L2_3IT_neg_patterns_12 and Astro_neg_patterns_1 with similarity score 4.44
Match between L2_3IT_neg_patterns_15 and L2_3IT_neg_patterns_13 with similarity score 4.66
Match between L2_3IT_neg_patterns_17 and L2_3IT_neg_patterns_13 with similarity score 5.09
Match between L2_3IT_pos_patterns_0 and L2_3IT_neg_patterns_10 with similarity score 4.60
Match between L2_3IT_pos_patterns_1 and Endo_pos_patterns_5 with similarity score 5.81
Match between L2_3IT_pos_patterns_2 and Endo_pos_patterns_20 with similarity score 7.10
Match between L2_3IT_pos_patterns_6 and Astro_pos_patterns_0 with similarity score 5.09
Match between L2_3IT_pos_patterns_10 and Astro_pos_patterns_10 with similarity score 6.35
Match between L2_3IT_pos_patterns_11 and Astro_pos_patterns_11 with similarity score 6.21
Match between L2_3IT_pos_patterns_13 and Astro_pos_patterns_0 with similarity score 5.57
Match between L2_3IT_pos_patterns_15 and Endo_pos_patterns_24 with similarity score 10.84
Match between L2_3IT_pos_patterns_16 and L2_3IT_neg_patterns_7 with similarity score 5.11
Match between L2_3IT_pos_patterns_17 and L2_3IT_pos_patterns_16 with similarity score 7.20
Match between L2_3IT_pos_patterns_18 and L2_3IT_pos_patterns_17 with similarity score 6.15
Match between L2_3IT_pos_patterns_20 and L2_3IT_pos_patterns_17 with similarity score 5.71
Reading file modisco_results_ft_2000/L5ET_modisco_results.h5
Match between L5ET_neg_patterns_0 and Astro_neg_patterns_0 with similarity score 5.57
Match between L5ET_neg_patterns_1 and Astro_pos_patterns_19 with similarity score 4.83
Match between L5ET_neg_patterns_7 and Astro_neg_patterns_1 with similarity score 5.03
Match between L5ET_neg_patterns_9 and L2_3IT_neg_patterns_11 with similarity score 6.99
Match between L5ET_pos_patterns_0 and Astro_pos_patterns_0 with similarity score 4.97
Match between L5ET_pos_patterns_2 and L2_3IT_pos_patterns_17 with similarity score 7.46
Match between L5ET_pos_patterns_3 and L2_3IT_pos_patterns_11 with similarity score 6.80
Match between L5ET_pos_patterns_4 and L2_3IT_pos_patterns_5 with similarity score 7.17
Match between L5ET_pos_patterns_5 and L2_3IT_pos_patterns_3 with similarity score 5.89
Match between L5ET_pos_patterns_6 and L2_3IT_pos_patterns_0 with similarity score 7.25
Match between L5ET_pos_patterns_8 and L2_3IT_pos_patterns_4 with similarity score 4.84
Match between L5ET_pos_patterns_9 and L2_3IT_pos_patterns_2 with similarity score 5.92
Match between L5ET_pos_patterns_12 and L2_3IT_pos_patterns_14 with similarity score 4.72
Match between L5ET_pos_patterns_13 and L5ET_pos_patterns_2 with similarity score 6.68
Match between L5ET_pos_patterns_16 and L5ET_pos_patterns_5 with similarity score 5.94
Reading file modisco_results_ft_2000/L5IT_modisco_results.h5
Match between L5IT_neg_patterns_2 and L2_3IT_neg_patterns_11 with similarity score 8.11
Match between L5IT_neg_patterns_5 and Endo_neg_patterns_3 with similarity score 4.60
Match between L5IT_neg_patterns_6 and L2_3IT_neg_patterns_4 with similarity score 6.17
Match between L5IT_neg_patterns_7 and L2_3IT_neg_patterns_13 with similarity score 4.48
Match between L5IT_neg_patterns_9 and L2_3IT_neg_patterns_19 with similarity score 5.97
Match between L5IT_neg_patterns_10 and L2_3IT_neg_patterns_9 with similarity score 7.09
Match between L5IT_neg_patterns_12 and Astro_neg_patterns_1 with similarity score 4.61
Match between L5IT_neg_patterns_14 and L5IT_neg_patterns_9 with similarity score 4.81
Match between L5IT_neg_patterns_15 and L5IT_neg_patterns_2 with similarity score 6.57
Match between L5IT_neg_patterns_16 and Astro_neg_patterns_3 with similarity score 5.60
Match between L5IT_pos_patterns_0 and L5ET_pos_patterns_2 with similarity score 9.32
Match between L5IT_pos_patterns_1 and L5ET_pos_patterns_5 with similarity score 8.90
Match between L5IT_pos_patterns_2 and L5ET_pos_patterns_6 with similarity score 8.73
Match between L5IT_pos_patterns_4 and Astro_pos_patterns_3 with similarity score 9.17
Match between L5IT_pos_patterns_5 and L5ET_pos_patterns_9 with similarity score 7.17
Match between L5IT_pos_patterns_6 and L5ET_pos_patterns_0 with similarity score 5.13
Match between L5IT_pos_patterns_7 and L2_3IT_pos_patterns_5 with similarity score 6.86
Match between L5IT_pos_patterns_8 and L5ET_pos_patterns_8 with similarity score 4.36
Match between L5IT_pos_patterns_11 and L2_3IT_pos_patterns_8 with similarity score 4.69
Match between L5IT_pos_patterns_13 and L5ET_pos_patterns_2 with similarity score 4.91
Match between L5IT_pos_patterns_14 and L5ET_pos_patterns_6 with similarity score 6.24
Match between L5IT_pos_patterns_15 and L2_3IT_pos_patterns_7 with similarity score 13.52
Match between L5IT_pos_patterns_16 and L5ET_pos_patterns_2 with similarity score 6.71
Match between L5IT_pos_patterns_17 and L5ET_pos_patterns_6 with similarity score 7.29
Match between L5IT_pos_patterns_18 and L5ET_pos_patterns_3 with similarity score 5.68
Match between L5IT_pos_patterns_19 and L5ET_pos_patterns_2 with similarity score 5.19
Match between L5IT_pos_patterns_22 and L5IT_pos_patterns_3 with similarity score 7.10
Match between L5IT_pos_patterns_23 and L5IT_pos_patterns_21 with similarity score 7.44
Match between L5IT_pos_patterns_24 and L5ET_pos_patterns_2 with similarity score 4.77
Reading file modisco_results_ft_2000/L5_6NP_modisco_results.h5
Match between L5_6NP_neg_patterns_0 and L5ET_neg_patterns_2 with similarity score 4.67
Match between L5_6NP_neg_patterns_1 and L5IT_neg_patterns_3 with similarity score 6.92
Match between L5_6NP_neg_patterns_4 and L5ET_pos_patterns_2 with similarity score 7.49
Match between L5_6NP_neg_patterns_5 and L5IT_neg_patterns_2 with similarity score 4.57
Match between L5_6NP_neg_patterns_10 and L5_6NP_neg_patterns_1 with similarity score 5.56
Match between L5_6NP_neg_patterns_11 and L5_6NP_neg_patterns_6 with similarity score 5.14
Match between L5_6NP_pos_patterns_0 and L5IT_pos_patterns_1 with similarity score 8.37
Match between L5_6NP_pos_patterns_1 and L5ET_pos_patterns_2 with similarity score 10.80
Match between L5_6NP_pos_patterns_2 and L5_6NP_neg_patterns_8 with similarity score 5.29
Match between L5_6NP_pos_patterns_3 and L5ET_pos_patterns_9 with similarity score 5.76
Match between L5_6NP_pos_patterns_4 and Endo_pos_patterns_0 with similarity score 5.84
Match between L5_6NP_pos_patterns_5 and L5ET_pos_patterns_8 with similarity score 4.84
Match between L5_6NP_pos_patterns_6 and L2_3IT_pos_patterns_7 with similarity score 9.55
Match between L5_6NP_pos_patterns_7 and L5_6NP_pos_patterns_2 with similarity score 4.25
Match between L5_6NP_pos_patterns_8 and L2_3IT_pos_patterns_15 with similarity score 7.26
Match between L5_6NP_pos_patterns_10 and L5_6NP_pos_patterns_2 with similarity score 5.14
Reading file modisco_results_ft_2000/L6CT_modisco_results.h5
Match between L6CT_neg_patterns_1 and Astro_neg_patterns_0 with similarity score 4.97
Match between L6CT_neg_patterns_2 and L5ET_neg_patterns_1 with similarity score 5.70
Match between L6CT_neg_patterns_4 and L5_6NP_neg_patterns_13 with similarity score 5.20
Match between L6CT_neg_patterns_6 and Astro_neg_patterns_1 with similarity score 4.61
Match between L6CT_neg_patterns_7 and L2_3IT_neg_patterns_16 with similarity score 5.69
Match between L6CT_neg_patterns_9 and L5_6NP_neg_patterns_1 with similarity score 7.20
Match between L6CT_pos_patterns_0 and L5ET_pos_patterns_0 with similarity score 5.48
Match between L6CT_pos_patterns_1 and L5IT_pos_patterns_3 with similarity score 9.18
Match between L6CT_pos_patterns_2 and L6CT_neg_patterns_7 with similarity score 8.09
Match between L6CT_pos_patterns_3 and L5IT_pos_patterns_1 with similarity score 7.28
Match between L6CT_pos_patterns_4 and L5ET_pos_patterns_7 with similarity score 4.25
Match between L6CT_pos_patterns_5 and L5IT_pos_patterns_10 with similarity score 13.11
Match between L6CT_pos_patterns_6 and L5ET_pos_patterns_9 with similarity score 6.64
Match between L6CT_pos_patterns_7 and L5ET_pos_patterns_2 with similarity score 7.99
Match between L6CT_pos_patterns_8 and L5IT_pos_patterns_12 with similarity score 15.00
Match between L6CT_pos_patterns_9 and L5ET_pos_patterns_8 with similarity score 4.47
Match between L6CT_pos_patterns_10 and L2_3IT_pos_patterns_5 with similarity score 6.57
Match between L6CT_pos_patterns_12 and Astro_neg_patterns_3 with similarity score 4.99
Match between L6CT_pos_patterns_13 and L2_3IT_pos_patterns_7 with similarity score 13.15
Match between L6CT_pos_patterns_14 and L5IT_pos_patterns_1 with similarity score 5.45
Reading file modisco_results_ft_2000/L6IT_modisco_results.h5
Match between L6IT_neg_patterns_2 and L5ET_neg_patterns_5 with similarity score 6.75
Match between L6IT_neg_patterns_3 and L6CT_neg_patterns_10 with similarity score 4.60
Match between L6IT_neg_patterns_5 and L5_6NP_neg_patterns_1 with similarity score 4.72
Match between L6IT_neg_patterns_6 and Endo_pos_patterns_14 with similarity score 4.87
Match between L6IT_neg_patterns_8 and L5_6NP_neg_patterns_12 with similarity score 8.72
Match between L6IT_neg_patterns_10 and L6IT_neg_patterns_4 with similarity score 6.40
Match between L6IT_pos_patterns_0 and L6CT_pos_patterns_2 with similarity score 8.55
Match between L6IT_pos_patterns_1 and L6CT_pos_patterns_0 with similarity score 5.57
Match between L6IT_pos_patterns_2 and L5IT_pos_patterns_1 with similarity score 8.86
Match between L6IT_pos_patterns_3 and L2_3IT_pos_patterns_1 with similarity score 5.78
Match between L6IT_pos_patterns_4 and L2_3IT_pos_patterns_5 with similarity score 6.60
Match between L6IT_pos_patterns_5 and L5ET_pos_patterns_9 with similarity score 5.32
Match between L6IT_pos_patterns_6 and L6CT_pos_patterns_5 with similarity score 13.61
Match between L6IT_pos_patterns_7 and L5IT_pos_patterns_3 with similarity score 9.15
Match between L6IT_pos_patterns_10 and L2_3IT_pos_patterns_7 with similarity score 12.07
Match between L6IT_pos_patterns_12 and L6CT_pos_patterns_11 with similarity score 5.92
Match between L6IT_pos_patterns_14 and L6IT_pos_patterns_7 with similarity score 4.95
Match between L6IT_pos_patterns_15 and L6IT_pos_patterns_7 with similarity score 5.93
Reading file modisco_results_ft_2000/L6b_modisco_results.h5
Match between L6b_neg_patterns_1 and L5IT_neg_patterns_7 with similarity score 4.87
Match between L6b_neg_patterns_2 and Endo_neg_patterns_0 with similarity score 5.17
Match between L6b_neg_patterns_3 and L6IT_pos_patterns_1 with similarity score 4.81
Match between L6b_neg_patterns_4 and L5IT_neg_patterns_9 with similarity score 5.50
Match between L6b_neg_patterns_5 and L6IT_neg_patterns_5 with similarity score 8.63
Match between L6b_neg_patterns_8 and L2_3IT_neg_patterns_9 with similarity score 7.64
Match between L6b_neg_patterns_10 and L6b_neg_patterns_4 with similarity score 4.55
Match between L6b_neg_patterns_11 and L6IT_neg_patterns_2 with similarity score 8.08
Match between L6b_neg_patterns_12 and L6IT_neg_patterns_3 with similarity score 5.22
Match between L6b_pos_patterns_0 and L5_6NP_pos_patterns_2 with similarity score 5.80
Match between L6b_pos_patterns_1 and L6IT_pos_patterns_7 with similarity score 7.35
Match between L6b_pos_patterns_2 and L2_3IT_pos_patterns_1 with similarity score 7.58
Match between L6b_pos_patterns_3 and L5ET_pos_patterns_9 with similarity score 6.11
Match between L6b_pos_patterns_4 and L6IT_pos_patterns_11 with similarity score 4.99
Match between L6b_pos_patterns_5 and L5ET_pos_patterns_6 with similarity score 7.88
Match between L6b_pos_patterns_6 and L6IT_pos_patterns_6 with similarity score 12.75
Match between L6b_pos_patterns_8 and L6IT_pos_patterns_10 with similarity score 13.76
Match between L6b_pos_patterns_10 and L6IT_pos_patterns_4 with similarity score 5.21
Match between L6b_pos_patterns_11 and L6IT_pos_patterns_9 with similarity score 5.42
Reading file modisco_results_ft_2000/Lamp5_modisco_results.h5
Match between Lamp5_neg_patterns_1 and L5IT_pos_patterns_1 with similarity score 7.04
Match between Lamp5_neg_patterns_2 and L5ET_neg_patterns_13 with similarity score 6.76
Match between Lamp5_pos_patterns_0 and L6IT_pos_patterns_1 with similarity score 5.27
Match between Lamp5_pos_patterns_1 and Astro_pos_patterns_1 with similarity score 4.80
Match between Lamp5_pos_patterns_2 and L5ET_pos_patterns_2 with similarity score 7.76
Match between Lamp5_pos_patterns_3 and L5ET_pos_patterns_7 with similarity score 4.62
Match between Lamp5_pos_patterns_4 and Astro_pos_patterns_2 with similarity score 5.20
Match between Lamp5_pos_patterns_5 and L5ET_pos_patterns_9 with similarity score 9.00
Match between Lamp5_pos_patterns_6 and L6b_pos_patterns_11 with similarity score 5.79
Match between Lamp5_pos_patterns_8 and L6IT_pos_patterns_6 with similarity score 6.63
Match between Lamp5_pos_patterns_9 and L6IT_neg_patterns_2 with similarity score 4.59
Match between Lamp5_pos_patterns_11 and Lamp5_pos_patterns_8 with similarity score 4.82
Match between Lamp5_pos_patterns_13 and Astro_pos_patterns_3 with similarity score 6.00
Match between Lamp5_pos_patterns_15 and Lamp5_pos_patterns_4 with similarity score 6.40
Match between Lamp5_pos_patterns_16 and L6IT_pos_patterns_10 with similarity score 10.82
Reading file modisco_results_ft_2000/Micro_PVM_modisco_results.h5
Match between Micro_PVM_neg_patterns_0 and L6b_neg_patterns_13 with similarity score 5.78
Match between Micro_PVM_neg_patterns_2 and L6CT_pos_patterns_2 with similarity score 6.89
Match between Micro_PVM_neg_patterns_3 and L6IT_neg_patterns_5 with similarity score 5.55
Match between Micro_PVM_neg_patterns_5 and L5IT_neg_patterns_11 with similarity score 6.14
Match between Micro_PVM_neg_patterns_6 and L6IT_neg_patterns_4 with similarity score 4.37
Match between Micro_PVM_pos_patterns_0 and Micro_PVM_neg_patterns_1 with similarity score 8.74
Match between Micro_PVM_pos_patterns_8 and Lamp5_pos_patterns_2 with similarity score 9.56
Match between Micro_PVM_pos_patterns_16 and Micro_PVM_pos_patterns_9 with similarity score 4.26
Match between Micro_PVM_pos_patterns_18 and L5IT_pos_patterns_23 with similarity score 4.50
Match between Micro_PVM_pos_patterns_22 and Micro_PVM_pos_patterns_0 with similarity score 8.80
Reading file modisco_results_ft_2000/OPC_modisco_results.h5
Match between OPC_neg_patterns_0 and L6b_neg_patterns_2 with similarity score 4.87
Match between OPC_neg_patterns_2 and Micro_PVM_neg_patterns_6 with similarity score 6.06
Match between OPC_neg_patterns_3 and Micro_PVM_neg_patterns_6 with similarity score 5.06
Match between OPC_pos_patterns_1 and L6IT_pos_patterns_1 with similarity score 5.27
Match between OPC_pos_patterns_2 and OPC_pos_patterns_0 with similarity score 6.95
Match between OPC_pos_patterns_3 and Astro_pos_patterns_5 with similarity score 5.14
Match between OPC_pos_patterns_4 and Micro_PVM_pos_patterns_10 with similarity score 4.36
Match between OPC_pos_patterns_5 and Micro_PVM_pos_patterns_14 with similarity score 4.92
Match between OPC_pos_patterns_6 and Astro_pos_patterns_7 with similarity score 5.85
Match between OPC_pos_patterns_8 and Astro_pos_patterns_4 with similarity score 4.53
Match between OPC_pos_patterns_9 and L6IT_pos_patterns_10 with similarity score 8.83
Match between OPC_pos_patterns_10 and Micro_PVM_pos_patterns_4 with similarity score 6.21
Match between OPC_pos_patterns_12 and OPC_pos_patterns_5 with similarity score 4.98
Match between OPC_pos_patterns_13 and L6b_pos_patterns_0 with similarity score 5.57
Match between OPC_pos_patterns_15 and Endo_neg_patterns_9 with similarity score 5.63
Match between OPC_pos_patterns_18 and Astro_pos_patterns_5 with similarity score 4.41
Reading file modisco_results_ft_2000/Oligo_modisco_results.h5
Match between Oligo_neg_patterns_3 and Lamp5_neg_patterns_1 with similarity score 5.87
Match between Oligo_pos_patterns_0 and L6IT_neg_patterns_2 with similarity score 6.08
Match between Oligo_pos_patterns_1 and OPC_pos_patterns_2 with similarity score 11.39
Match between Oligo_pos_patterns_2 and Oligo_pos_patterns_0 with similarity score 6.17
Match between Oligo_pos_patterns_3 and OPC_pos_patterns_17 with similarity score 6.13
Match between Oligo_pos_patterns_4 and OPC_pos_patterns_4 with similarity score 4.42
Match between Oligo_pos_patterns_5 and Micro_PVM_pos_patterns_13 with similarity score 6.62
Match between Oligo_pos_patterns_6 and OPC_pos_patterns_16 with similarity score 6.01
Match between Oligo_pos_patterns_7 and Oligo_pos_patterns_0 with similarity score 9.00
Match between Oligo_pos_patterns_8 and Astro_pos_patterns_5 with similarity score 5.53
Match between Oligo_pos_patterns_10 and L2_3IT_pos_patterns_19 with similarity score 4.45
Match between Oligo_pos_patterns_11 and Oligo_pos_patterns_9 with similarity score 4.91
Match between Oligo_pos_patterns_12 and Oligo_pos_patterns_10 with similarity score 5.77
Match between Oligo_pos_patterns_13 and Oligo_pos_patterns_9 with similarity score 6.60
Match between Oligo_pos_patterns_15 and Oligo_pos_patterns_9 with similarity score 4.81
Match between Oligo_pos_patterns_17 and Oligo_pos_patterns_1 with similarity score 11.72
Reading file modisco_results_ft_2000/Pvalb_modisco_results.h5
Match between Pvalb_neg_patterns_0 and Micro_PVM_neg_patterns_6 with similarity score 4.51
Match between Pvalb_neg_patterns_1 and L5IT_pos_patterns_23 with similarity score 7.85
Match between Pvalb_neg_patterns_9 and L6IT_neg_patterns_13 with similarity score 5.84
Match between Pvalb_pos_patterns_0 and Lamp5_pos_patterns_2 with similarity score 8.80
Match between Pvalb_pos_patterns_2 and Micro_PVM_pos_patterns_2 with similarity score 5.20
Match between Pvalb_pos_patterns_3 and Pvalb_neg_patterns_5 with similarity score 4.64
Match between Pvalb_pos_patterns_4 and Astro_pos_patterns_3 with similarity score 5.14
Match between Pvalb_pos_patterns_5 and L5ET_pos_patterns_9 with similarity score 7.00
Match between Pvalb_pos_patterns_6 and L5_6NP_pos_patterns_11 with similarity score 6.18
Match between Pvalb_pos_patterns_7 and L6IT_pos_patterns_4 with similarity score 6.96
Match between Pvalb_pos_patterns_10 and L2_3IT_pos_patterns_9 with similarity score 7.81
Match between Pvalb_pos_patterns_11 and Micro_PVM_pos_patterns_4 with similarity score 4.86
Match between Pvalb_pos_patterns_12 and L6b_pos_patterns_11 with similarity score 4.76
Match between Pvalb_pos_patterns_14 and Micro_PVM_pos_patterns_4 with similarity score 6.40
Match between Pvalb_pos_patterns_16 and Oligo_pos_patterns_5 with similarity score 7.43
Match between Pvalb_pos_patterns_17 and Pvalb_neg_patterns_6 with similarity score 5.75
Reading file modisco_results_ft_2000/Sncg_modisco_results.h5
Match between Sncg_neg_patterns_2 and L6CT_pos_patterns_2 with similarity score 4.53
Match between Sncg_neg_patterns_4 and Lamp5_neg_patterns_1 with similarity score 7.35
Match between Sncg_neg_patterns_7 and L6b_pos_patterns_12 with similarity score 7.30
Match between Sncg_pos_patterns_0 and Lamp5_pos_patterns_1 with similarity score 4.90
Match between Sncg_pos_patterns_1 and Astro_pos_patterns_7 with similarity score 7.97
Match between Sncg_pos_patterns_2 and Oligo_pos_patterns_0 with similarity score 6.04
Match between Sncg_pos_patterns_3 and Astro_pos_patterns_8 with similarity score 6.32
Match between Sncg_pos_patterns_4 and L6IT_pos_patterns_1 with similarity score 5.33
Match between Sncg_pos_patterns_5 and Pvalb_pos_patterns_0 with similarity score 7.68
Match between Sncg_pos_patterns_6 and L6IT_pos_patterns_1 with similarity score 5.17
Match between Sncg_pos_patterns_7 and L5ET_pos_patterns_1 with similarity score 4.69
Match between Sncg_pos_patterns_9 and L5ET_pos_patterns_3 with similarity score 5.56
Match between Sncg_pos_patterns_11 and L5ET_pos_patterns_12 with similarity score 5.77
Match between Sncg_pos_patterns_12 and Micro_PVM_pos_patterns_11 with similarity score 5.29
Match between Sncg_pos_patterns_13 and Endo_neg_patterns_11 with similarity score 6.34
Match between Sncg_pos_patterns_14 and Sncg_pos_patterns_2 with similarity score 5.32
Match between Sncg_pos_patterns_16 and Pvalb_pos_patterns_0 with similarity score 6.10
Reading file modisco_results_ft_2000/Sst_modisco_results.h5
Match between Sst_neg_patterns_0 and Micro_PVM_neg_patterns_5 with similarity score 4.61
Match between Sst_neg_patterns_1 and L2_3IT_neg_patterns_1 with similarity score 4.62
Match between Sst_neg_patterns_3 and Sncg_pos_patterns_2 with similarity score 6.95
Match between Sst_neg_patterns_6 and L6CT_pos_patterns_2 with similarity score 4.66
Match between Sst_pos_patterns_1 and Pvalb_pos_patterns_3 with similarity score 5.32
Match between Sst_pos_patterns_2 and Sncg_pos_patterns_10 with similarity score 4.93
Match between Sst_pos_patterns_3 and L5ET_pos_patterns_3 with similarity score 5.54
Match between Sst_pos_patterns_4 and Astro_pos_patterns_5 with similarity score 4.97
Match between Sst_pos_patterns_5 and Pvalb_neg_patterns_1 with similarity score 5.26
Match between Sst_pos_patterns_7 and Astro_pos_patterns_7 with similarity score 5.35
Match between Sst_pos_patterns_9 and L6IT_pos_patterns_4 with similarity score 5.50
Match between Sst_pos_patterns_10 and L6b_pos_patterns_11 with similarity score 5.05
Match between Sst_pos_patterns_12 and Pvalb_pos_patterns_5 with similarity score 10.30
Match between Sst_pos_patterns_13 and Sst_pos_patterns_2 with similarity score 5.70
Match between Sst_pos_patterns_14 and Sst_pos_patterns_4 with similarity score 4.34
Match between Sst_pos_patterns_15 and Astro_pos_patterns_4 with similarity score 4.93
Reading file modisco_results_ft_2000/SstChodl_modisco_results.h5
Match between SstChodl_neg_patterns_0 and L6b_neg_patterns_1 with similarity score 4.51
Match between SstChodl_neg_patterns_1 and Sst_neg_patterns_2 with similarity score 5.63
Match between SstChodl_neg_patterns_3 and L2_3IT_neg_patterns_9 with similarity score 4.98
Match between SstChodl_pos_patterns_0 and Sncg_neg_patterns_6 with similarity score 4.93
Match between SstChodl_pos_patterns_3 and Astro_pos_patterns_4 with similarity score 5.84
Match between SstChodl_pos_patterns_4 and L6IT_pos_patterns_4 with similarity score 8.13
Match between SstChodl_pos_patterns_5 and Sst_pos_patterns_3 with similarity score 8.01
Match between SstChodl_pos_patterns_6 and Micro_PVM_pos_patterns_9 with similarity score 4.52
Match between SstChodl_pos_patterns_7 and L2_3IT_pos_patterns_8 with similarity score 4.30
Match between SstChodl_pos_patterns_8 and Endo_pos_patterns_2 with similarity score 8.17
Match between SstChodl_pos_patterns_10 and L6IT_pos_patterns_10 with similarity score 12.76
Match between SstChodl_pos_patterns_13 and SstChodl_pos_patterns_12 with similarity score 4.45
Match between SstChodl_pos_patterns_14 and SstChodl_pos_patterns_5 with similarity score 4.60
Match between SstChodl_pos_patterns_15 and SstChodl_pos_patterns_11 with similarity score 7.72
Reading file modisco_results_ft_2000/VLMC_modisco_results.h5
Match between VLMC_neg_patterns_0 and L6CT_neg_patterns_1 with similarity score 5.57
Match between VLMC_neg_patterns_1 and Endo_neg_patterns_15 with similarity score 5.15
Match between VLMC_neg_patterns_4 and Oligo_neg_patterns_1 with similarity score 5.59
Match between VLMC_neg_patterns_6 and Lamp5_neg_patterns_1 with similarity score 5.76
Match between VLMC_neg_patterns_8 and L6IT_neg_patterns_11 with similarity score 5.24
Match between VLMC_pos_patterns_0 and VLMC_neg_patterns_7 with similarity score 5.73
Match between VLMC_pos_patterns_1 and Endo_pos_patterns_3 with similarity score 5.57
Match between VLMC_pos_patterns_4 and Endo_pos_patterns_12 with similarity score 4.50
Match between VLMC_pos_patterns_5 and Endo_pos_patterns_11 with similarity score 7.41
Match between VLMC_pos_patterns_6 and VLMC_neg_patterns_3 with similarity score 4.93
Match between VLMC_pos_patterns_8 and L6IT_neg_patterns_5 with similarity score 6.25
Match between VLMC_pos_patterns_9 and Endo_pos_patterns_13 with similarity score 9.40
Match between VLMC_pos_patterns_10 and Endo_pos_patterns_16 with similarity score 8.96
Match between VLMC_pos_patterns_11 and Endo_pos_patterns_15 with similarity score 9.76
Match between VLMC_pos_patterns_13 and Oligo_pos_patterns_6 with similarity score 7.30
Match between VLMC_pos_patterns_14 and L2_3IT_neg_patterns_9 with similarity score 5.51
Match between VLMC_pos_patterns_15 and Micro_PVM_pos_patterns_4 with similarity score 7.23
Reading file modisco_results_ft_2000/Vip_modisco_results.h5
Match between Vip_neg_patterns_1 and VLMC_pos_patterns_16 with similarity score 6.69
Match between Vip_neg_patterns_3 and VLMC_pos_patterns_8 with similarity score 5.65
Match between Vip_neg_patterns_9 and VLMC_pos_patterns_8 with similarity score 5.24
Match between Vip_pos_patterns_0 and Vip_neg_patterns_1 with similarity score 5.70
Match between Vip_pos_patterns_1 and Astro_neg_patterns_1 with similarity score 5.17
Match between Vip_pos_patterns_3 and Sncg_pos_patterns_2 with similarity score 6.55
Match between Vip_pos_patterns_4 and Pvalb_pos_patterns_0 with similarity score 8.31
Match between Vip_pos_patterns_5 and SstChodl_pos_patterns_5 with similarity score 4.47
Match between Vip_pos_patterns_6 and SstChodl_pos_patterns_4 with similarity score 6.98
Match between Vip_pos_patterns_7 and Sncg_pos_patterns_13 with similarity score 8.65
Match between Vip_pos_patterns_8 and SstChodl_pos_patterns_13 with similarity score 4.47
Match between Vip_pos_patterns_9 and Astro_pos_patterns_15 with similarity score 5.10
Match between Vip_pos_patterns_10 and Pvalb_pos_patterns_10 with similarity score 7.22
Match between Vip_pos_patterns_11 and Pvalb_pos_patterns_3 with similarity score 5.35
Match between Vip_pos_patterns_12 and L6IT_pos_patterns_10 with similarity score 13.52
Match between Vip_pos_patterns_13 and Vip_pos_patterns_0 with similarity score 5.70
Match between Vip_pos_patterns_15 and Astro_pos_patterns_20 with similarity score 4.88
Merged patterns L6CT_neg_patterns_1 and L6b_neg_patterns_2 with similarity 5.094877107126975
Merged patterns L6CT_neg_patterns_1 and Micro_PVM_neg_patterns_0 with similarity 5.270968075080115
Merged patterns L6CT_neg_patterns_1 and VLMC_neg_patterns_4 with similarity 5.379041467605164
Merged patterns Astro_neg_patterns_1 and VLMC_pos_patterns_8 with similarity 4.675720452100017
Merged patterns Astro_neg_patterns_1 and Oligo_pos_patterns_4 with similarity 4.425873526046771
Merged patterns L6CT_pos_patterns_12 and L6IT_neg_patterns_11 with similarity 5.183835537619857
Merged patterns L6CT_pos_patterns_12 and Vip_neg_patterns_10 with similarity 4.5986451679282085
Merged patterns L6IT_pos_patterns_1 and L5ET_neg_patterns_1 with similarity 4.6555459405261725
Merged patterns L6IT_pos_patterns_1 and L6b_pos_patterns_0 with similarity 5.696936443681456
Merged patterns L6IT_pos_patterns_1 and Oligo_pos_patterns_3 with similarity 5.270968075152102
Merged patterns L6IT_pos_patterns_1 and Vip_pos_patterns_0 with similarity 5.395906666281191
Merged patterns Lamp5_pos_patterns_4 and Vip_pos_patterns_3 with similarity 5.507022743250851
Merged patterns Lamp5_pos_patterns_4 and L6CT_neg_patterns_5 with similarity 4.87367858925861
Merged patterns Lamp5_pos_patterns_4 and Oligo_pos_patterns_1 with similarity 5.507022742104153
Merged patterns Sst_pos_patterns_4 and SstChodl_pos_patterns_2 with similarity 4.742695679843026
Merged patterns Astro_pos_patterns_6 and L6IT_pos_patterns_10 with similarity 8.07884296710021
Merged patterns Astro_pos_patterns_6 and Sncg_pos_patterns_15 with similarity 7.796154597154843
Merged patterns L2_3IT_pos_patterns_10 and L5IT_pos_patterns_20 with similarity 4.4519429215759745
Merged patterns L2_3IT_pos_patterns_10 and Lamp5_pos_patterns_12 with similarity 4.498620493884887
Merged patterns Vip_pos_patterns_15 and Endo_neg_patterns_1 with similarity 4.367088643316549
Merged patterns Vip_pos_patterns_15 and Pvalb_neg_patterns_3 with similarity 4.2839168459840975
Merged patterns Vip_pos_patterns_15 and Vip_pos_patterns_2 with similarity 5.934785707065475
Merged patterns Endo_neg_patterns_2 and Oligo_pos_patterns_9 with similarity 5.12339649524316
Merged patterns Endo_neg_patterns_7 and Micro_PVM_neg_patterns_7 with similarity 4.336503597467887
Merged patterns Endo_neg_patterns_7 and Sncg_neg_patterns_1 with similarity 4.445262626678226
Merged patterns Endo_pos_patterns_1 and L6CT_neg_patterns_4 with similarity 4.509416040951513
Merged patterns Endo_pos_patterns_1 and VLMC_pos_patterns_0 with similarity 9.792283508070868
Merged patterns Vip_pos_patterns_7 and Lamp5_pos_patterns_11 with similarity 4.2962753610968845
Merged patterns Endo_neg_patterns_13 and Sst_neg_patterns_4 with similarity 4.814490946054638
Merged patterns L5_6NP_pos_patterns_4 and Micro_PVM_pos_patterns_0 with similarity 4.534451571403539
Merged patterns L2_3IT_pos_patterns_1 and Pvalb_pos_patterns_0 with similarity 6.716664395364766
Merged patterns Endo_pos_patterns_14 and Pvalb_neg_patterns_9 with similarity 5.705489605207827
Merged patterns VLMC_pos_patterns_11 and Pvalb_pos_patterns_3 with similarity 5.54505235154964
Merged patterns VLMC_pos_patterns_10 and VLMC_pos_patterns_7 with similarity 4.756518858516009
Merged patterns Pvalb_pos_patterns_5 and Pvalb_pos_patterns_10 with similarity 5.081115968905463
Merged patterns Pvalb_pos_patterns_5 and Pvalb_pos_patterns_6 with similarity 5.463786508387722
Merged patterns L2_3IT_neg_patterns_9 and L6IT_neg_patterns_8 with similarity 4.326230214596394
Merged patterns L5ET_pos_patterns_6 and L6CT_pos_patterns_2 with similarity 7.6884771914505
Merged patterns L5ET_pos_patterns_8 and Sst_pos_patterns_10 with similarity 4.805359692443724
Merged patterns L5ET_pos_patterns_15 and Oligo_pos_patterns_5 with similarity 5.507418564477333
Merged patterns L5_6NP_neg_patterns_3 and Pvalb_pos_patterns_17 with similarity 4.758549851453562
Merged patterns L6IT_neg_patterns_7 and L6b_neg_patterns_7 with similarity 4.482745279619613
Merged patterns Micro_PVM_pos_patterns_1 and Micro_PVM_pos_patterns_20 with similarity 4.325897213957992
Merged patterns SstChodl_pos_patterns_6 and Sst_pos_patterns_2 with similarity 4.466491013317669
Merged patterns OPC_pos_patterns_5 and OPC_pos_patterns_14 with similarity 4.672222379854147
Merged patterns OPC_pos_patterns_5 and Oligo_pos_patterns_6 with similarity 5.087321363595291
Merged patterns Oligo_pos_patterns_16 and Pvalb_pos_patterns_15 with similarity 5.705619390860145
Merged patterns Pvalb_pos_patterns_1 and SstChodl_pos_patterns_11 with similarity 4.735961687318295
Merged patterns Sst_pos_patterns_0 and SstChodl_pos_patterns_1 with similarity 4.397944351730109
Iteration 1: Merging complete, checking again
Merged patterns Oligo_pos_patterns_4 and Pvalb_pos_patterns_3 with similarity 5.274709087579346
Merged patterns Vip_pos_patterns_2 and Vip_pos_patterns_14 with similarity 4.391904843638408
Merged patterns Pvalb_pos_patterns_10 and Sncg_pos_patterns_12 with similarity 4.595492670134235
Iteration 2: Merging complete, checking again
Discarded 10 patterns below IC threshold 0.2 and with a single class instance:
['Endo_neg_patterns_4', 'L5_6NP_neg_patterns_7', 'L6CT_neg_patterns_11', 'L6IT_neg_patterns_9', 'L6IT_neg_patterns_15', 'Micro_PVM_neg_patterns_4', 'Vip_neg_patterns_4', 'Vip_neg_patterns_5', 'Vip_neg_patterns_8', 'Vip_neg_patterns_11']
Total iterations: 2

(19, 183)
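
The matching log above is long. If you captured it as text, a small standard-library helper can summarize it, for example by counting how many patterns of each class were matched to another pattern. This is only a minimal sketch and not part of CREsted: the helper name `count_matches_per_class` is hypothetical, and it assumes that every pattern ID has the form `<class>_<pos|neg>_patterns_<n>`, as in the lines printed above.

```python
import re
from collections import Counter

# Hypothetical helper (not a CREsted function): parses lines of the form
# "Match between <pattern A> and <pattern B> with similarity score <x>".
MATCH_RE = re.compile(
    r"^Match between (?P<a>\S+) and (?P<b>\S+) with similarity score (?P<score>[\d.]+)$"
)
# Assumed ID layout: <class>_<pos|neg>_patterns_<n>
ID_RE = re.compile(r"^(?P<cls>.+)_(?:pos|neg)_patterns_\d+$")


def count_matches_per_class(log_text: str) -> Counter:
    """Count, per class, how many of its patterns were matched to another pattern."""
    counts = Counter()
    for line in log_text.splitlines():
        m = MATCH_RE.match(line.strip())
        if m is None:
            continue  # skip "Reading file ..." and any other lines
        id_match = ID_RE.match(m.group("a"))
        if id_match is not None:
            counts[id_match.group("cls")] += 1
    return counts


example_log = """Reading file modisco_results_ft_2000/OPC_modisco_results.h5
Match between OPC_neg_patterns_0 and L6b_neg_patterns_2 with similarity score 4.87
Match between OPC_pos_patterns_1 and L6IT_pos_patterns_1 with similarity score 5.27
Reading file modisco_results_ft_2000/Oligo_modisco_results.h5
Match between Oligo_pos_patterns_1 and OPC_pos_patterns_2 with similarity score 11.39
"""
print(count_matches_per_class(example_log))
```

The class is taken from the first pattern on each line, since the log lists matches under the file that is currently being read.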

# Optional: save the matched patterns if you want to re-use them later
with open("modisco_results_ft_2000/all_patterns.pkl", 'wb') as f:
    pickle.dump(all_patterns, f)

Now we can plot a clustermap of cell types/classes and patterns, in which the classes are clustered purely on pattern importance, using crested.tl.modisco.generate_nucleotide_sequences() and crested.pl.modisco.clustermap():

pat_seqs = crested.tl.modisco.generate_nucleotide_sequences(all_patterns)
crested.pl.modisco.clustermap(
    pattern_matrix,
    list(adata.obs_names),
    width=25,
    height=4.2,
    pat_seqs=pat_seqs,
    grid=True,
    dendrogram_ratio=(0.03, 0.15),
    importance_threshold=5,
)

<seaborn.matrix.ClusterGrid at 0x145cbfbf0110>

../_images/881dab74acbe3bb0a0e3542de1677751b7451d2de7f70e262074d00cb55f3b86.png

If you have the horizontal space for it, you can also add the PWM/contribution logos to the x-axis.

crested.pl.modisco.clustermap_with_pwm_logos(
    pattern_matrix,
    list(adata.obs_names),
    pattern_dict=all_patterns,
    width=50,
    height=4.2,
    grid=True,
    dendrogram_ratio=(0.03, 0.15),
    importance_threshold=5,
    logo_x_multiplier=1,
    logo_height_fraction=0.35,
    logo_y_padding=0.25,
)

<seaborn.matrix.ClusterGrid at 0x145cb7f73890>

../_images/0be1d3ae5f4b43344522d327d7039dcc93a6c675889607722615505122431a19.png

We can also subset to classes we are interested in and want to compare in more detail.

crested.pl.modisco.clustermap_with_pwm_logos(
    pattern_matrix,
    classes=list(adata.obs_names),
    pattern_dict=all_patterns,
    subset=["Astro", "OPC", "Oligo"],
    width=10,
    height=3,
    grid=True,
    logo_height_fraction=0.35,
    logo_y_padding=0.3,
)

2026-02-19T13:59:06.619497+0100 WARNING Argument `figsize` is deprecated since version 2.0.0; please use width and height instead.

<seaborn.matrix.ClusterGrid at 0x145caf7822d0>

../_images/b5d52d1416f884599ef7bf4c0aaa4c63162aef0e9a9bda0a2c9fb6f7d51afcf5.png

crested.pl.modisco.clustermap_with_pwm_logos(
    pattern_matrix,
    classes=list(adata.obs_names),
    subset=["L2_3IT", "L5ET", "L5IT", "L5_6NP", "L6CT", "L6IT", "L6b"],
    pattern_dict=all_patterns,
    width=15,
    height=3,
    grid=True,
    logo_height_fraction=0.35,
    logo_y_padding=0.3,
    importance_threshold=4,
)

2026-02-19T13:59:25.718974+0100 WARNING Argument `figsize` is deprecated since version 2.0.0; please use width and height instead.

<seaborn.matrix.ClusterGrid at 0x145cad750350>

../_images/1dfe692beb473ec6a0461ebbcb97dd1060126dd07b4d5fc61b6eae3d063625bf.png

Additional pattern insights#

It is often worthwhile to take a closer look at specific patterns that show up in the clustermap above. Here are some example functions for doing that.

Plotting patterns based on their indices can be done with crested.pl.modisco.selected_instances():

pattern_indices = [53]
crested.pl.modisco.selected_instances(
    all_patterns, pattern_indices
)  # The pattern that is shown is the most representative pattern of the cluster with the highest average information content (IC)

../_images/b4918566b93cb0f658293c764a60898a9f1aa017efb470ac983b0a991268d76c.png

We can also check the similarity between two patterns:

idx1 = 1
idx2 = 5
sim = crested.tl.modisco.pattern_similarity(all_patterns, idx1, idx2)
print("Pattern similarity is " + str(sim))
crested.pl.modisco.selected_instances(all_patterns, [idx1, idx2])

Pattern similarity is 0.7527711866474396

../_images/5ba1eab5dce1472f8e1ed644701f739f803ab092e9cc84b97d997bafa945b643.png

We can plot all the instances of patterns in the same cluster with crested.pl.modisco.class_instances():

crested.pl.modisco.class_instances(all_patterns, 0)

../_images/fe4414f934446d9e3a34721992043fc7e63b4bbad9375254985fc169bbe0ccbb.png

If you want to find out which pattern cluster a given pattern from your modisco results belongs to, you can use the crested.tl.modisco.find_pattern() function.

idx = crested.tl.modisco.find_pattern("OPC_neg_patterns_0", all_patterns)
if idx is not None:
    print("Pattern index is " + str(idx))
    crested.pl.modisco.class_instances(all_patterns, idx, class_representative=True)

Pattern index is 0

../_images/7fa2523f67b00655f099380e0775d6e99e9fb426b54a32119de7bc48f292a13a.png

Finally, we can also plot the similarity between all patterns with crested.tl.modisco.calculate_similarity_matrix() and crested.pl.modisco.similarity_heatmap():

sim_matrix, indices = crested.tl.modisco.calculate_similarity_matrix(all_patterns)
crested.pl.modisco.similarity_heatmap(sim_matrix, indices, fig_size=(42, 17))

2026-02-19T14:39:15.500681+0100 WARNING `fig_size` is deprecated since version 2.0.0; please use arguments `width` and `height` instead.

../_images/bb1cb895c63a910bf0e215a629459930829b98cb336e61590778061e180d294f.png

Matching patterns to TF candidates from scRNA-seq data [Optional]#

To identify which transcription factors (TFs) may bind the characteristic patterns (potential binding sites) of each cell type, we can propose candidates using scRNA-seq data and a TF-motif collection file.

This analysis requires that you ran tfmodisco-lite with the report function, so that each pattern has potential MEME database hits, and that you have multiome data. The names in the motif database should match those in the TF-motif collection file.

meme_db, motif_to_tf_file = crested.get_motif_db()

If you haven’t generated these reports yet and crested.tl.modisco.tfmodisco did not work because you lack access to TOMTOM, note that versions of modiscolite above v2.4.0 also support using memelite through the argument ttl=True, so you can generate the reports that way.

# If you don't have the patterns loaded, load here
with open('modisco_results_ft_2000/all_patterns.pkl', 'rb') as f:
    all_patterns = pickle.load(f)

Load scRNA-seq data#

Load the scRNA-seq data and calculate the mean expression per cell type using crested.tl.modisco.calculate_mean_expression_per_cell_type().

file_path = "crested/Mouse_rna.h5ad"  # Path to the h5ad file containing the scRNA-seq data
cell_type_column = "subclass_Bakken_2022"
mean_expression_df = crested.tl.modisco.calculate_mean_expression_per_cell_type(
    file_path, cell_type_column, cpm_normalize=True
)

2026-02-20T13:22:02.547632+0100 INFO Lazily importing module crested.tl. This could take a second...

Please make sure that the classes in the RNA file match those used in CREsted, and rename mean_expression_df’s index if not:

# Rename classes that don't match exactly (due to / being cleaned up to _, etc)
# Here they're in the same (alphabetical) order anyway so we can just zip them
class_mapping = dict(zip(mean_expression_df.index, adata.obs_names, strict=True))
print(class_mapping)

mean_expression_df.index = mean_expression_df.index.map(class_mapping)

{'Astro': 'Astro', 'Endo': 'Endo', 'L2/3 IT': 'L2_3IT', 'L5 ET': 'L5ET', 'L5 IT': 'L5IT', 'L5/6 NP': 'L5_6NP', 'L6 CT': 'L6CT', 'L6 IT': 'L6IT', 'L6b': 'L6b', 'Lamp5': 'Lamp5', 'Micro-PVM': 'Micro_PVM', 'OPC': 'OPC', 'Oligo': 'Oligo', 'Pvalb': 'Pvalb', 'Sncg': 'Sncg', 'Sst': 'Sst', 'Sst Chodl': 'SstChodl', 'VLMC': 'VLMC', 'Vip': 'Vip'}
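
Zipping the two indices only works when both are in the same order. A more explicit alternative is to derive each CREsted class name from the RNA label and then check it against the expected names. The sketch below assumes the cleanup rule visible in the mapping printed above (`/` and `-` become `_`, spaces are dropped); `clean_class_name` is a hypothetical helper and not a CREsted function.

```python
def clean_class_name(label: str) -> str:
    """Hypothetical helper: mimic the name cleanup seen in the printed mapping."""
    return label.replace("/", "_").replace("-", "_").replace(" ", "")


# A few labels taken from the mapping printed above
rna_labels = ["Astro", "L2/3 IT", "L5/6 NP", "Micro-PVM", "Sst Chodl"]
crested_classes = {"Astro", "L2_3IT", "L5_6NP", "Micro_PVM", "SstChodl"}

class_mapping = {label: clean_class_name(label) for label in rna_labels}

# Fail loudly if a cleaned name does not exist among the CREsted classes
unmatched = [name for name in class_mapping.values() if name not in crested_classes]
if unmatched:
    raise ValueError(f"Classes without a match: {unmatched}")
print(class_mapping)
```

In the notebook, `rna_labels` would correspond to `mean_expression_df.index` and `crested_classes` to `set(adata.obs_names)`.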

crested.pl.modisco.tf_expression_per_cell_type(mean_expression_df, ["Nfia", "Spi1", "Mef2c"])

2026-02-20T13:22:12.004209+0100 INFO Lazily importing module crested.pl. This could take a second...

../_images/29a3ba02338ebb32394f1e3bf6a3a2a5b32ce62b575ccfbb42c645bbff44ee38.png

Generating pattern to database motif dictionary#

classes = list(adata.obs_names)
contribution_dir = "modisco_results_ft_2000"
html_paths = crested.tl.modisco.generate_html_paths(all_patterns, classes, contribution_dir)

# p_val threshold to only select significant matches
pattern_match_dict = crested.tl.modisco.find_pattern_matches(
    all_patterns, html_paths, p_val_thr=0.05
)

Loading TF-motif database#

motif_to_tf_df = crested.tl.modisco.read_motif_to_tf_file(motif_to_tf_file)
motif_to_tf_df

	logo	Motif_name	Cluster	Human_Direct_annot	Human_Orthology_annot	Mouse_Direct_annot	Mouse_Orthology_annot	Fly_Direct_annot	Fly_Orthology_annot	Cluster_Human_Direct_annot	Cluster_Human_Orthology_annot	Cluster_Mouse_Direct_annot	Cluster_Mouse_Orthology_annot	Cluster_Fly_Direct_annot	Cluster_Fly_Orthology_annot
0	<img src="https://motifcollections.aertslab.org/v10/logos/bergman__Adf1.png" height="52" alt="bergman__Adf1"></img>	bergman__Adf1	NaN	NaN	NaN	NaN	NaN	Adf1	NaN	NaN	NaN	NaN	NaN	NaN	NaN
1	<img src="https://motifcollections.aertslab.org/v10/logos/bergman__Aef1.png" height="52" alt="bergman__Aef1"></img>	bergman__Aef1	NaN	NaN	NaN	NaN	NaN	Aef1	NaN	NaN	NaN	NaN	NaN	NaN	NaN
2	<img src="https://motifcollections.aertslab.org/v10/logos/bergman__ap.png" height="52" alt="bergman__ap"></img>	bergman__ap	NaN	NaN	NaN	NaN	NaN	ap	NaN	NaN	NaN	NaN	NaN	NaN	NaN
3	<img src="https://motifcollections.aertslab.org/v10/logos/elemento__ACCTTCA.png" height="52" alt="elemento__ACCTTCA"></img>	elemento__ACCTTCA	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
4	<img src="https://motifcollections.aertslab.org/v10/logos/bergman__bcd.png" height="52" alt="bergman__bcd"></img>	bergman__bcd	NaN	NaN	NaN	NaN	NaN	bcd	NaN	NaN	NaN	NaN	NaN	NaN	NaN
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
17990	<img src="https://motifcollections.aertslab.org/v10/logos/elemento__CAAGGAG.png" height="52" alt="elemento__CAAGGAG"></img>	elemento__CAAGGAG	98.3	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
17991	<img src="https://motifcollections.aertslab.org/v10/logos/elemento__TCCTTGC.png" height="52" alt="elemento__TCCTTGC"></img>	elemento__TCCTTGC	98.3	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
17992	<img src="https://motifcollections.aertslab.org/v10/logos/swissregulon__hs__ZNF274.png" height="52" alt="swissregulon__hs__ZNF274"></img>	swissregulon__hs__ZNF274	99.1	ZNF274	NaN	NaN	Zfp369, Zfp110	NaN	NaN	ZNF274	NaN	NaN	Zfp369, Zfp110	NaN	NaN
17993	<img src="https://motifcollections.aertslab.org/v10/logos/swissregulon__sacCer__THI2.png" height="52" alt="swissregulon__sacCer__THI2"></img>	swissregulon__sacCer__THI2	99.2	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
17994	<img src="https://motifcollections.aertslab.org/v10/logos/jaspar__MA0407.1.png" height="52" alt="jaspar__MA0407.1"></img>	jaspar__MA0407.1	99.2	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN

17995 rows × 15 columns
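As the table shows, each annotation column holds either NaN or a comma-separated string of TF names (for example, "Zfp369, Zfp110" in Mouse_Orthology_annot for swissregulon__hs__ZNF274). The sketch below illustrates how such a table can be queried for the candidate TFs of a single motif. It runs on a small toy table built from two of the rows shown above; `toy_motif_to_tf_df` and `candidate_tfs` are hypothetical names used for illustration only and are not part of CREsted.

```python
import numpy as np
import pandas as pd

# Toy stand-in for motif_to_tf_df, restricted to two rows and three columns
# copied from the table above (hypothetical helper data, not the real file).
toy_motif_to_tf_df = pd.DataFrame(
    {
        "Motif_name": ["swissregulon__hs__ZNF274", "jaspar__MA0407.1"],
        "Mouse_Direct_annot": [np.nan, np.nan],
        "Mouse_Orthology_annot": ["Zfp369, Zfp110", np.nan],
    }
)


def candidate_tfs(df, motif_name, cols):
    """Collect the unique TF names annotated to one motif across the given columns."""
    row = df.loc[df["Motif_name"] == motif_name, cols]
    tfs = []
    for value in row.to_numpy().ravel():
        if pd.isna(value):
            continue  # no annotation in this column
        for tf in str(value).split(","):
            tf = tf.strip()
            if tf and tf not in tfs:
                tfs.append(tf)
    return tfs


annot_cols = ["Mouse_Direct_annot", "Mouse_Orthology_annot"]
print(candidate_tfs(toy_motif_to_tf_df, "swissregulon__hs__ZNF274", annot_cols))
```

Motifs without any annotation in the selected columns, such as jaspar__MA0407.1 here, yield an empty list and therefore contribute no TF candidates.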

Matching patterns to TF candidates#

We calculate a pattern-TF by cell type matrix, which contains the importance of each pattern linked to a TF per cell type, using crested.tl.modisco.create_pattern_tf_dict() and crested.tl.modisco.create_tf_ct_matrix().

cols = [
    "Mouse_Direct_annot",
    "Mouse_Orthology_annot",
    "Cluster_Mouse_Direct_annot",
    "Cluster_Mouse_Orthology_annot",
]
pattern_tf_dict, all_tfs = crested.tl.modisco.create_pattern_tf_dict(
    pattern_match_dict, motif_to_tf_df, all_patterns, cols
)
tf_ct_matrix, tf_pattern_annots = crested.tl.modisco.create_tf_ct_matrix(
    pattern_tf_dict,
    all_patterns,
    mean_expression_df,
    classes,
    log_transform=True,
    normalize_pattern_importances=False,
    normalize_gex=True,
    min_tf_gex=0.95,
    importance_threshold=5.5,
    pattern_parameter="seqlet_count_log",
    filter_correlation=True,
    verbose=True,
    zscore_threshold=1.5,
    correlation_threshold=0.35,
)

Total columns before threshold filtering: 2845
Total columns after threshold filtering: 300
Total columns removed: 2545
Total columns before correlation filtering: 300
Total columns after correlation filtering: 169
Total columns removed: 131
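To build some intuition for what a correlation-based filter on pattern-TF candidates can look like, here is a minimal sketch on made-up numbers. It assumes that a candidate is kept when the pattern importance across cell types correlates with the expression of the TF across the same cell types. That is one plausible reading of filter_correlation and correlation_threshold, not a description of CREsted's internal implementation. All data and names (`pattern_importance`, `tf_expression`, `TF_A`, `TF_B`) are hypothetical.

```python
import numpy as np

# Hypothetical importance of one pattern across four cell types
pattern_importance = np.array([6.0, 0.5, 0.2, 5.0])

# Hypothetical normalized expression of two candidate TFs in the same cell types
tf_expression = {
    "TF_A": np.array([0.9, 0.1, 0.0, 0.8]),  # high where the pattern is important
    "TF_B": np.array([0.1, 0.9, 0.8, 0.0]),  # high where the pattern is unimportant
}

correlation_threshold = 0.35  # same value as passed to create_tf_ct_matrix above

# Keep only TFs whose expression profile correlates with the pattern importance
kept = {}
for tf, expr in tf_expression.items():
    r = np.corrcoef(pattern_importance, expr)[0, 1]  # Pearson correlation
    if r >= correlation_threshold:
        kept[tf] = r

print(sorted(kept))
```

In this toy case only TF_A survives, since TF_B is expressed in the cell types where the pattern carries little importance.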

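Finally, we can plot a clustermap of potential pattern-TF matches and their importance per cell type with crested.pl.modisco.clustermap_tf_motif(). In the calls below, the heatmap encodes pattern contribution (heatmap_dim="contrib") and the dots encode TF gene expression (dot_dim="gex"); the second call restricts the plot to a subset of classes via subset_classes.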

crested.pl.modisco.clustermap_tf_motif(
    tf_ct_matrix,
    heatmap_dim="contrib",
    dot_dim="gex",
    class_labels=classes,
    pattern_labels=tf_pattern_annots,
    width=35,
    height=6,
    cluster_rows=True,
    cluster_columns=False,
    xtick_rotation=90,
)

../_images/aea43251e3b49f4447d5b00dd98efcc8407c719301817836c29d8215879cdc0f.png

crested.pl.modisco.clustermap_tf_motif(
    tf_ct_matrix,
    heatmap_dim="contrib",
    dot_dim="gex",
    class_labels=classes,
    subset_classes=["Lamp5", "Sncg", "Vip", "Pvalb", "Sst", "SstChodl"],
    pattern_labels=tf_pattern_annots,
    width=15,
    height=6,
    cluster_rows=False,
    cluster_columns=False,
    xtick_rotation=90,
)

../_images/cc4524dbbef340e958cde531b38426b59405ace61f7bfb9776a0ef36f0d2e223.png