crested.tl.modisco.get_pwms_from_modisco_file

crested.tl.modisco.get_pwms_from_modisco_file(modisco_file, min_ic=0.1, output_meme_file=None, metacluster_name=None, pattern_indices=None)

Extract PPMs (Position Probability Matrices) from a Modisco HDF5 results file.

Optionally, save the extracted PPMs in MEME format.

Parameters:
  • modisco_file (str) – Path to the Modisco HDF5 results file.

  • min_ic (float) – Information content (IC) threshold used to trim each pattern. Higher values trim more of the pattern.

  • output_meme_file (str | None) – Path to save the extracted PPMs in MEME format. If None, PPMs are not saved.

  • metacluster_name (str | None) – The name of the metacluster to process (e.g., ‘pos_patterns’ or ‘neg_patterns’). If None, all metaclusters are processed.

  • pattern_indices (list[int] | None) – List of pattern indices to include from the selected metacluster. If None, all patterns are processed.
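
To give a feel for what IC-based trimming means, here is a generic sketch. It is not crested's implementation: it assumes a uniform DNA background and an absolute per-position threshold, and the library may compute or apply `min_ic` differently. The name `trim_ppm_by_ic` is hypothetical.

```python
import numpy as np


def trim_ppm_by_ic(ppm: np.ndarray, min_ic: float = 0.1) -> np.ndarray:
    """Trim low-information flanks from a PPM of shape (length, 4).

    Illustrative only: assumes a uniform background (0.25 per base)
    and an absolute threshold in bits.
    """
    eps = 1e-9  # avoids log2(0) for zero-probability bases
    # Per-position information content in bits relative to the background.
    ic = np.sum(ppm * np.log2((ppm + eps) / 0.25), axis=1)
    keep = np.where(ic >= min_ic)[0]
    if keep.size == 0:
        # No position is informative enough: return an empty (0, 4) matrix.
        return ppm[:0]
    # Keep everything between the first and last informative position.
    return ppm[keep[0] : keep[-1] + 1]
```

In this sketch, raising `min_ic` shrinks the retained window, which matches the documented behaviour that higher values trim more.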

Returns:
  dict[str, np.ndarray] – A dictionary whose keys are pattern IDs (e.g., “pos_patterns_pattern_0”) and whose values are NumPy arrays of PPMs.
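
A minimal usage sketch follows. The wrapper `load_pos_pwms` and the helper `split_by_metacluster` are hypothetical and not part of crested, and any file paths passed in are placeholders. The crested import is deferred into the wrapper so the helper can be used on any dictionary with the documented key format.

```python
from __future__ import annotations

import numpy as np


def load_pos_pwms(modisco_file: str, meme_path: str | None = None) -> dict[str, np.ndarray]:
    """Hypothetical wrapper: extract only the positive patterns."""
    import crested  # deferred so this module imports without crested installed

    return crested.tl.modisco.get_pwms_from_modisco_file(
        modisco_file,
        min_ic=0.1,
        output_meme_file=meme_path,  # None means no MEME file is written
        metacluster_name="pos_patterns",
    )


def split_by_metacluster(pwms: dict[str, np.ndarray]) -> dict[str, dict[int, np.ndarray]]:
    """Group PPMs by metacluster, relying on keys like 'pos_patterns_pattern_0'."""
    groups: dict[str, dict[int, np.ndarray]] = {}
    for key, ppm in pwms.items():
        # Split at the last '_pattern_' into metacluster name and pattern index.
        metacluster, _, index = key.rpartition("_pattern_")
        groups.setdefault(metacluster, {})[int(index)] = ppm
    return groups
```

Calling `split_by_metacluster` on the returned dictionary gives one sub-dictionary per metacluster, keyed by integer pattern index, which is convenient when `metacluster_name` was left as None.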