crested.pp.normalize_peaks
- crested.pp.normalize_peaks(adata, peak_threshold=0, gini_std_threshold=1.0, top_k_percent=0.01, inplace=True)
Normalize adata.X based on the variability of the top values per cell type.
This function applies a normalization factor to each cell type, focusing on regions with the most significant peaks above a defined threshold and considering the variability within those peaks. Only used on continuous .X data. Modifies the input AnnData.X in place if inplace=True.
- Parameters:
adata (AnnData) – The AnnData object containing the matrix (cell types, regions) to be normalized.
peak_threshold (int (default: 0)) – The minimum value for a peak to be considered significant for the Gini score calculation.
gini_std_threshold (float (default: 1.0)) – The number of standard deviations below the mean Gini score used to determine the threshold for low variability.
top_k_percent (float (default: 0.01)) – The percentage (expressed as a fraction) of top values to consider for Gini score calculation.
inplace (bool (default: True)) – Perform computation and modify adata in-place or return a resulting copy of the adata instead.
- Return type:
- Returns:
If inplace=True (default), modifies the AnnData in-place with the normalized matrix and normalization weights saved to adata.obsm['weights'], and returns the filtered .var of the significant peaks as a DataFrame. If inplace=False, returns (adata, filtered_df): a modified copy of the AnnData object, along with the filtered .var of the significant peaks as a DataFrame.
Example
>>> crested.pp.normalize_peaks(
...     adata,
...     peak_threshold=0,
...     gini_std_threshold=2.0,
...     top_k_percent=0.05,
... )
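
To make the roles of peak_threshold and top_k_percent more concrete, the following is a minimal pure-Python sketch of a Gini score computed over the top values of a single row. It is not the library's implementation: the helper names gini and top_k_gini are hypothetical, and both the exact Gini formula and the order of operations (take the top fraction first, then drop values at or below the threshold) are assumptions made for illustration.

```python
def gini(values):
    """Gini coefficient of non-negative values (0.0 means all values are equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on ascending-sorted data:
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def top_k_gini(row, peak_threshold=0, top_k_percent=0.01):
    """Hypothetical helper: Gini score over the top fraction of one row.

    Assumption: keep the top `top_k_percent` fraction of values, then keep
    only those strictly above `peak_threshold`.
    """
    k = max(1, int(len(row) * top_k_percent))
    top = sorted(row, reverse=True)[:k]
    significant = [v for v in top if v > peak_threshold]
    return gini(significant)


# Evenly high top peaks give a low score; one dominant peak gives a higher one.
flat_row = [5, 5, 5, 5, 5, 0, 0, 0, 0, 0]
spiky_row = [0, 0, 0, 0, 0, 0, 0, 1, 1, 18]
flat_score = top_k_gini(flat_row, peak_threshold=0, top_k_percent=0.5)
spiky_score = top_k_gini(spiky_row, peak_threshold=0, top_k_percent=0.5)
```

Under these assumptions, a low score means the top peaks are of similar height and a high score means a few peaks dominate. The gini_std_threshold parameter is described above as a cutoff a number of standard deviations below the mean Gini score; how those scores are aggregated is not shown here.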