gatac.tl.merge_peaks

gatac.tl.merge_peaks(peaks, chrom_sizes=None, half_width=250, use_rep='gmacs', key_added='peaks', inplace=True)

Merge peaks from different groups into fixed-width, non-overlapping peaks.

This mirrors the behavior of SnapATAC2’s merge_peaks.

The algorithm expands each peak summit by half_width on both sides, sorts the resulting intervals by genomic position, and groups overlapping or adjacent intervals. Within each group it then iteratively keeps the most significant remaining peak and discards the peaks that overlap it.

This matches SnapATAC2’s Rust implementation, which uses merge_sorted_bed_with to group overlapping intervals before applying iterative_merge.
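The steps described above can be sketched in plain Python. This is only an illustration, not the gatac or SnapATAC2 source: the names Interval, expand, group_overlapping, iterative_merge and merge_peaks_sketch are hypothetical, the half-open coordinate convention is an assumption, and "most significant" is assumed to mean the highest score.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int    # half-open interval [start, end); convention assumed here
    end: int
    score: float  # assumed significance measure; higher is more significant


def expand(chrom, summit, score, half_width=250):
    # Extend the summit by half_width on both sides, clamping at position 0.
    return Interval(chrom, max(0, summit - half_width), summit + half_width, score)


def group_overlapping(intervals):
    # Sort by position, then collect overlapping or adjacent intervals into groups.
    groups = []
    group_end = None
    for iv in sorted(intervals, key=lambda iv: (iv.chrom, iv.start)):
        if groups and groups[-1][0].chrom == iv.chrom and iv.start <= group_end:
            groups[-1].append(iv)
            group_end = max(group_end, iv.end)
        else:
            groups.append([iv])
            group_end = iv.end
    return groups


def iterative_merge(group):
    # Repeatedly keep the best remaining peak and drop everything overlapping it.
    kept = []
    remaining = sorted(group, key=lambda iv: iv.score, reverse=True)
    while remaining:
        best = remaining[0]
        kept.append(best)
        remaining = [
            iv for iv in remaining[1:]
            if iv.end <= best.start or iv.start >= best.end
        ]
    return kept


def merge_peaks_sketch(summits, half_width=250):
    # summits: iterable of (chrom, summit_position, score) tuples
    intervals = [expand(c, s, sc, half_width) for c, s, sc in summits]
    merged = []
    for group in group_overlapping(intervals):
        merged.extend(iterative_merge(group))
    return sorted(merged, key=lambda iv: (iv.chrom, iv.start))


# Three chained peaks on chr1: the middle one overlaps the strongest peak and
# is dropped, while the third does not overlap it and is therefore kept.
example = merge_peaks_sketch(
    [("chr1", 250, 9.0), ("chr1", 600, 5.0), ("chr1", 950, 4.0)]
)
```

Note that a group can yield more than one merged peak: once the strongest peak and its overlapping neighbours are removed, the remaining peaks in the group are considered again.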

Parameters:
peaks Union[dict[str, pd.DataFrame], 'AnnData']

Peak information from different groups. Either a dict mapping group names to pandas DataFrames with peak info, or an AnnData object containing peaks in .uns[use_rep].

chrom_sizes Optional[Union[str, dict[str, int]]]

Chromosome sizes. If a string is provided, it is interpreted as a genome name and passed to get_chrom_sizes. If peaks is an AnnData and chrom_sizes is None, the function tries to infer the sizes from adata.uns['reference_sequences'].

half_width int

Half-width of the merged peaks, in base pairs. Each peak summit is extended by this amount on both sides.

use_rep str

When peaks is an AnnData, key in .uns containing peak information.

key_added str

When peaks is an AnnData and inplace=True, key in .uns to store merged peaks.

inplace bool

When peaks is an AnnData, whether to store results in .uns[key_added].

Returns:

pd.DataFrame or None. A DataFrame of merged, fixed-width, non-overlapping peaks. If peaks is an AnnData and inplace=True, returns None and stores the result in .uns[key_added].

Return type:

Optional[pd.DataFrame]

Examples

>>> import gatac as ga
>>> # Operate in-place on an AnnData: stores results in adata.uns["peaks"]
>>> ga.tl.merge_peaks(adata, use_rep="gmacs", key_added="peaks")
>>> # Or pass a dict of per-group peak DataFrames:
>>> peaks_df = ga.tl.merge_peaks(
...     adata.uns["gmacs"],
...     chrom_sizes="hg38",
...     half_width=250,
... )