gatac.tl.compute_peak_bias

gatac.tl.compute_peak_bias(adata, genome_fasta, *, add_gc_content=True, add_cpg_density=False)

Compute peak biases (GC content and/or CpG density) for background sampling.

This function adds bias columns to adata.var that are used by sample_bg_peaks to match foreground and background peaks.

Parameters:
adata AnnData

AnnData object with peak matrix. Peak names in adata.var_names should be in “chr:start-end” format.
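Because the function relies on peak names following the “chr:start-end” pattern, it can help to check adata.var_names before calling it. The sketch below is a standalone illustration, not part of gatac: parse_peak_name and find_malformed_peaks are hypothetical helpers, and the regular expression assumes integer coordinates and a chromosome name that contains no colon.

```python
import re

# Hypothetical helpers (not gatac API): validate "chr:start-end" peak names.
# Assumes integer coordinates and a chromosome name without ":".
_PEAK_RE = re.compile(r"(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)")


def parse_peak_name(name):
    """Split a peak name such as 'chr1:100-200' into (chrom, start, end)."""
    match = _PEAK_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"Peak name not in 'chr:start-end' format: {name!r}")
    return match["chrom"], int(match["start"]), int(match["end"])


def find_malformed_peaks(names):
    """Return the names that do not follow the 'chr:start-end' format."""
    bad = []
    for name in names:
        try:
            parse_peak_name(name)
        except ValueError:
            bad.append(name)
    return bad
```

Passing list(adata.var_names) to find_malformed_peaks would list any names that need renaming before the bias columns are computed.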

genome_fasta str or Path

Path to genome FASTA file (supports .fa, .fasta, .fa.gz, .fasta.gz)

add_gc_content bool, default True

Whether to compute GC content

add_cpg_density bool, default False

Whether to compute CpG density

Returns:

None. Adds columns to adata.var:

- “gc_content”: GC content (if add_gc_content=True)
- “cpg_density”: CpG density (if add_cpg_density=True)

Return type:

None
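To make the two bias columns concrete, here is a minimal sketch of how such values could be derived from a peak's DNA sequence. The exact definitions used by gatac are not stated on this page, so the formulas are assumptions based on common conventions: GC content as the fraction of G or C among unambiguous bases, and CpG density as the number of CG dinucleotides per base. The function names gc_content and cpg_density are illustrative and are not gatac functions.

```python
# Illustrative only: assumed definitions, not necessarily what gatac computes.


def gc_content(seq):
    """Fraction of G or C bases among unambiguous (A/C/G/T) bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        # No unambiguous bases (e.g. an all-N region): value is undefined.
        return float("nan")
    return (seq.count("G") + seq.count("C")) / acgt


def cpg_density(seq):
    """Number of CG dinucleotides divided by the sequence length."""
    seq = seq.upper()
    if not seq:
        return float("nan")
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    return seq.count("CG") / len(seq)
```

Under these assumed definitions both values lie between 0 and 1 for any non-empty sequence that contains at least one unambiguous base.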

Examples

>>> import gatac as ga
>>> ga.tl.compute_peak_bias(peak_adata, "genome.fa")
>>> peak_adata.var["gc_content"]  # GC content per peak