gatac.tl.sample_bg_peaks
- gatac.tl.sample_bg_peaks(adata, *, method='knn', n_iterations=50, bg_columns=['gc_content', 'reads_per_peak'], genome_fasta=None, n_neighbors=50, bs=50, w=0.1)
Sample background peaks for chromVAR analysis.
This function matches foreground peaks with background peaks that have similar biases (e.g., GC content and accessibility). Two methods are supported:
"knn" (default): GPU-accelerated k-NN using cuML. Faster and recommended.
"chromvar": Original chromVAR binning method. Slower, but faithful to the R package.
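To make the idea of k-NN bias matching concrete, here is a minimal CPU sketch using only NumPy. It is not gatac's implementation (which, per the description above, uses cuML on the GPU). The helper name `knn_background_sketch`, the column standardization, and sampling with replacement from the neighbor set are all assumptions made for illustration.

```python
import numpy as np

def knn_background_sketch(bias, n_neighbors=50, n_iterations=50, seed=0):
    """Hypothetical helper: pick background peaks with similar bias values."""
    bias = np.asarray(bias, dtype=float)
    n_peaks = bias.shape[0]
    # Standardize each bias column so no single column dominates the distance
    # (an assumption; the library may scale differently).
    std = bias.std(axis=0)
    std[std == 0] = 1.0
    z = (bias - bias.mean(axis=0)) / std
    # Brute-force pairwise squared Euclidean distances (fine for small toy data).
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a peak is never its own background
    k = min(n_neighbors, n_peaks - 1)
    neighbors = np.argsort(d2, axis=1)[:, :k]
    # Draw n_iterations background indices per peak from its k neighbors,
    # with replacement (also an assumption).
    rng = np.random.default_rng(seed)
    picks = rng.integers(0, k, size=(n_peaks, n_iterations))
    return np.take_along_axis(neighbors, picks, axis=1)

# Toy bias table: one GC-content-like column and one read-depth-like column.
toy_rng = np.random.default_rng(1)
toy_bias = np.column_stack([
    toy_rng.uniform(0.3, 0.7, 200),
    toy_rng.lognormal(3.0, 1.0, 200),
])
bg = knn_background_sketch(toy_bias, n_neighbors=20, n_iterations=10)
```

The result has one row per peak and one column per iteration, mirroring the (n_peaks, n_iterations) layout that the function stores in adata.varm["bg_peaks"].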
- Parameters:
- adata
AnnData. AnnData object containing the peak matrix. Bias columns are expected in adata.var (e.g., from compute_peak_bias); see bg_columns and genome_fasta for computing them on the fly.
- method
{"knn", "chromvar"}, default "knn". Background sampling method. "knn": cuML nearest neighbors (GPU, faster). "chromvar": original chromVAR binning (CPU, slower).
- n_iterations
int, default 50. Number of background peaks to sample per peak.
- bg_columns
list[str], default ["gc_content", "reads_per_peak"]. Columns in adata.var to use for bias matching. Any column listed here that is absent from adata.var will be computed automatically when genome_fasta is provided.
- genome_fasta
str or Path, optional. Path to the genome FASTA file. Required when bg_columns contains "gc_content" and it has not been precomputed.
- n_neighbors
int, default 50. Number of neighbors for the k-NN method (only used if method="knn").
- bs
int, default 50. Bin size for the chromVAR method (only used if method="chromvar").
- w
float, default 0.1. Gaussian kernel width for the chromVAR method (only used if method="chromvar").
- Returns:
None. Adds adata.varm["bg_peaks"], an array of shape (n_peaks, n_iterations) containing background peak indices for each peak.
- Return type:
None
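The (n_peaks, n_iterations) index layout can be consumed as in the following sketch, which uses toy NumPy arrays in place of a real AnnData object. The names `counts`, `peak_set`, and the random stand-in for adata.varm["bg_peaks"] are hypothetical, and the per-cell sums are only meant to show the indexing, not chromVAR's full deviation calculation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_peaks, n_iterations = 5, 30, 8

# Toy cells x peaks count matrix (stand-in for the peak matrix in adata.X).
counts = rng.poisson(2.0, size=(n_cells, n_peaks))
# Random stand-in for adata.varm["bg_peaks"]: row p holds the background
# peak indices sampled for peak p, one per iteration.
bg_peaks = rng.integers(0, n_peaks, size=(n_peaks, n_iterations))
# Hypothetical foreground peak set, e.g. peaks containing some motif.
peak_set = np.array([1, 4, 7, 12])

# Observed accessibility of the foreground set, per cell.
observed = counts[:, peak_set].sum(axis=1)        # shape (n_cells,)

# Column i of bg_idx is one matched background set for the foreground peaks.
bg_idx = bg_peaks[peak_set, :]                    # shape (n_set, n_iterations)
background = counts[:, bg_idx].sum(axis=1)        # shape (n_cells, n_iterations)
```

Each column of `background` is the same per-cell sum computed on one background peak set, giving a null distribution against which `observed` could be compared.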
Examples
>>> import gatac as ga
>>> # Option A: precompute biases separately
>>> ga.tl.compute_peak_bias(peak_adata, "genome.fa")
>>> ga.tl.sample_bg_peaks(peak_adata, method="knn")
>>>
>>> # Option B: let sample_bg_peaks compute gc_content on the fly
>>> ga.tl.sample_bg_peaks(peak_adata, method="knn", genome_fasta="genome.fa")
>>> peak_adata.varm["bg_peaks"]  # Background indices