gatac.tl.compute_deviations
- gatac.tl.compute_deviations(adata, *, batch_size=5000, motif_batch_size=-1, key_added='chromvar', return_adata=False)
Compute chromVAR TF deviation scores.
Computes per-cell, per-motif deviation scores normalized by background expectation. Requires prior setup:
- sample_bg_peaks() to generate adata.varm["bg_peaks"]
- scan_motifs() to generate adata.varm["motif_match"]
The algorithm:
- For each cell, computes observed motif accessibility
- Computes expected accessibility based on overall peak accessibility and cell depth
- For background peaks, computes deviation
- Z-score normalizes: (observed_dev - mean_bg_dev) / std_bg_dev
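The steps above can be sketched in plain NumPy. This is a minimal illustration, not gatac's implementation: the function name `deviation_z_scores`, the dense-array inputs, the `(peaks, n_bg)` layout of the background indices, and the use of the sample standard deviation (`ddof=1`) are all assumptions made for the example, and it ignores batching and GPU execution.

```python
import numpy as np


def deviation_z_scores(counts, motif_match, bg_peaks):
    """Illustrative chromVAR-style z-scores (hypothetical helper, not part of gatac).

    counts      : (cells, peaks) accessibility counts
    motif_match : (peaks, motifs) 0/1 motif match matrix
    bg_peaks    : (peaks, n_bg) integer indices of background peaks per peak
    """
    counts = np.asarray(counts, dtype=float)
    motif_match = np.asarray(motif_match, dtype=float)

    # Expected counts: cell depth times the overall fraction of reads per peak.
    depth = counts.sum(axis=1, keepdims=True)       # (cells, 1)
    peak_frac = counts.sum(axis=0) / counts.sum()   # (peaks,)
    expected_counts = depth * peak_frac             # (cells, peaks)

    # Observed vs. expected accessibility per cell and motif.
    observed = counts @ motif_match
    expected = expected_counts @ motif_match
    observed_dev = (observed - expected) / expected

    # Same deviation, computed with each peak swapped for a background peak.
    n_bg = bg_peaks.shape[1]
    bg_dev = np.empty((n_bg,) + observed.shape)
    for k in range(n_bg):
        idx = bg_peaks[:, k]
        bg_observed = counts[:, idx] @ motif_match
        bg_expected = expected_counts[:, idx] @ motif_match
        bg_dev[k] = (bg_observed - bg_expected) / bg_expected

    # Z-score: (observed_dev - mean_bg_dev) / std_bg_dev
    # Assumption: sample standard deviation (ddof=1) across background draws.
    return (observed_dev - bg_dev.mean(axis=0)) / bg_dev.std(axis=0, ddof=1)


# Toy data: 50 cells, 200 peaks, 6 motifs, 20 background draws per peak.
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(50, 200)) + 1    # +1 keeps every peak non-empty
motif_match = rng.random((200, 6)) < 0.2
motif_match[0, :] = True                          # every motif matches at least one peak
bg_peaks = rng.integers(0, 200, size=(200, 20))

z = deviation_z_scores(counts, motif_match, bg_peaks)
print(z.shape)  # (50, 6)
```

The sketch divides by the expected accessibility and by the background standard deviation without guarding against zeros, so a motif with no matching peaks would produce non-finite values here.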
- Parameters:
- adata
AnnData. AnnData object with peak matrix (cells × peaks). Must have:
  - adata.varm["bg_peaks"]: Background peak indices from sample_bg_peaks()
  - adata.varm["motif_match"]: Motif match matrix from scan_motifs()
  - adata.uns["motif_name"]: Motif names from scan_motifs()
- batch_size
int, default 5000. Number of cells to process at once. Reduce if GPU memory is limited.
- motif_batch_size
int, default -1. Number of motifs to process at once. If -1, a default of 100 motifs is used to balance memory usage and speed. Set a smaller value for very large datasets.
- key_added
str, default "chromvar". Key under which the deviation DataFrame is stored in adata.obsm.
- return_adata
bool, default False. If True, also return a new AnnData with deviations as .X.
- Returns:
None or AnnData. Always stores deviations as a DataFrame in adata.obsm[key_added]. Returns an AnnData (cells × motifs) only when return_adata=True.
- Return type:
AnnData | None
Examples
>>> import gatac as ga
>>>
>>> # 1. Create peak matrix
>>> peak_adata = ga.tl.make_peak_matrix(tile_adata, parquet_path)
>>>
>>> # 2. Compute biases and sample background
>>> ga.tl.compute_peak_bias(peak_adata, "genome.fa")
>>> ga.tl.sample_bg_peaks(peak_adata)
>>>
>>> # 3. Scan motifs
>>> motifs = ga.tl.read_motifs("motifs.meme")
>>> ga.tl.scan_motifs(peak_adata, motifs, "genome.fa")
>>>
>>> # 4. Compute deviations (stored in peak_adata.obsm["chromvar"])
>>> ga.tl.compute_deviations(peak_adata)
>>> peak_adata.obsm["chromvar"]  # DataFrame (cells × motifs)