gatac.tl.chromvar
- gatac.tl.chromvar(adata, genome_fasta, motifs_path, *, method='chromvar', n_iterations=50, pvalue=5e-05, check_rc=True, bg='subject', coordinate_system='0-based', batch_size=5000, motif_batch_size=-1, key_added='chromvar', return_adata=False)
Run the full chromVAR pipeline in a single call.
Executes the following steps in order:
1. compute_peak_bias: GC content from genome FASTA
2. sample_bg_peaks: background peak sampling
3. read_motifs + scan_motifs: motif matching
4. compute_deviations: TF deviation scores
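To make the last step more concrete, here is a minimal pure-Python sketch of a bias-corrected deviation z-score for one cell and one motif, following the general chromVAR approach. It is not gatac's implementation, which this page does not show; the function names, the toy counts, and the background assignments are all made up for illustration.

```python
from statistics import mean, stdev


def raw_deviation(counts, peak_idx, expected_frac):
    """(observed - expected) / expected for one cell and one set of peaks."""
    total = sum(counts)  # total fragments in this cell
    observed = sum(counts[p] for p in peak_idx)
    expected = total * sum(expected_frac[p] for p in peak_idx)
    return (observed - expected) / expected


def deviation_z(counts, motif_peaks, bg_peaks, expected_frac):
    """Z-score of the motif deviation relative to background peak sets.

    bg_peaks[p] lists the background peaks sampled for peak p,
    one entry per iteration (hypothetical layout).
    """
    dev = raw_deviation(counts, motif_peaks, expected_frac)
    n_iter = len(bg_peaks[motif_peaks[0]])
    bg_devs = [
        raw_deviation(counts, [bg_peaks[p][b] for p in motif_peaks], expected_frac)
        for b in range(n_iter)
    ]
    return (dev - mean(bg_devs)) / stdev(bg_devs)


# Toy data: one cell, six peaks, a motif found in peaks 0 and 1.
counts = [4, 6, 1, 2, 3, 4]                      # fragments per peak in this cell
expected_frac = [0.1, 0.2, 0.1, 0.2, 0.2, 0.2]   # each peak's share of all fragments
motif_peaks = [0, 1]
bg_peaks = {0: [2, 2, 0], 1: [3, 4, 5]}          # three background iterations

z = deviation_z(counts, motif_peaks, bg_peaks, expected_frac)
print(round(z, 2))
```

A positive score means the cell has more fragments in motif-containing peaks than in background peaks with similar bias; n_iterations controls how many background sets feed the mean and standard deviation.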
- Parameters:
- adata
AnnData: Peak-level AnnData (cells × peaks).
- genome_fasta
str or Path: Path to genome FASTA file.
- motifs_path
str or Path: Path to motif file in MEME format.
- method
{"knn", "chromvar"}, default "chromvar": Background sampling method passed to sample_bg_peaks.
- n_iterations
int, default 50: Number of background peaks to sample per peak.
- pvalue
float, default 5e-5: P-value threshold for motif matching.
- check_rc
bool, default True: Whether to scan both strands.
- bg
str or tuple, default "subject": Background nucleotide probabilities for motif scoring.
- coordinate_system
{"0-based", "1-based"}, default "0-based": Coordinate system of peak names in adata.var_names.
- batch_size
int, default 5000: Number of cells per GPU batch in compute_deviations.
- motif_batch_size
int, default -1: Number of motifs per chunk in compute_deviations.
- key_added
str, default "chromvar": Key under which the deviation DataFrame is stored in adata.obsm.
- return_adata
bool, default False: If True, also return a new AnnData with deviations as .X.
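The coordinate_system setting matters because the GC-content step slices the genome using the peak names. The sketch below illustrates the idea under two assumptions that this page does not confirm: that peak names look like "chrom:start-end", and that "0-based" means half-open intervals while "1-based" means inclusive ones. The helper names parse_peak and gc_fraction are hypothetical and not part of gatac.

```python
def parse_peak(name, coordinate_system="0-based"):
    """Parse 'chrom:start-end' into (chrom, start, end), 0-based half-open."""
    chrom, span = name.rsplit(":", 1)
    start, end = (int(x) for x in span.split("-"))
    if coordinate_system == "1-based":
        start -= 1  # inclusive 1-based start -> 0-based start; end is unchanged
    elif coordinate_system != "0-based":
        raise ValueError(f"unknown coordinate_system: {coordinate_system!r}")
    return chrom, start, end


def gc_fraction(seq):
    """Fraction of G/C among unambiguous bases (N and others are ignored)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return float("nan")
    return (seq.count("G") + seq.count("C")) / acgt


# Toy genome standing in for the FASTA file.
genome = {"chr1": "AAAAGGCCTTTT"}

for name, system in [("chr1:4-8", "0-based"), ("chr1:5-8", "1-based")]:
    chrom, start, end = parse_peak(name, system)
    seq = genome[chrom][start:end]
    print(name, system, seq, gc_fraction(seq))
```

Both peak names select the same four bases here, which is the point: passing the wrong coordinate_system would shift every peak by one base.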
- Returns:
None or AnnData: Always stores deviations as a DataFrame in adata.obsm[key_added]. Returns an AnnData (cells × motifs) only when return_adata=True.
- Return type:
AnnData | None
Examples
>>> import gatac as ga
>>> ga.tl.chromvar(
...     peak_adata,
...     "../resources/GRCh38.p13.genome.fa",
...     "../resources/cisBP_human.meme",
... )
>>> peak_adata.obsm["chromvar"]  # DataFrame (cells × motifs)
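A common next step is to rank motifs by how much their deviation scores vary across cells. The sketch below shows that ranking with the standard library only; the dictionary is made-up data standing in for the columns of the deviation DataFrame, and the motif names are placeholders.

```python
from statistics import stdev

# Made-up z-scores: one list per motif, one value per cell.
deviations = {
    "MOTIF_A": [2.1, -1.8, 0.3, -0.6],
    "MOTIF_B": [0.1, -0.2, 0.0, 0.1],
    "MOTIF_C": [1.0, -1.0, 0.5, -0.5],
}

# Variability: sample standard deviation of each motif's scores across cells.
variability = {motif: stdev(scores) for motif, scores in deviations.items()}

# Most variable motifs first.
ranked = sorted(variability, key=variability.get, reverse=True)
for motif in ranked:
    print(f"{motif}\t{variability[motif]:.3f}")
```

On the real DataFrame the same ranking should be obtainable with pandas as `peak_adata.obsm["chromvar"].std().sort_values(ascending=False)`, since `DataFrame.std` also uses the sample standard deviation by default.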