API Reference
GATAC’s Python API is organized into two namespaces that mirror the preprocessing → analysis workflow.
Preprocessing (gatac.pp): fragment I/O, tile/gene matrix construction, QC metrics, filtering, and feature selection.
Tools (gatac.tl): spectral embedding, LDA, peak calling, marker peaks, motif enrichment, chromVAR, and GSEA.
Quick overview
import gatac as ga
# ── Preprocessing ─────────────────────────────────────────────────────────
# 1. Convert fragment TSV → Parquet
ga.pp.make_parquet("sample.tsv.gz")
# 2. Compute QC metrics
metrics = ga.pp.compute_metrics("sample.parquet", "GRCh38.gtf.gz")
# 3. Build tile matrix (filtered by QC)
adata = ga.pp.make_tile_matrix(
    "sample.parquet",
    chrom_sizes="hg38",
    metrics=metrics,
    filter_query="tsse_score > 5 and n_unique > 1000",
)
# 4. Select most accessible features
ga.pp.select_features(adata, n_features=500_000)
# ── Tools ──────────────────────────────────────────────────────────────────
# 5. Spectral embedding
ga.tl.spectral(adata)
# 6. Call peaks
ga.tl.call_peaks(adata, groupby="cell_type", parquet_path="sample.parquet")
# 7. Motif accessibility deviations (chromVAR)
ga.tl.chromvar(
    adata,
    genome_fasta="GRCh38.fa",
    motifs_path="cisBP_human.meme",
)
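The filter_query string in step 3 reads as a boolean expression over per-cell QC columns. The sketch below shows what that expression would select, assuming it is evaluated row by row with ordinary comparison semantics (this page does not confirm how GATAC parses it). The barcodes, values, and the passes_qc helper are hypothetical and are not part of the GATAC API.

```python
# Hypothetical per-cell QC records. The column names mirror the filter_query
# in the overview; the barcodes and values are invented for illustration.
cells = [
    {"barcode": "AAAC", "tsse_score": 7.2, "n_unique": 4200},
    {"barcode": "AAAG", "tsse_score": 3.1, "n_unique": 9800},
    {"barcode": "AACT", "tsse_score": 9.5, "n_unique": 650},
    {"barcode": "AAGT", "tsse_score": 5.0, "n_unique": 1500},
]


def passes_qc(cell, min_tsse=5, min_unique=1000):
    """Plain-Python reading of 'tsse_score > 5 and n_unique > 1000'.

    Both comparisons are strict, so a cell sitting exactly on a
    threshold is dropped.
    """
    return cell["tsse_score"] > min_tsse and cell["n_unique"] > min_unique


# Keep only the barcodes whose metrics satisfy both conditions.
kept = [cell["barcode"] for cell in cells if passes_qc(cell)]
print(kept)
```

Under this reading, a cell must clear both thresholds to be kept, so a high fragment count does not compensate for a low TSS enrichment score, or the reverse.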