gatac.pp.select_features


gatac.pp.select_features(adata, n_features=500000, filter_lower_quantile=0.005, filter_upper_quantile=0.005, inplace=True, output_path=None)

GPU-accelerated feature selection for ATAC-seq tile matrices.

For binary matrices, selects the n_features most accessible features (the ArchR approach). For count matrices, selects the n_features most accessible features after excluding the lower and upper quantile tails.
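The two modes described above can be illustrated with a small, dependency-free sketch. This is not gatac's GPU implementation: the helper name `select_features_sketch`, the use of column sums as the accessibility score, the binary check, and the rank-based trimming of the tails are all assumptions made for illustration.

```python
def select_features_sketch(matrix, n_features, lower_q=0.005, upper_q=0.005):
    """Return a boolean mask over the columns of a cells x features matrix.

    Hypothetical helper: the scoring and tail handling are assumptions,
    not gatac's actual implementation.
    """
    n_feat = len(matrix[0]) if matrix else 0

    # Accessibility score per feature: the column sum across all cells.
    totals = [sum(row[j] for row in matrix) for j in range(n_feat)]

    # Treat the matrix as binary when every entry is 0 or 1.
    is_binary = all(v in (0, 1) for row in matrix for v in row)

    candidates = list(range(n_feat))
    if not is_binary:
        # Count matrix: rank features by score and drop both tails.
        ranked = sorted(candidates, key=lambda j: totals[j])
        n_low = int(len(ranked) * lower_q)
        n_high = int(len(ranked) * upper_q)
        candidates = ranked[n_low:len(ranked) - n_high]

    # Keep the n_features highest-scoring remaining features.
    top = sorted(candidates, key=lambda j: totals[j], reverse=True)[:n_features]

    mask = [False] * n_feat
    for j in top:
        mask[j] = True
    return mask


# Binary matrix: quantile arguments are ignored, top 2 columns by sum are kept.
binary = [[1, 0, 1],
          [1, 0, 1],
          [1, 1, 0]]
binary_mask = select_features_sketch(binary, n_features=2)

# Count matrix: exaggerated quantiles so the tails are visible on 4 features.
counts = [[1, 5, 9, 40],
          [2, 3, 8, 60]]
count_mask = select_features_sketch(counts, n_features=1, lower_q=0.25, upper_q=0.25)
```

The quantiles in the count example are deliberately much larger than the 0.005 defaults; with only four features, the defaults would trim nothing.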

Parameters:
adata AnnData

Annotated data matrix of shape cells × features

n_features int

Target number of features to select (default: 500000)

filter_lower_quantile float

Quantile threshold for the lower tail excluded during filtering; ignored for binary matrices (default: 0.005)

filter_upper_quantile float

Quantile threshold for the upper tail excluded during filtering; ignored for binary matrices (default: 0.005)

inplace bool

Whether to modify adata in place (default: True)

output_path str or Path, optional

If provided, save the result to this path

Returns:

adata AnnData or None

Modified AnnData if inplace=False, otherwise None

Examples

>>> import gatac as ga
>>> ga.pp.select_features(adata, n_features=500_000)
>>> # adata.var["selected"] is now a boolean mask of the chosen features