ehrapy.preprocessing.qc_metrics
- ehrapy.preprocessing.qc_metrics(adata, qc_vars=(), layer=None)
Calculates various quality control metrics.
Uses the original values to calculate the metrics, not the encoded ones. See the Returns section for a more in-depth description of the calculated metrics.
- Parameters:
adata (AnnData) – Annotated data matrix.
qc_vars (Collection[str]) – Optional list of vars to calculate additional metrics for.
layer (str) – Layer to use to calculate the metrics.
- Return type:
- Returns:
Two Pandas DataFrames of all calculated QC metrics for obs and var respectively.
Observation level metrics include:
missing_values_abs: Absolute amount of missing values.
missing_values_pct: Relative amount of missing values in percent.
Feature level metrics include:
missing_values_abs: Absolute amount of missing values.
missing_values_pct: Relative amount of missing values in percent.
mean: Mean value of the features.
median: Median value of the features.
std: Standard deviation of the features.
min: Minimum value of the features.
max: Maximum value of the features.
Examples
>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=True)
>>> obs_qc, var_qc = ep.pp.qc_metrics(adata)
>>> obs_qc["missing_values_pct"].plot(kind="hist", bins=20)
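
To make the listed metrics concrete, the following is a minimal sketch of how the same quantities could be computed with plain pandas on a small, made-up table. It does not call ehrapy and is not the library's implementation. The column names, row labels, and values are hypothetical, and details such as the standard deviation convention (pandas defaults to the sample standard deviation) may differ from what qc_metrics returns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 3 observations (rows) x 2 features (columns).
df = pd.DataFrame(
    {"age": [34.0, np.nan, 58.0], "lactate": [1.2, 2.4, np.nan]},
    index=["p1", "p2", "p3"],
)

# Boolean mask marking missing entries.
missing = df.isna()

# Observation-level metrics: one row per observation.
obs_qc = pd.DataFrame(
    {
        "missing_values_abs": missing.sum(axis=1),
        "missing_values_pct": missing.mean(axis=1) * 100,
    }
)

# Feature-level metrics: one row per feature.
# pandas skips NaN by default in mean/median/std/min/max.
var_qc = pd.DataFrame(
    {
        "missing_values_abs": missing.sum(axis=0),
        "missing_values_pct": missing.mean(axis=0) * 100,
        "mean": df.mean(),
        "median": df.median(),
        "std": df.std(),  # sample standard deviation (ddof=1); an assumption
        "min": df.min(),
        "max": df.max(),
    }
)

print(obs_qc)
print(var_qc)
```

In this sketch, obs_qc is indexed by observation and var_qc by feature, mirroring the two DataFrames described under Returns.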