ehrapy.preprocessing.simple_impute
- ehrapy.preprocessing.simple_impute(adata, var_names=None, strategy='mean', copy=False, warning_threshold=30)
Impute missing values in numerical data using mean/median/most frequent imputation.
- Parameters:
adata (AnnData) – The annotated data matrix to impute missing values on.
var_names (Optional[Iterable[str]]) – A list of column names to apply imputation on (if None, impute all columns).
strategy (Literal['mean', 'median', 'most_frequent']) – Imputation strategy to use. One of {‘mean’, ‘median’, ‘most_frequent’}.
warning_threshold (int) – Display a warning message if the percentage of missing values exceeds this threshold. Defaults to 30.
copy (bool) – Whether to return a copy of adata or modify it in place. Defaults to False.
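To make the strategy and warning_threshold parameters concrete, here is a minimal standard-library sketch of what the three strategies compute on a single column. It is not ehrapy's implementation: the helper names (is_missing, missing_percentage, impute_column) are hypothetical, missing values are assumed to be None or NaN, and the sketch does no validation of the strategy name.

```python
import math
import statistics
import warnings

# Hypothetical mapping from strategy name to the statistic used as fill value.
FILL_FUNCTIONS = {
    "mean": statistics.mean,
    "median": statistics.median,
    "most_frequent": statistics.mode,
}


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_percentage(values):
    """Percentage of entries in the column that are missing."""
    missing = sum(1 for v in values if is_missing(v))
    return 100.0 * missing / len(values)


def impute_column(values, strategy="mean", warning_threshold=30):
    """Return a new list with missing entries replaced by the chosen statistic."""
    pct = missing_percentage(values)
    if pct > warning_threshold:
        warnings.warn(f"{pct:.1f}% of values are missing (threshold: {warning_threshold}%)")
    # The fill value is computed from the observed (non-missing) entries only.
    observed = [v for v in values if not is_missing(v)]
    fill = FILL_FUNCTIONS[strategy](observed)
    return [fill if is_missing(v) else v for v in values]


column = [1.0, 2.0, None, 2.0, 7.0]
mean_imputed = impute_column(column, strategy="mean")
median_imputed = impute_column(column, strategy="median")
mode_imputed = impute_column(column, strategy="most_frequent")
```

With one of five values missing, the missing percentage is 20, which stays below the default threshold of 30, so no warning is emitted in this example.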
- Return type:
- Returns:
An updated AnnData object with imputed values.
- Raises:
ValueError – If the selected imputation strategy is not applicable to the data.
ValueError – If an unknown imputation strategy is provided.
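The two error conditions above can be illustrated with a small pre-check. This is only a sketch under stated assumptions: the documentation does not say exactly when a strategy counts as "not applicable", so the rule used here (mean and median require numerical values) is an assumption, and check_strategy and KNOWN_STRATEGIES are hypothetical names, not part of ehrapy.

```python
from numbers import Number

KNOWN_STRATEGIES = ("mean", "median", "most_frequent")


def check_strategy(values, strategy):
    """Raise ValueError for an unknown strategy or one that does not fit the data."""
    if strategy not in KNOWN_STRATEGIES:
        raise ValueError(
            f"Unknown imputation strategy {strategy!r}; expected one of {KNOWN_STRATEGIES}."
        )
    # Assumption: missing entries are None and are ignored for the type check.
    observed = [v for v in values if v is not None]
    numeric = all(isinstance(v, Number) and not isinstance(v, bool) for v in observed)
    # Assumption: only mean and median need numerical data.
    if strategy in ("mean", "median") and not numeric:
        raise ValueError(
            f"Strategy {strategy!r} is not applicable to non-numerical data."
        )


check_strategy([1.0, 2.5, None], "median")        # passes silently
check_strategy(["a", "b", None], "most_frequent")  # passes silently
```

A caller that cannot guarantee numerical columns could wrap the imputation call in a try/except ValueError block and fall back to 'most_frequent'.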
Examples
>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=True)
>>> ep.pp.simple_impute(adata, strategy="median")