ehrapy.preprocessing.mice_forest_impute
- ehrapy.preprocessing.mice_forest_impute(adata, var_names=None, *, warning_threshold=70, save_all_iterations=True, random_state=None, inplace=False, iterations=5, variable_parameters=None, verbose=False, copy=False)
Impute data using miceforest, a fast, memory-efficient implementation of Multiple Imputation by Chained Equations (MICE) with lightgbm.
See https://github.com/AnotherSamWilson/miceforest.
- Parameters:
adata (AnnData) – The AnnData object containing the data to impute.
var_names (Iterable[str] | None) – A list of variable names to impute. If None, impute all variables.
warning_threshold (int) – Percentage of missing values above which a warning is displayed. Defaults to 70.
save_all_iterations (bool) – Whether to save the imputed values from all iterations or only the latest. Saving all iterations allows for additional plotting but may use more memory. Defaults to True.
random_state (int | None) – Random state used to make the imputation reproducible. Defaults to None.
inplace (bool) – If True, modify the input AnnData object in-place and return None. If False, return a copy of the modified AnnData object. Defaults to False.
iterations (int) – The number of iterations to run. Defaults to 5.
variable_parameters (dict | None) – Model parameters specified per variable. Keys should be variable names or indices, and each value should be a dict of parameters that apply to that variable only. Defaults to None.
verbose (bool) – Whether to print information about the imputation process. Defaults to False.
copy (bool) – Whether to return a copy of the AnnData object or modify it in-place. Defaults to False.
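To illustrate what warning_threshold measures, the sketch below computes the percentage of missing values per variable and lists the variables above the threshold. It is plain Python, not ehrapy's implementation: the helper names, the toy data, and the strict `>` comparison are all assumptions made for the example.

```python
import math


def missing_percentages(rows, var_names):
    """Return the percentage of missing (None or NaN) entries per variable."""

    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    percentages = {}
    for col, name in enumerate(var_names):
        n_missing = sum(is_missing(row[col]) for row in rows)
        percentages[name] = 100.0 * n_missing / len(rows)
    return percentages


def variables_above_threshold(rows, var_names, warning_threshold=70):
    """List variables whose missing percentage exceeds the threshold.

    Assumption: a strict '>' comparison; the library may compare differently.
    """
    pct = missing_percentages(rows, var_names)
    return [name for name in var_names if pct[name] > warning_threshold]


# Hypothetical toy table: four observations, two variables.
rows = [
    (63.0, None),
    (float("nan"), None),
    (48.0, None),
    (71.0, 24.5),
]
var_names = ["age", "bmi"]

flagged = variables_above_threshold(rows, var_names)
print(flagged)
```

Here one of four "age" values and three of four "bmi" values are missing, so only "bmi" passes the default threshold of 70.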
- Return type:
AnnData
- Returns:
The imputed AnnData object.
Examples
>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=True)
>>> ep.ad.infer_feature_types(adata)
>>> ep.pp.mice_forest_impute(adata)
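For intuition about the chained-equations idea behind MICE and the role of the iterations parameter, here is a minimal, self-contained sketch. It is not how miceforest or ehrapy work internally. It swaps the lightgbm models for a one-predictor least-squares line, handles exactly two columns, and is deterministic, whereas real MICE implementations add randomness to the imputed values. All names and data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def chained_impute(columns, iterations=5):
    """Toy chained-equations imputation for exactly two numeric columns.

    `columns` maps a variable name to a list of floats, with None marking
    a missing entry. Observed values are never changed.
    """
    names = list(columns)
    assert len(names) == 2, "this sketch supports two variables only"
    missing = {n: [v is None for v in columns[n]] for n in names}

    # Step 1: initialise every missing entry with the observed column mean.
    filled = {}
    for n in names:
        observed = [v for v in columns[n] if v is not None]
        mean = sum(observed) / len(observed)
        filled[n] = [mean if v is None else v for v in columns[n]]

    # Step 2: repeatedly re-predict each variable from the other one.
    for _ in range(iterations):
        for target, predictor in (names, names[::-1]):
            # Train only on rows where the target was actually observed.
            train = [i for i, m in enumerate(missing[target]) if not m]
            slope, intercept = fit_line(
                [filled[predictor][i] for i in train],
                [filled[target][i] for i in train],
            )
            # Overwrite the current guesses for the missing target entries.
            for i, m in enumerate(missing[target]):
                if m:
                    filled[target][i] = slope * filled[predictor][i] + intercept
    return filled


# Hypothetical data following y = 2x + 1, with one gap in each column.
data = {
    "x": [1.0, 2.0, 3.0, 4.0, None],
    "y": [3.0, 5.0, None, 9.0, 13.0],
}
imputed = chained_impute(data, iterations=5)
print(imputed["x"][4], imputed["y"][2])
```

Each pass refines the previous guesses, so the imputed values move from the initial column means toward values consistent with the relationship between the two variables.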