ehrapy.tools.glm
- ehrapy.tools.glm(adata, var_names=None, formula=None, family='Gaussian', missing='none', as_continuous=None)
Create a Generalized Linear Model (GLM) from a formula, a distribution, and AnnData.
See https://www.statsmodels.org/stable/generated/statsmodels.formula.api.glm.html#statsmodels.formula.api.glm
Internally, statsmodels is used to create the GLM from a formula, a distribution, and a dataframe.
- Parameters:
adata (AnnData) – The AnnData object for the GLM model.
var_names (Optional[Iterable[str]]) – A list of var names indicating which columns are used for the GLM model.
family (Literal['Gaussian', 'Binomial', 'Gamma', 'InverseGaussian']) – The distribution family. Available options are ‘Gaussian’, ‘Binomial’, ‘Gamma’, and ‘InverseGaussian’. Defaults to ‘Gaussian’.
missing (Literal['none', 'drop', 'raise']) – How missing values are handled. Available options are ‘none’, ‘drop’, and ‘raise’. If ‘none’, no NaN checking is done. If ‘drop’, any observations with NaNs are dropped. If ‘raise’, an error is raised. Defaults to ‘none’.
as_continuous – A list of var names indicating which columns are continuous rather than categorical. The corresponding columns will be set as type float.
- Return type:
GLM
- Returns:
The GLM model instance.
Examples
>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=False)
>>> formula = 'day_28_flg ~ age'
>>> var_names = ['day_28_flg', 'age']
>>> family = 'Binomial'
>>> glm = ep.tl.glm(adata, var_names, formula, family, missing='drop', as_continuous=['age'])