ehrapy.io.read_csv
- ehrapy.io.read_csv(dataset_path, sep=',', index_column=None, columns_obs_only=None, columns_x_only=None, return_dfs=False, cache=False, backup_url=None, download_dataset_name=None, archive_format=None, **kwargs)
Reads or downloads a desired directory of csv/tsv files or a single csv/tsv file.
- Parameters:
  - dataset_path (Path | str) – Path to the file or directory to read.
  - sep (str) – Separator in the file. Either , (comma) or \t (tab). Defaults to , (comma).
  - index_column (dict[str, str | int] | str | int | None) – The index column of obs. Usually the patient visit ID or the patient ID.
  - columns_obs_only (dict[str, list[str]] | list[str] | None) – These columns will be added to obs only and not to X.
  - columns_x_only (dict[str, list[str]] | list[str] | None) – These columns will be added to X only, and all remaining columns to obs. Note that datetime columns will always be added to .obs.
  - return_dfs (bool) – Whether to return one or several pandas DataFrames.
  - cache (bool) – Whether to write to the cache when reading. Defaults to False.
  - download_dataset_name (str | None) – Name of the file or directory after download.
  - backup_url (str | None) – URL to download the data file(s) from, if the dataset is not yet on disk.
  - is_archive – Whether the downloaded file is an archive.
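The routing rules for columns_obs_only and columns_x_only can be pictured with a small standard-library sketch. This is not ehrapy's implementation: split_columns and its datetime_columns argument are hypothetical names introduced for illustration, and two behaviours are assumptions rather than documented facts (that the two options are mutually exclusive, and that with neither option set all non-datetime columns go to X).

```python
from __future__ import annotations


def split_columns(
    columns: list[str],
    columns_obs_only: list[str] | None = None,
    columns_x_only: list[str] | None = None,
    datetime_columns: frozenset[str] = frozenset(),
) -> tuple[list[str], list[str]]:
    """Return (x_columns, obs_columns) for the documented routing rules.

    Hypothetical helper for illustration only; not part of ehrapy.
    """
    if columns_obs_only and columns_x_only:
        # Assumption: the two options are treated as mutually exclusive.
        raise ValueError("Pass either columns_obs_only or columns_x_only, not both.")

    if columns_x_only:
        # Listed columns go to X, except datetime columns, which always go to obs.
        x_set = set(columns_x_only) - datetime_columns
    else:
        # Everything goes to X except the obs-only and datetime columns.
        # Assumption: with neither option set, all non-datetime columns land in X.
        x_set = set(columns) - set(columns_obs_only or []) - datetime_columns

    # Preserve the original column order in both outputs.
    x_columns = [c for c in columns if c in x_set]
    obs_columns = [c for c in columns if c not in x_set]
    return x_columns, obs_columns


columns = ["patient_id", "age", "glucose", "admission_date"]
x_cols, obs_cols = split_columns(
    columns,
    columns_obs_only=["patient_id"],
    datetime_columns=frozenset({"admission_date"}),
)
print(x_cols)    # ['age', 'glucose']
print(obs_cols)  # ['patient_id', 'admission_date']
```

In this sketch, the datetime column ends up in obs even though it was not listed in columns_obs_only, which mirrors the note on columns_x_only.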
- Return type:
- Returns:
An AnnData object, or a dict with an identifier (the filename, without extension) for each AnnData object in the dict.
Examples
>>> import ehrapy as ep
>>> adata = ep.io.read_csv("myfile.csv")
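When a directory is read, the returned dict is keyed by an identifier described as the filename without its extension. The sketch below shows what such keys might look like for a directory of csv/tsv files, using only the standard library. dataset_identifiers is a hypothetical helper, the file names are made up, and taking Path.stem as the identifier is an assumption; ehrapy may derive the keys differently.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def dataset_identifiers(directory: Path) -> dict[str, Path]:
    """Map each csv/tsv file in `directory` to its name without extension.

    Hypothetical helper for illustration only; not part of ehrapy.
    """
    return {
        path.stem: path
        for path in sorted(directory.iterdir())
        if path.suffix in {".csv", ".tsv"}
    }


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    # Two small example files, one comma-separated and one tab-separated.
    (tmp_path / "patients.csv").write_text("patient_id,age\n1,54\n")
    (tmp_path / "visits.tsv").write_text("visit_id\tpatient_id\n10\t1\n")
    identifiers = sorted(dataset_identifiers(tmp_path))

print(identifiers)  # ['patients', 'visits']
```

Note that Path.stem strips only the final suffix, so a file such as data.csv.gz would yield data.csv under this assumption.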