ehrapy.io.read_csv
- ehrapy.io.read_csv(dataset_path, sep=',', index_column=None, columns_obs_only=None, columns_x_only=None, return_dfs=False, cache=False, backup_url=None, download_dataset_name=None, archive_format=None, **kwargs)
Reads a single csv/tsv file or a directory of csv/tsv files, downloading the data first if needed.
- Parameters:
dataset_path (Path | str) – Path to the file or directory to read.
sep (str) – Separator in the file. Either , (comma) or \t (tab). Defaults to , (comma).
index_column (Union[dict[str, str | int], str, int, None]) – The index column of obs. Usually the patient visit ID or the patient ID.
columns_obs_only (Union[dict[str, list[str]], list[str], None]) – These columns will be added to obs only and not X.
columns_x_only (Union[dict[str, list[str]], list[str], None]) – These columns will be added to X only and all remaining columns to obs. Note that datetime columns will always be added to .obs though.
return_dfs (bool) – Whether to return one or several Pandas DataFrames.
cache (bool) – Whether to write to cache when reading or not. Defaults to False.
download_dataset_name (Optional[str]) – Name of the file or directory after download.
backup_url (Optional[str]) – URL to download the data file(s) from, if the dataset is not yet on disk.
is_archive – Whether the downloaded file is an archive.
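The column-routing parameters can be easier to picture with a small standalone sketch. The code below is not ehrapy's implementation. It uses only the standard library, and `split_columns` is a hypothetical helper that mimics the documented behaviour of `columns_obs_only` and `columns_x_only`. It assumes that the two options are mutually exclusive and that, with `columns_obs_only`, all remaining columns go to X.

```python
import csv
import io


def split_columns(header, columns_obs_only=None, columns_x_only=None):
    """Hypothetical helper: partition column names into (x_columns, obs_columns)."""
    # Assumption of this sketch: only one of the two options may be given.
    if columns_obs_only and columns_x_only:
        raise ValueError("Pass only one of columns_obs_only or columns_x_only.")
    if columns_x_only:
        # Listed columns go to X, everything else goes to obs.
        x_cols = [c for c in header if c in columns_x_only]
        obs_cols = [c for c in header if c not in columns_x_only]
    else:
        # Listed columns go to obs, everything else is assumed to go to X.
        obs_only = columns_obs_only or []
        obs_cols = [c for c in header if c in obs_only]
        x_cols = [c for c in header if c not in obs_only]
    return x_cols, obs_cols


# A small tab-separated table, parsed the way sep="\t" would imply.
raw = "patient_id\tage\tsex\tglucose\n1\t54\tF\t5.1\n2\t61\tM\t6.3\n"
reader = csv.reader(io.StringIO(raw), delimiter="\t")
header = next(reader)
rows = list(reader)

# Treat the first column as the index (compare index_column="patient_id").
x_cols, obs_cols = split_columns(header[1:], columns_obs_only=["sex"])
print(x_cols)    # columns routed to X
print(obs_cols)  # columns routed to obs
```

The sketch ignores the special handling of datetime columns, which the documentation says are always added to .obs.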
- Return type:
- Returns:
An AnnData object, or a dict that maps an identifier (the filename, without extension) to each AnnData object.
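The shape of the dict returned for a directory can be illustrated without ehrapy. The following is a minimal sketch using only the standard library. `read_directory` is a hypothetical stand-in that stores parsed rows where `read_csv` would store AnnData objects, and keys each table by its filename without extension.

```python
import csv
import tempfile
from pathlib import Path


def read_directory(directory, sep=","):
    """Hypothetical stand-in: map each file's stem to its parsed rows."""
    tables = {}
    for path in sorted(Path(directory).iterdir()):
        # Only pick up csv/tsv files, as described for read_csv.
        if path.suffix not in {".csv", ".tsv"}:
            continue
        with path.open(newline="") as handle:
            tables[path.stem] = list(csv.reader(handle, delimiter=sep))
    return tables


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "visits.csv").write_text("patient_id,age\n1,54\n")
    Path(tmp, "labs.csv").write_text("patient_id,glucose\n1,5.1\n")
    tables = read_directory(tmp)

# Keys are the filenames without their extension.
print(sorted(tables))
```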
Examples
>>> import ehrapy as ep
>>> adata = ep.io.read_csv("myfile.csv")