longitudinal_ecg_analysis.dataset_curators package

Submodules

longitudinal_ecg_analysis.dataset_curators.curate_dataset_hh module

curate_dataset_hh.py

A placeholder module, to be filled in later.

longitudinal_ecg_analysis.dataset_curators.curate_dataset_hh.curate_dataset_hh()

Placeholder function.

longitudinal_ecg_analysis.dataset_curators.curate_dataset_mcmed module

curate_dataset_mcmed.py

Curates the MC-MED dataset for analysis.

longitudinal_ecg_analysis.dataset_curators.curate_dataset_mcmed.check_mcmed_dataset_files(settings)

Check the presence of required files and directories for the MC-MED dataset.

This function verifies that the expected input files and folders exist in the directory specified by settings["paths"]["dataset_root_raw_folder"]. It checks for:

  • A CSV file named ‘visits.csv’

  • A CSV file named ‘waveform_summary.csv’

  • A folder named ‘waveforms’

These are essential for processing the dataset. If any of the required files or folders are missing, a FileNotFoundError is raised.

Parameters:

settings (dict) – A dictionary containing file path settings. It must include "dataset_root_raw_folder" under settings["paths"].

Returns:

The updated settings dictionary with paths for the necessary files and folders added.

Return type:

dict

Raises:

FileNotFoundError – If any required file or folder does not exist.

longitudinal_ecg_analysis.dataset_curators.curate_dataset_mcmed.compute_past_future_flags(group)
longitudinal_ecg_analysis.dataset_curators.curate_dataset_mcmed.create_var_info()
longitudinal_ecg_analysis.dataset_curators.curate_dataset_mcmed.curate_dataset_mcmed(settings)

Curate the MC-MED dataset for analysis. The dataset is freely available at: https://doi.org/10.13026/xgx1-7x47

Parameters:

settings (dict) – Dataset settings loaded from a settings file.

Returns:

None. Writes the prepared data to disk.

longitudinal_ecg_analysis.dataset_curators.curate_dataset_mcmed.extract_standard_dataset_variables(settings)

Extract dataset variables in a standardised format.

Parameters:

settings (dict) – A dictionary of settings.

Returns:

Writes the following prepared data files to disk: …

longitudinal_ecg_analysis.dataset_curators.curate_dataset_mcmed.identify_waveform_recording_file_paths(rec_link_with_rec_id_orig, settings)

Create file paths for waveform recordings and save them to a CSV.

Parameters:

settings (dict) – Dictionary containing paths.

Returns:

longitudinal_ecg_analysis.dataset_curators.curate_dataset_mcmed.merge_df_waves_into_df(df_waves, df)
longitudinal_ecg_analysis.dataset_curators.curate_dataset_mcmed.reformat_variables(df, up)

longitudinal_ecg_analysis.dataset_curators.curate_dataset_music module

curate_dataset_music.py

Curates the MUSIC (Sudden Cardiac Death in Chronic Heart Failure) dataset for analysis.

longitudinal_ecg_analysis.dataset_curators.curate_dataset_music.check_music_dataset_files(settings)

Check the presence of required files and directories for the MUSIC dataset.

This function verifies that the expected input files and folders exist in the directory specified by settings["paths"]["input_dir"]. It checks for:

  • A CSV file named ‘subject-info.csv’

  • A folder named ‘Holter_ECG’

These are essential for processing the MUSIC dataset. If any of the required files or folders are missing, a FileNotFoundError is raised.

Parameters:

settings (dict) – A dictionary containing file path settings. It must include "input_dir" under settings["paths"].

Returns:

The updated settings dictionary with paths for "subj-info-csv" and "holter_ecg_folder" added to settings["paths"].

Return type:

dict

Raises:

FileNotFoundError – If any required file or folder does not exist.
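
The check described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the function name check_music_files_sketch is hypothetical, and storing the resolved paths as strings is an assumption. The file and folder names and the settings keys are those documented for this function.

```python
from pathlib import Path


def check_music_files_sketch(settings):
    """Hypothetical stand-in illustrating the documented check."""
    input_dir = Path(settings["paths"]["input_dir"])
    subj_info_csv = input_dir / "subject-info.csv"
    holter_ecg_folder = input_dir / "Holter_ECG"

    # Fail early with a message naming the missing item.
    if not subj_info_csv.is_file():
        raise FileNotFoundError(f"Required file not found: {subj_info_csv}")
    if not holter_ecg_folder.is_dir():
        raise FileNotFoundError(f"Required folder not found: {holter_ecg_folder}")

    # Record the resolved locations (stored as str here, by assumption).
    settings["paths"]["subj-info-csv"] = str(subj_info_csv)
    settings["paths"]["holter_ecg_folder"] = str(holter_ecg_folder)
    return settings
```

A caller would typically wrap the call in a try/except FileNotFoundError block so that a missing download is reported before any processing starts.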

longitudinal_ecg_analysis.dataset_curators.curate_dataset_music.curate_dataset_music(settings)

Curate the MUSIC (Sudden Cardiac Death in Chronic Heart Failure) dataset for analysis. The dataset is available at: https://doi.org/10.13026/fa8p-he52

Parameters:

settings (dict) – Dataset settings loaded from a settings file.

Returns:

None. Writes the prepared data to disk.

longitudinal_ecg_analysis.dataset_curators.curate_dataset_music.extract_standard_dataset_variables(settings)

Extract clinical, outcome and dataset variables in a standardised format.

Parameters:

settings (dict) – A dictionary of settings, including settings["paths"]["subj-info-csv"], the path of subject-info.csv.

Returns:

Writes the following prepared data files to disk:

  • standard-clinical-metrics.csv: A CSV file containing clinical metrics for each subject.

  • standard-outcome-variables.csv: A CSV file containing outcome variables for each subject.

  • standard-dataset-variables.csv: A CSV file containing variables describing the dataset.

longitudinal_ecg_analysis.dataset_curators.curate_dataset_music.identify_ECG_recording_file_paths(settings)

Create file paths for ECG recordings and save them to a CSV.

Parameters:

settings (dict) – Dictionary containing paths, including ‘standard-clinical-metrics-csv’, ‘holter_ecg_folder’ and ‘signal-filepaths-csv’.

Returns:

DataFrame with columns ‘subj_id’ and ‘filepath’.

Return type:

pd.DataFrame
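
As a rough illustration of the mapping this function produces, the sketch below pairs subject identifiers with recording paths and writes them to a CSV with the documented columns "subj_id" and "filepath". It uses only the standard library, whereas the documented function returns a pd.DataFrame. The helper names, the example identifiers, and the assumption that each recording is named after its subject with a ".hea" extension are hypothetical and may not match the dataset layout.

```python
import csv
from pathlib import Path


def build_filepath_rows(subj_ids, holter_ecg_folder, extension=".hea"):
    """Hypothetical helper: one row per subject.

    Assumes each recording is stored as <subj_id><extension> in the folder.
    """
    folder = Path(holter_ecg_folder)
    return [
        {"subj_id": subj_id, "filepath": str(folder / f"{subj_id}{extension}")}
        for subj_id in subj_ids
    ]


def write_filepath_csv(rows, csv_path):
    """Hypothetical helper: save the rows with the documented column names."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["subj_id", "filepath"])
        writer.writeheader()
        writer.writerows(rows)


# Example identifiers are illustrative only.
rows = build_filepath_rows(["P0001", "P0002"], "Holter_ECG")
```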

Module contents