This repo contains all code supporting the analysis for our manuscript "Distributional bias compromises leave-one-out cross-validation".

| Folder/file | Description |
|---|---|
| Simulations.py | This script runs all of the simulation analyses referenced in our manuscript. |
| CFS | Contains the code used to replicate the original CFS analysis by Vogl et al. (https://www.science.org/doi/10.1126/sciadv.abq2422), along with our additional RLOOCV implementation. All data used within this folder are available with the original publication, from https://www.science.org/doi/suppl/10.1126/sciadv.abq2422/suppl_file/sciadv.abq2422_supporting_data_and_code.zip |
| PTB | Contains the code used to benchmark LOOCV and RLOOCV linear models on data from Fettweis et al. (https://www.nature.com/articles/s41591-019-0450-2), using the publicly available processed data from Huang et al. (https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-023-01702-2). The data used in this folder are available for download at https://github.com/hczdavid/metaManuscript/tree/main/Analyses/Data |
| ICI-CD4 | Contains the code used to replicate the original ICI-CD4 analysis by Lozano et al. (https://www.nature.com/articles/s41591-021-01623-z), along with our additional RLOOCV implementation. All data used within this folder are available with the original publication, along with the code used to replicate the original work's findings. The original data are available from https://doi.org/10.25936/f3np-k536, under the "Supporting code is available here." download option. |
| delong.py | Code to run the DeLong test. This file was obtained from https://github.com/yandexdataschool/roc_comparison |
| simulations_with_signal | Contains code to implement and evaluate LOOCV and RLOOCV on generated datasets with some true signal between X and y. All data used are downloaded within the analysis script, using the UCI Machine Learning Repository Python package. |
| plots-latest | Contains all images, figures, and tables referenced in our manuscript. All analysis code writes its outputs into this folder. |
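
For readers new to the topic, the following minimal sketch illustrates the kind of distributional bias the manuscript title refers to. It is not taken from `Simulations.py`. It uses only the Python standard library, has no features at all, and lets a "model" predict the mean label of its training fold. The helper names (`auc`, `loocv_train_means`, `rebalanced_train_means`) are hypothetical, and the rebalancing shown (dropping one training sample of the opposite class to the held-out sample) is an assumption about one way to remove the bias, not necessarily the RLOOCV implementation used in this repo.

```python
import random


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def loocv_train_means(y):
    """For each held-out sample, the mean label of the remaining n - 1 samples."""
    n, total = len(y), sum(y)
    return [(total - yi) / (n - 1) for yi in y]


def rebalanced_train_means(y):
    """Same as above, but one training sample of the opposite class is also dropped.

    Assumption: this is an illustrative rebalancing, not the repo's RLOOCV code.
    """
    means = []
    for i, yi in enumerate(y):
        train = y[:i] + y[i + 1:]   # leave sample i out
        train.remove(1 - yi)        # drop one sample of the opposite class
        means.append(sum(train) / len(train))
    return means


random.seed(0)
y = [0] * 20 + [1] * 30  # binary labels, no features and no signal
random.shuffle(y)

# Standard LOOCV: the training mean is lower whenever the held-out label is 1,
# so every positive is ranked below every negative.
print(auc(y, loocv_train_means(y)))       # 0.0

# Rebalanced: the training mean no longer depends on the held-out label.
print(auc(y, rebalanced_train_means(y)))  # 0.5
```

With no signal present, an unbiased evaluation should give an AUC of 0.5. The standard leave-one-out predictions instead give 0.0, because removing the held-out sample shifts the training-set label mean in the opposite direction of the held-out label.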