
SG HEALTHCARE AI

DATATHON & EXPO


  • Data for the Datathon

    De-identified Real-world Healthcare Datasets

  • Datasets

     

    Reminder: Before the datathon, teams must apply for and obtain access to the datasets they intend to use.

     

    1. Electronic Medical Records Datasets

    During the datathon, teams will have access to three de-identified EMR datasets and may use one or all of them to answer their clinical questions: 1) the Medical Information Mart for Intensive Care (MIMIC)-IV database from PhysioNet; 2) the Philips eICU Collaborative Research Database (https://eicu-crd.mit.edu/); and 3) the VitalDB dataset (https://vitaldb.net/dataset/). These three databases share similar data schemas. They contain hourly physiologic readings from bedside monitors, validated by ICU nurses, together with records of demographics, labs, nursing progress notes, discharge summaries, IV medications, fluid balance, and other clinical variables.
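    Teams that combine readings from more than one of these datasets may find it useful to align them on a common hourly grid first. The following is only a minimal sketch of that idea using the Python standard library. The row layout (stay_id, charted_at, heart_rate) and the sample values are hypothetical and are not taken from the MIMIC-IV, eICU-CRD, or VitalDB schemas; please consult each dataset's documentation for the actual table and column names.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical rows: (stay_id, charted_at, heart_rate).
# These names and values are illustrative assumptions only; they do not
# come from any of the three dataset schemas.
ROWS = [
    ("stay-1", "2021-01-01 08:05:00", 82.0),
    ("stay-1", "2021-01-01 08:40:00", 86.0),
    ("stay-1", "2021-01-01 09:10:00", 90.0),
    ("stay-2", "2021-01-01 08:30:00", 70.0),
]


def hourly_mean(rows):
    """Average the readings that fall into each (stay, hour) bucket."""
    buckets = defaultdict(list)
    for stay_id, charted_at, value in rows:
        ts = datetime.strptime(charted_at, "%Y-%m-%d %H:%M:%S")
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(stay_id, hour)].append(value)
    return {key: mean(values) for key, values in sorted(buckets.items())}


if __name__ == "__main__":
    for (stay_id, hour), value in hourly_mean(ROWS).items():
        print(stay_id, hour.isoformat(), value)
```

    Averaging is just one possible way to summarise an hour; a median or the last recorded value may suit some clinical questions better.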

    MIMIC-IV Dataset

    Introduction & Access Application: https://mimic-iv.mit.edu/

     

    Github repository: https://github.com/MIT-LCP/mimic-iv

     

    Documentation: https://mimic-iv.mit.edu/docs/

     

    When using this resource, please cite:
    Johnson, A., Bulgarelli, L., Pollard, T., Horng, S., Celi, L. A., & Mark, R. (2020). MIMIC-IV (version 0.4). PhysioNet. https://doi.org/10.13026/a3wn-hq05.


    Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.

    eICU-CRD Dataset

    Introduction & Documentation: https://eicu-crd.mit.edu/about/eicu/

     

    Github repository: https://github.com/mit-eicu/eicu-code

     

    Example code: https://github.com/mit-eicu/eicu-code/blob/master/concepts/icustay_detail.sql

     

    When using this resource, please cite:
    Pollard, T., Johnson, A., Raffa, J., Celi, L. A., Badawi, O., & Mark, R. (2019). eICU Collaborative Research Database (version 2.0). PhysioNet. https://doi.org/10.13026/C2WM1R.

     

    The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Pollard TJ, Johnson AEW, Raffa JD, Celi LA, Mark RG and Badawi O. Scientific Data (2018). DOI: http://dx.doi.org/10.1038/sdata.2018.178.

     

    Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.

    VitalDB Dataset

    Introduction & Documentation: https://vitaldb.net/dataset/

     

    Data Summary: https://vitaldb.net/dataset/?query=overview&documentId=13qqajnNZzkN7NZ9aXnaQ-47NWy7kx-a6gbrcEsi-gak&sectionId=h.1fo5zknztqnw

     

    API Documentation: https://vitaldb.net/docs/

     

    Example code: https://github.com/vitaldb/examples/

     

    When using this resource, please cite:

    Lee, Hyung-Chul, and Chul-Woo Jung. "Vital Recorder—a free research tool for automatic recording of high-resolution time-synchronised physiological data from multiple anaesthesia devices." Scientific reports 8.1 (2018): 1-8.

    National Sleep Research Resource Datasets

    Introduction & Documentation: https://sleepdata.org/about

     

    Data Summary: https://sleepdata.org/datasets

     

    API Documentation: https://sleepdata.org/tools

     

    Forum: https://sleepdata.org/forum

     

    !!! Note: These datasets require individual registration and approval from NSRR, so we will not be able to host them on our cloud for every team. Teams that would like to use these datasets should apply for approval and host the datasets locally themselves. Thank you for your understanding.

  • 2. Medical Imaging Datasets

    A large collection of medical imaging datasets will be provided to all teams.

    MIMIC CXR Dataset

    Introduction & Documentation: https://physionet.org/content/mimic-cxr/2.0.0/

     

    When using this resource, please cite:

    Johnson AE, Pollard TJ, Berkowitz SJ, Greenbaum NR, Lungren MP, Deng CY, Mark RG, Horng S. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Scientific Data. 2019;6.

    3D Medical Image Dataset from Medical Segmentation Decathlon

    Introduction & Documentation: http://medicaldecathlon.com/

     

    When using this resource, please cite:

    https://arxiv.org/abs/1902.09063

    NIH Chest X-ray dataset

    Introduction & Documentation: https://www.kaggle.com/nih-chest-xrays/data

     

    When using this resource, please cite:

    http://openaccess.thecvf.com/content_cvpr_2017/papers/Wang_ChestX-ray8_Hospital-Scale_Chest_CVPR_2017_paper.pdf

    AutoImplant2020

    Introduction & Documentation: https://autoimplant.grand-challenge.org/

     

    When using this resource, please cite:

    https://arxiv.org/pdf/2006.00980.pdf

    VerSe2019

    Introduction & Documentation: https://osf.io/nqjyw/

     

    When using this resource, please cite:

    https://arxiv.org/pdf/2001.09193.pdf

    VerSe2020

    Introduction & Documentation: https://osf.io/t98fz/

     

    When using this resource, please cite:

    https://arxiv.org/pdf/2001.09193.pdf

    MICCAI 2020 RibFrac Challenge: Rib Fracture Detection and Classification

    Introduction & Documentation: https://ribfrac.grand-challenge.org/

     

    When using this resource, please cite:

    Deep-Learning-Assisted Detection and Segmentation of Rib Fractures from CT Scans: Development and Validation of FracNet (in press)

    EMIDEC: Automatic Evaluation of Myocardial Infarction from Delayed-Enhancement Cardiac MRI

    Introduction & Documentation: http://emidec.com/

     

    When using this resource, please cite:

    https://www.mdpi.com/2306-5729/5/4/89

    Multimodal Brain Tumor Segmentation Challenge 2020: Data

    Introduction & Documentation: https://www.med.upenn.edu/cbica/brats2020/data.html

     

    When using this resource, please cite:

    https://pubmed.ncbi.nlm.nih.gov/25494501/

    https://pubmed.ncbi.nlm.nih.gov/28872634/

    https://arxiv.org/abs/1811.02629

    HECKTOR challenge

    Introduction & Documentation: https://www.aicrowd.com/challenges/miccai-2020-hecktor

     

    When using this resource, please cite: https://github.com/voreille/hecktor

     

    Chest X-Ray Images (Pneumonia)

    Introduction & Documentation: https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia

     

    When using this resource, please cite: http://www.cell.com/cell/fulltext/S0092-8674(18)30154-5

     

    Diabetic Retinopathy Detection

    Introduction & Documentation: https://www.kaggle.com/c/diabetic-retinopathy-detection/data

     

    When using this resource, please cite: http://www.eyepacs.com/

     

    Messidor

    Introduction & Documentation: http://www.adcis.net/en/third-party/messidor/

     

    When using this resource, please cite:

    http://www.ias-iss.org/ojs/IAS/article/view/1155
    http://dx.doi.org/10.5566/ias.1155.

     

    SARAS

    Introduction & Documentation: https://saras-mesad.grand-challenge.org/

     

    When using this resource, please cite:

    https://arxiv.org/abs/2104.03178
    https://arxiv.org/abs/2006.07164

     

    Stanford CheXpert dataset

    Introduction & Documentation: https://stanfordmlgroup.github.io/competitions/chexpert/

     

    When using this resource, please cite:

    https://arxiv.org/abs/1901.07031?utm_medium=email&utm_source=transaction

    Chula RBC-12-Dataset

    Introduction & Documentation: https://github.com/Chula-PIC-Lab/Chula-RBC-12-Dataset

     

    When using this resource, please cite:

    https://arxiv.org/abs/2012.01321

     

SINGAPORE HEALTHCARE AI DATATHON AND EXPO 2021
