{% hint style="warning" %} All extraction module components are currently disabled because they did not pass all the stability checks. {% endhint %}
The Extraction Module is divided into four components. The first three are pages dedicated to simple extraction processes on various data types (images, text notes, and time series) using pre-trained models and/or Python libraries. The fourth component, MEDimage, is the implementation of an open-source Python package designed for medical image processing and radiomic feature extraction.
{% hint style="info" %} In the video below, the "HAIM study" refers to the study of Soenksen et al. {% endhint %}
{% embed url="https://youtu.be/eNKN7H9nwjc?si=zzTJjwQgApcLwKxu" %} Extraction Module Video Tutorial {% endembed %}
Content
- 00:00 Overview
- 00:50 Extraction Images
- 04:24 Extraction Time Series
- 08:29 Extraction Text
- 10:48 Last Word
Data
- PhysioNet : https://physionet.org/
- MIMIC-IV Demo database : https://physionet.org/content/mimic-iv-demo/2.2/
- MIMIC-IV-Note : https://physionet.org/content/mimic-iv-note/2.2/
- MIMIC-CXR-JPG : https://physionet.org/content/mimic-cxr-jpg/2.0.0/
- Study of Soenksen et al. : https://www.nature.com/articles/s41746-022-00689-4
Extraction tools
- TorchXRayVision : https://github.com/mlmed/torchxrayvision
- tsfresh : https://tsfresh.readthedocs.io
- BioBERT : https://arxiv.org/abs/1901.08746
- BioBERT weights (clinicalBERT repository) : https://github.com/EmilyAlsentzer/clinicalBERT
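
As a rough illustration of what "extraction" means for time series, the sketch below turns a set of raw measurements into one fixed-length row of summary features per series, which is the general idea behind libraries such as tsfresh. It is a minimal, standard-library-only example and does not call tsfresh or reproduce the module's actual pipeline. The function name `summarize_series`, the patient identifiers, and the heart-rate values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance


def summarize_series(records):
    """Group (series_id, value) pairs and return one feature dict per id.

    Hypothetical helper for illustration only; not part of the
    Extraction Module or of tsfresh.
    """
    grouped = defaultdict(list)
    for series_id, value in records:
        grouped[series_id].append(float(value))

    features = {}
    for series_id, values in grouped.items():
        # Each series, whatever its length, becomes the same set of features.
        features[series_id] = {
            "mean": mean(values),
            "variance": pvariance(values),  # population variance
            "minimum": min(values),
            "maximum": max(values),
            "length": len(values),
        }
    return features


# Made-up heart-rate readings for two made-up patient ids.
records = [
    ("patient_a", 72), ("patient_a", 75), ("patient_a", 78),
    ("patient_b", 88), ("patient_b", 92),
]
features = summarize_series(records)
print(features["patient_a"])
```

Because every series is reduced to the same keys, the resulting rows can be stacked into a table even when the original series have different lengths.
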
{% content-ref url="image-extraction-page.md" %} image-extraction-page.md {% endcontent-ref %}
{% content-ref url="text-extraction-page.md" %} text-extraction-page.md {% endcontent-ref %}
{% content-ref url="time-series-extraction-page.md" %} time-series-extraction-page.md {% endcontent-ref %}
{% content-ref url="medimage.md" %} medimage.md {% endcontent-ref %}