# Model Validation Considerations

```{r setup09, include=FALSE}
require(Publish)
```

## Reliability of outcome labels

- Many clinical outcomes come from reliable data sources.
  - Death is considered a 'hard' outcome: the data usually come from a death registry and are rarely subject to much error.
- However, other clinical outcomes are considered 'soft' outcomes (e.g., subject to misclassification or misremembering).
  - The Expanded Disability Status Scale (EDSS) score in multiple sclerosis may be measured differently across jurisdictions or clinical training practices.
  - The number of relapses over the past 2 years may be subject to recall bias.

```{block, type='rmdcomment'}
Unsurprisingly, the quality of the outcome data matters when building a good prediction model.
```
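
As a rough illustration of why outcome quality matters, the sketch below simulates a binary outcome, flips a fraction of the recorded training labels at random to mimic a 'soft' outcome, and compares a logistic regression fitted on the clean labels with one fitted on the noisy labels. Everything here is an assumption made for illustration: the simulated data, the 20% flip rate, the seed, and all object names (`y_true`, `y_noisy`, `fit_clean`, `fit_noisy`, `brier`) are hypothetical and do not come from any real study.

```{r label-noise-sketch}
set.seed(123)  # arbitrary seed, used only so the sketch is repeatable

# Simulate one predictor and a binary outcome that truly depends on it
n <- 3000
x <- rnorm(n)
y_true <- rbinom(n, size = 1, prob = plogis(2 * x))

# Mimic a 'soft' outcome: flip roughly 20% of the recorded labels at random
flip <- rbinom(n, size = 1, prob = 0.2) == 1
y_noisy <- ifelse(flip, 1 - y_true, y_true)

dat <- data.frame(x = x, y_true = y_true, y_noisy = y_noisy)
train <- dat[1:2000, ]
test <- dat[2001:3000, ]

# Same model, fitted once on clean labels and once on noisy labels
fit_clean <- glm(y_true ~ x, family = binomial, data = train)
fit_noisy <- glm(y_noisy ~ x, family = binomial, data = train)

# Brier score on the held-out data, evaluated against the true outcome
brier <- function(fit) {
  p <- predict(fit, newdata = test, type = "response")
  mean((p - test$y_true)^2)
}

slope_clean <- unname(coef(fit_clean)["x"])
slope_noisy <- unname(coef(fit_noisy)["x"])
brier_clean <- brier(fit_clean)
brier_noisy <- brier(fit_noisy)

round(c(slope_clean = slope_clean, slope_noisy = slope_noisy,
        brier_clean = brier_clean, brier_noisy = brier_noisy), 3)
```

In this simulation, the model trained on the noisy labels is expected to show a slope pulled toward zero and a larger (worse) Brier score than the model trained on the clean labels. With real data the true outcome is usually unavailable, so a comparison like this is only possible in simulation or when a validated subsample exists.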

## Robustness of the results

- Blindly believing or reporting ML results is rarely useful. Researchers should always seek an explanation for the analysis results, particularly when the results are somewhat **surprising or unexpected**.
- It is always a good idea to ask a **clinical expert** to assess and scrutinize the results.
- Analysts should always be skeptical of results when there is a **large gap** between performance on the training set (after tuning) and on the test set, since such a gap often indicates overfitting.
- Analysts should always care about the **reproducibility and repeatability** of the analysis results, which requires keeping a good log of the parameters, hyperparameters, and computational settings that were used.
- Internal validity is helpful, but **external validity** should also be assessed when possible. Models developed under laboratory conditions often do not perform well in real-world clinical settings.
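
Two of the points above, the training versus test gap and the logging of settings, can be sketched in a few lines of base R. This is a minimal sketch under stated assumptions, not a recommended workflow: the data are simulated, the seed, the 70/30 split, and the polynomial degrees are arbitrary choices, and the names `rmse`, `fit_simple`, `fit_flexible`, and `run_log` are hypothetical.

```{r train-test-gap-sketch}
seed <- 2024   # arbitrary; stored in the log below so the split can be repeated
set.seed(seed)

# Simulated data: a smooth signal plus noise
n <- 200
dat <- data.frame(x = runif(n, min = -2, max = 2))
dat$y <- sin(dat$x) + rnorm(n, sd = 0.5)

# 70/30 split into training and test sets
train_id <- sample(n, size = round(0.7 * n))
train <- dat[train_id, ]
test <- dat[-train_id, ]

# Root mean squared error of a fitted model on a given data set
rmse <- function(fit, data) {
  sqrt(mean((data$y - predict(fit, newdata = data))^2))
}

# A modest model and a deliberately over-flexible one
fit_simple <- lm(y ~ poly(x, 3), data = train)
fit_flexible <- lm(y ~ poly(x, 15), data = train)

perf <- data.frame(
  model = c("simple", "flexible"),
  train_rmse = c(rmse(fit_simple, train), rmse(fit_flexible, train)),
  test_rmse = c(rmse(fit_simple, test), rmse(fit_flexible, test))
)
perf$gap <- perf$test_rmse - perf$train_rmse  # a large gap is a warning sign

# Keep a record of the settings needed to repeat this run
run_log <- list(
  seed = seed,
  train_fraction = 0.7,
  poly_degrees = c(simple = 3, flexible = 15),
  r_version = R.version.string,
  performance = perf
)
perf
```

The flexible model can never fit the training data worse than the simple one, because the simple model is nested within it, so a low training error on its own says little. The `gap` column and the saved `run_log` are what make the comparison interpretable and repeatable later.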