# Model Validation Considerations
```{r setup09, include=FALSE}
library(Publish)
```
## Reliability of outcome labels
- Many clinical outcomes come from reliable data sources.
    - Death is considered a 'hard' outcome: the data usually come from a death registry and are rarely subject to much error.
- However, other clinical outcomes are considered 'soft' outcomes (e.g., subject to misclassification or misremembering).
    - The EDSS scale in multiple sclerosis may be measured differently across jurisdictions or clinical training practices.
    - The number of relapses over the past 2 years may be subject to recall bias.
```{block, type='rmdcomment'}
Unsurprisingly, the quality of the outcome data matters when building a good prediction model.
```
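
To make the point about 'soft' outcomes concrete, here is a minimal simulated sketch in base R. Everything in it is an assumption made for illustration: the single predictor `x`, the 20% label-flip rate, and the helper `auc()` are hypothetical and do not come from any real clinical data set. It compares the test-set discrimination of a logistic model when the outcome is recorded without error against the case where a share of the labels is misclassified.

```r
# Simulated illustration only: no real clinical data are used.
set.seed(123)
n <- 2000
x <- rnorm(n)                                    # one hypothetical predictor
y_true <- rbinom(n, 1, plogis(-0.5 + 1.5 * x))   # 'hard' outcome, no recording error

# 'Soft' outcome: flip an assumed 20% of labels at random
flip <- rbinom(n, 1, 0.20) == 1
y_soft <- ifelse(flip, 1 - y_true, y_true)

dat <- data.frame(x = x, y_true = y_true, y_soft = y_soft)
train <- dat[1:1000, ]
test <- dat[1001:2000, ]

# Rank-based (Mann-Whitney) estimate of the AUC / c-statistic
auc <- function(score, label) {
  r <- rank(score)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Same model, fitted and evaluated on each version of the outcome
fit_hard <- glm(y_true ~ x, family = binomial, data = train)
fit_soft <- glm(y_soft ~ x, family = binomial, data = train)

auc_hard <- auc(predict(fit_hard, newdata = test), test$y_true)
auc_soft <- auc(predict(fit_soft, newdata = test), test$y_soft)

round(c(hard = auc_hard, soft = auc_soft), 3)
```

In this simulation the apparent discrimination is lower with the noisy labels even though the underlying relationship between `x` and the true outcome is unchanged. The size of the drop depends on the assumed error rate and on whether errors are random, which real misclassification often is not.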
## Robustness of the results
- Blindly believing or reporting the ML results might not be useful. Researchers should always seek an explanation for the analysis results, particularly when the results are **surprising or unexpected**.
- It is always a good idea to ask a **clinical expert** to assess and scrutinize the results.
- Analysts should be skeptical of the results when there is a **large gap** between training-set performance (after tuning) and test-set performance.
- Analysts should always care about the **reproducibility and repeatability** of the analysis results, which requires keeping a good log of the parameters, hyperparameters, and computational settings used.
- Internal validity is helpful, but **external validity** should also be assessed when possible. Models developed under laboratory conditions often do not perform well in real-world clinical settings.
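
Two of the checks above, the training/test gap and the log of settings, can be sketched in a few lines of base R. This is an illustrative simulation under stated assumptions, not a recommended workflow: the `settings` values, the helpers `simulate()` and `rmse()`, and the deliberately over-parameterised linear model are all hypothetical choices made to produce a visible gap.

```r
# Record every setting that influences the result (all values are illustrative)
settings <- list(seed = 2024, n_train = 60, n_test = 1000, n_noise = 40)
set.seed(settings$seed)

# Simulate one real predictor ('signal') plus p pure-noise predictors
simulate <- function(n, p) {
  X <- matrix(rnorm(n * p), n, p,
              dimnames = list(NULL, paste0("z", seq_len(p))))
  signal <- rnorm(n)
  data.frame(y = 2 * signal + rnorm(n), signal = signal, X)
}

train <- simulate(settings$n_train, settings$n_noise)
test <- simulate(settings$n_test, settings$n_noise)

# Root mean squared prediction error of a fitted model on a data set
rmse <- function(fit, data) {
  sqrt(mean((data$y - predict(fit, newdata = data))^2))
}

# Deliberately over-parameterised: 41 predictors fitted to 60 rows
fit <- lm(y ~ ., data = train)

perf <- c(train = rmse(fit, train), test = rmse(fit, test))
gap <- unname(perf["test"] - perf["train"])

# Keep a log of the settings, the results, and the software version
run_log <- list(
  settings = settings,
  performance = perf,
  gap = gap,
  r_version = R.version.string
)
str(run_log)
```

Here the test error is clearly larger than the training error, which is the kind of gap that should prompt skepticism. Because the seed and sample sizes are stored in `run_log`, rerunning the script with the same settings regenerates the same data and the same numbers. In a real analysis the log would also hold tuned hyperparameters and package versions (for example, from `sessionInfo()`).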