Data Scientist
medium ds-missing-data
How do you handle missing data and when is imputation risky?
Answer
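First understand why data is missing: missing completely at random (MCAR, unrelated to any variable), missing at random (MAR, explained by other observed variables), or missing not at random (MNAR, dependent on the unobserved value itself).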
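A quick first diagnostic is to compare the missing rate across groups of an observed variable. The sketch below uses only the standard library; the `segment` and `income` fields and the helper name `missing_rate_by_group` are hypothetical, and the rows are made up for illustration.

```python
from collections import defaultdict

def missing_rate_by_group(rows, group_key, target_key):
    """Share of rows with a missing (None) target, per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [missing, total]
    for row in rows:
        tally = counts[row[group_key]]
        tally[0] += row[target_key] is None
        tally[1] += 1
    return {g: missing / total for g, (missing, total) in counts.items()}

# Synthetic rows: income is missing far more often in segment B.
rows = [
    {"segment": "A", "income": 52}, {"segment": "A", "income": 48},
    {"segment": "A", "income": None}, {"segment": "A", "income": 50},
    {"segment": "B", "income": None}, {"segment": "B", "income": None},
    {"segment": "B", "income": 61}, {"segment": "B", "income": None},
]
rates = missing_rate_by_group(rows, "segment", "income")
print(rates)
```

A clear gap between groups is evidence against MCAR. It cannot separate MAR from MNAR, because that distinction depends on values that were never observed; domain knowledge about how the data was collected has to fill that gap.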
Options:
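- Drop rows or columns (when few values are affected and the missingness looks random)
- Impute (mean/median for simple cases, KNN or model-based when other features predict the missing value)
- Add missingness indicators (so the model can learn from the fact that a value was missing)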
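The second and third options are often combined. Here is a minimal standard-library sketch of median imputation paired with a missingness indicator; the function name `impute_median_with_indicator` and the `ages` column are hypothetical, and `None` is assumed to mark a missing value.

```python
from statistics import median

def impute_median_with_indicator(values):
    """Return (imputed_values, missing_flags) for a list that uses None for missing."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)  # computed from observed values only
    imputed = [fill if v is None else v for v in values]
    flags = [1 if v is None else 0 for v in values]  # 1 marks an imputed entry
    return imputed, flags

ages = [34, None, 29, 41, None, 38]
ages_imputed, age_missing = impute_median_with_indicator(ages)
print(ages_imputed)  # [34, 36.0, 29, 41, 36.0, 38]
print(age_missing)   # [0, 1, 0, 0, 1, 0]
```

In a modeling pipeline the fill value should be computed on the training split only and reused on validation and test data, otherwise information leaks across the split.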
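Imputation is risky when missingness is informative (MNAR), because the filled-in values are then systematically biased, or when it creates false confidence: a single imputed value is treated as if it were observed, which understates variance and uncertainty. Always validate downstream impact, for example by comparing model metrics and key estimates across different missing-data strategies.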
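Both risks can be seen in a small simulation. The sketch below is illustrative only: the data is synthetic, and the MNAR mechanism (values above 55 go missing 80% of the time) is an assumption chosen to make the effect visible.

```python
import random
from statistics import mean, pstdev

random.seed(0)

# Synthetic "true" values that would be seen if nothing were missing.
true_values = [random.gauss(50, 10) for _ in range(10_000)]

# Assumed MNAR mechanism: high values are far more likely to be missing.
mask = [v > 55 and random.random() < 0.8 for v in true_values]

# Mean imputation using only the observed values.
observed = [v for v, missing in zip(true_values, mask) if not missing]
fill = mean(observed)
imputed = [fill if missing else v for v, missing in zip(true_values, mask)]

print(f"true mean    {mean(true_values):.2f}  std {pstdev(true_values):.2f}")
print(f"imputed mean {mean(imputed):.2f}  std {pstdev(imputed):.2f}")
```

The imputed column has a lower mean than the true one (bias from MNAR) and a smaller standard deviation (the false confidence), even though it contains no missing values and looks clean.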
Related Topics
Data Cleaning, Statistics, Data Science