Data Scientist
medium | ds-missing-data

How do you handle missing data and when is imputation risky?

Answer

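First understand why the data is missing: missing completely at random (MCAR), missing at random (MAR, where missingness is explained by other observed variables), or missing not at random (MNAR, where missingness depends on the unobserved value itself).

Options:
- Drop rows or columns (if the impact is small)
- Impute (mean/median, KNN, model-based)
- Add missingness indicators

Imputation is risky when missingness is informative (MNAR), because the filled-in values are then systematically biased, or when it creates false confidence by treating imputed values as if they had been observed. Always validate the downstream impact, for example by comparing results across several handling strategies.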
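
The risk described above can be illustrated with a small simulation. This is a minimal sketch, not a recommended pipeline: the income data is synthetic, the 60,000 threshold and drop rates are arbitrary assumptions, and the helper `impute_median_with_indicator` is a hypothetical name. It uses only the Python standard library; real work would more likely use pandas or scikit-learn.

```python
import random
import statistics


def impute_median_with_indicator(values):
    """Fill None with the median of observed values and return 0/1 missingness flags."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = statistics.median(observed)
    imputed = [fill if v is None else v for v in values]
    was_missing = [1 if v is None else 0 for v in values]
    return imputed, was_missing


# Synthetic incomes (hypothetical data); fixed seed so the run is reproducible.
rng = random.Random(0)
true_income = [rng.gauss(50_000, 15_000) for _ in range(10_000)]

# MCAR: every value has the same 30% chance of being missing.
mcar = [None if rng.random() < 0.3 else x for x in true_income]

# MNAR: high incomes are usually withheld, so missingness depends on
# the unobserved value itself (threshold and rate are assumptions).
mnar = [None if (x > 60_000 and rng.random() < 0.8) else x for x in true_income]

mcar_imputed, mcar_flag = impute_median_with_indicator(mcar)
mnar_imputed, mnar_flag = impute_median_with_indicator(mnar)

# Compare the imputed column against the (normally unknowable) truth.
true_mean = statistics.fmean(true_income)
mcar_bias = statistics.fmean(mcar_imputed) - true_mean
mnar_bias = statistics.fmean(mnar_imputed) - true_mean

true_sd = statistics.pstdev(true_income)
mnar_sd = statistics.pstdev(mnar_imputed)

print(f"bias in mean under MCAR: {mcar_bias:,.0f}")
print(f"bias in mean under MNAR: {mnar_bias:,.0f}")
print(f"std dev, true vs MNAR-imputed: {true_sd:,.0f} vs {mnar_sd:,.0f}")
```

Under MCAR the imputed mean stays close to the truth, while under MNAR it is pulled well below it, and the imputed column also looks less variable than the real data, which is one form of false confidence. The returned flags can be passed to a downstream model as an extra feature so it can learn from the missingness pattern itself.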

Related Topics

Data Cleaning, Statistics, Data Science