Predictive mean matching
Predictive mean matching (PMM)[1] is a widely used[2] statistical imputation method for missing values, first proposed by Donald B. Rubin in 1986[3] and R. J. A. Little in 1988.[4]
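It aims to reduce the bias that imputation introduces into a dataset by filling each missing entry with a real value drawn from the observed data.[5] For each observation with a missing value, a small set of candidate donors is formed from the complete observations whose predicted values of the outcome variable are closest to the predicted value for the incomplete observation. One donor is then drawn at random from this set, and its observed value replaces the missing one.[1]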
Compared to other imputation methods, it usually imputes fewer implausible values (e.g. negative incomes), because every imputed value is one that was actually observed, and it handles heteroscedastic data more appropriately.[6]
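The donor-matching step can be illustrated with a minimal Python sketch. It is not the implementation of any particular package: the function name `pmm_impute`, the parameters `k` and `rng`, and the simulated income data are all hypothetical, and the prediction model is assumed to be ordinary least squares. The sketch also leaves out the random draw of regression parameters that multiple-imputation implementations typically add, so it shows only a single deterministic-model imputation.

```python
import numpy as np

def pmm_impute(X, y, k=5, rng=None):
    """Fill NaNs in y by a simplified predictive mean matching (illustrative only)."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]

    missing = np.isnan(y)
    observed = ~missing

    # Fit a linear model (with intercept) on the complete cases only.
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A[observed], y[observed], rcond=None)

    # Predicted means for every row, observed or not.
    pred = A @ beta
    y_obs = y[observed]
    pred_obs = pred[observed]
    k = min(k, y_obs.size)

    y_imputed = y.copy()
    for i in np.flatnonzero(missing):
        # Donors: the k complete cases whose predictions are closest to row i's.
        donors = np.argsort(np.abs(pred_obs - pred[i]))[:k]
        # Copy the observed value of one randomly chosen donor.
        y_imputed[i] = y_obs[rng.choice(donors)]
    return y_imputed

# Simulated, strictly positive "income" with 40 of 200 values removed.
demo_rng = np.random.default_rng(0)
x = demo_rng.uniform(0, 10, size=200)
income = np.exp(0.3 * x + demo_rng.normal(0, 0.5, size=200))
income_missing = income.copy()
income_missing[demo_rng.choice(200, size=40, replace=False)] = np.nan

completed = pmm_impute(x, income_missing, k=5, rng=1)
```

Because each imputed entry is copied from a donor, `completed` contains only values that occur among the observed incomes, so it cannot contain a negative income even though the linear model itself could predict one.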
References
[1] "3.4 Predictive mean matching". stefvanbuuren.name. Retrieved 30 June 2019.
[2] "Web of Science [v.5.32] – All Databases Results". apps.webofknowledge.com. Retrieved 30 June 2019.
[3] Rubin, Donald B. (30 June 1986). "Statistical Matching Using File Concatenation with Adjusted Weights and Multiple Imputations". Journal of Business & Economic Statistics. 4 (1): 87–94. doi:10.2307/1391390. JSTOR 1391390.
[4] Little, Roderick J. A. (30 June 1988). "Missing-Data Adjustments in Large Surveys". Journal of Business & Economic Statistics. 6 (3): 287–296. doi:10.2307/1391878. JSTOR 1391878.
[5] "Imputation by Predictive Mean Matching: Promise & Peril – Statistical Horizons". statisticalhorizons.com. Retrieved 30 June 2019.
[6] "Predictive Mean Matching Imputation (Example in R)". Statistics Globe. Retrieved 18 September 2020.