Monday, 03 December 2012

Hot Deck Imputation for Missing Values

BY ALBERT ANTHONY D. GAVINO

Hot Deck Imputation

This method sorts respondents and non-respondents into a number of imputation subsets according to a user-specified set of covariates. An imputation subset comprises cases that share the same values on the user-specified covariates. Missing values are then replaced with values taken from matching respondents (i.e. respondents that are similar with respect to the covariates).
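As a rough illustration of how cases fall into imputation subsets, here is a minimal Python sketch. The function names, the covariates (sex, region), and the sample data are all hypothetical and are not taken from SOLAS; a missing value is represented by None.

```python
from collections import defaultdict


def build_imputation_subsets(cases, covariates):
    """Group case indices by their values on the chosen covariates."""
    subsets = defaultdict(list)
    for index, case in enumerate(cases):
        key = tuple(case[name] for name in covariates)
        subsets[key].append(index)
    return dict(subsets)


def respondents_in(subset, cases, target):
    """Cases in a subset whose target value is observed (potential donors)."""
    return [i for i in subset if cases[i][target] is not None]


# Hypothetical data: None marks a missing income value.
cases = [
    {"sex": "F", "region": "N", "income": 52},
    {"sex": "F", "region": "N", "income": None},
    {"sex": "M", "region": "S", "income": 40},
    {"sex": "M", "region": "S", "income": None},
    {"sex": "F", "region": "N", "income": 61},
]

subsets = build_imputation_subsets(cases, ["sex", "region"])
donors = respondents_in(subsets[("F", "N")], cases, "income")
```

In this sketch the non-respondent at index 1 shares a subset with two respondents (indices 0 and 4), so either of their values could be used to impute it.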
If there is more than one matching respondent for any particular non-respondent, the user has two choices:
  1. The value of the first respondent found by counting downwards from the missing entry within the imputation subset is used to impute. The reason is that this respondent’s value may be closer in time to the case that has the missing value. For example, if cases are entered in the order in which they occur, some studies may show a time effect.
  2. A respondent’s value is randomly selected from within the imputation subset.
If a matching respondent does not exist in the initial imputation subset, the subset is collapsed one level at a time, starting with the last variable that was selected as a sort variable, until a match is found. If no matching respondent is found even after all of the sort variables have been collapsed, three options are available:
Re-specify new sort variables
  • The user can specify up to five sort variables.
Perform random overall imputation
  • The missing value is replaced with a value randomly selected from the observed values of that variable.
Do not impute the missing value
  • SOLAS will not impute any missing values for which no matching respondent is found.
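The procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the SOLAS implementation: the function names, the method and fallback parameter values, and the sample data are hypothetical. Two behaviours are assumptions of this sketch: when no respondent lies below the missing entry, the search wraps to the top of the subset, and collapsing stops once a single sort variable remains.

```python
import random


def matching_donors(cases, i, target, sort_vars):
    """Indices of respondents that match case i on every variable in sort_vars."""
    return [
        j for j, other in enumerate(cases)
        if other[target] is not None
        and all(other[v] == cases[i][v] for v in sort_vars)
    ]


def pick_donor(donors, i, method, rng):
    """Choice 1: first respondent below the missing entry; choice 2: random."""
    if method == "random":
        return rng.choice(donors)
    below = [j for j in donors if j > i]
    # Assumption: wrap to the top of the subset when no respondent lies below.
    return below[0] if below else donors[0]


def hot_deck(cases, target, sort_vars, method="first", fallback="none", seed=0):
    """Return a copy of cases with missing target values imputed where possible."""
    rng = random.Random(seed)
    observed = [c[target] for c in cases if c[target] is not None]
    result = [dict(c) for c in cases]
    for i, case in enumerate(cases):
        if case[target] is not None:
            continue
        # Collapse one level at a time by dropping the last sort variable.
        for level in range(len(sort_vars), 0, -1):
            donors = matching_donors(cases, i, target, sort_vars[:level])
            if donors:
                donor = pick_donor(donors, i, method, rng)
                result[i][target] = cases[donor][target]
                break
        else:
            # No match at any level: random overall imputation, or leave missing.
            if fallback == "random_overall" and observed:
                result[i][target] = rng.choice(observed)
    return result


# Hypothetical data: None marks a missing income value.
cases = [
    {"sex": "F", "region": "N", "income": 52},
    {"sex": "F", "region": "N", "income": None},
    {"sex": "F", "region": "N", "income": 61},
    {"sex": "M", "region": "S", "income": None},
    {"sex": "M", "region": "N", "income": 40},
    {"sex": "X", "region": "W", "income": None},
]

imputed = hot_deck(cases, "income", ["sex", "region"], method="first")
```

With this data, the second case takes its value from the respondent directly below it, the fourth case finds a donor only after region is dropped, and the last case has no donor at any level, so it stays missing unless fallback="random_overall" is passed. Re-specifying the sort variables corresponds to calling hot_deck again with a different sort_vars list.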