
Methodology of Longitudinal Surveys II

Proper multiple imputation of clustered or panel data

Type: Monograph Paper
Date: Jul 26, 11:00
Room: LTB6
  • Martin Spiess - University of Hamburg, Department of Psychology
  • Kristian Kleinke - University of Bielefeld, Department of Psychology
  • Jost Reinecke - University of Bielefeld, Faculty of Sociology

Allison (2001) states that the best solution to the missing data problem is prevention. This is especially true for complex data sets such as clustered or panel data. Panel data are a subclass of clustered data, and both can be analyzed with multilevel models. Missingness may occur at various levels: in the outcome variable(s), in level-1 predictors, in level-2 predictors or predictors at even higher levels, and finally even in the group identifier(s). Many researchers still handle missingness (e.g. in level-1 and level-2 predictors of multilevel data) by excluding the incomplete cases from the analysis, a wasteful practice that may lead to biased inferences. However, none of the currently existing multiple imputation solutions for complex data can be described as optimal either. On the one hand, many rely rather heavily on strong distributional assumptions, often including homoscedasticity, which are frequently violated in “real life” situations. On the other hand, non- or semiparametric imputation methods often lack justification. Recent papers that contrast and review various strategies to impute complex clustered or panel data are Kleinke, Stemmler, Reinecke, and Lösel (2011), Drechsler (2015), Enders, Mistler, and Keller (2016), Grund, Lüdtke, and Robitzsch (2016), and Lüdtke, Robitzsch, and Grund (2017). Shortcomings of some imputation techniques, or consequences of misspecification even in simple data sets, are considered in, e.g., de Jong, van Buuren, and Spiess (2016) or He and Raghunathan (2009). All in all, missing data in complex data structures, and specifically in panel data sets, is a field where a lot of research still has to be done. Feasible and robust software solutions need to be developed that allow valid inferences even when empirical data do not exactly follow the convenient statistical distributions assumed by the respective procedures (e.g. de Jong, van Buuren, and Spiess, 2016).

The purpose of this paper is (a) to give an overview of recent research on multiple imputation of incomplete clustered or panel data, (b) to discuss advantages and disadvantages of the respective approaches, and (c) to provide practical guidelines on which imputation technique can be expected to work best in a given scenario. To this end, we present results of various Monte Carlo simulations in which we investigate the consequences of misspecified imputation models for inferences in multilevel models. In particular, we consider distributions of the covariates that differ in skewness and kurtosis, and ignorable missingness mechanisms that differ in their selectivity.
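
To make the simulation setting more concrete, the following is a minimal Monte Carlo sketch in Python. It is not the authors' design: all function names, parameter values, and the missingness model are assumptions chosen for illustration. It generates two-level data with a right-skewed level-1 covariate, deletes covariate values under an ignorable (MAR) mechanism whose selectivity is adjustable, and compares a slope estimate from the full data with the one obtained after excluding incomplete cases. For brevity, a pooled least-squares slope stands in for a multilevel model, and no imputation is performed.

```python
import math
import random
import statistics


def simulate_clustered(rng, n_clusters=50, cluster_size=10, slope=1.0):
    """Two-level data: y_ij = slope * x_ij + u_j + e_ij (all values are assumptions)."""
    rows = []
    for _ in range(n_clusters):
        u = rng.gauss(0.0, 0.5)          # cluster-level random intercept
        for _ in range(cluster_size):
            x = rng.expovariate(1.0)     # right-skewed level-1 covariate
            y = slope * x + u + rng.gauss(0.0, 1.0)
            rows.append((x, y))
    return rows


def drop_incomplete_mar(rng, rows, selectivity=2.0):
    """Keep only rows with observed x; P(x missing) depends on the observed y only (MAR)."""
    kept = []
    for x, y in rows:
        p_missing = 1.0 / (1.0 + math.exp(-selectivity * (y - 1.0)))
        if rng.random() >= p_missing:
            kept.append((x, y))
    return kept


def ols_slope(rows):
    """Pooled least-squares slope of y on x (a stand-in for a multilevel model)."""
    mean_x = statistics.fmean(x for x, _ in rows)
    mean_y = statistics.fmean(y for _, y in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    return sxy / sxx


def run(n_reps=200, seed=1):
    """Return the mean slope estimate from full data and from complete cases."""
    rng = random.Random(seed)
    full, complete_case = [], []
    for _ in range(n_reps):
        rows = simulate_clustered(rng)
        full.append(ols_slope(rows))
        complete_case.append(ols_slope(drop_incomplete_mar(rng, rows)))
    return statistics.fmean(full), statistics.fmean(complete_case)


if __name__ == "__main__":
    full_mean, cc_mean = run()
    print(f"true slope 1.0 | full data {full_mean:.3f} | complete cases {cc_mean:.3f}")
```

In this sketch, raising `selectivity` makes the missingness depend more strongly on the outcome, and replacing `expovariate` with another generator changes the skewness and kurtosis of the covariate. An imputation method under study would replace the case-deletion step.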
