Methodology of Longitudinal Surveys II


Estimation based on longitudinal data with informative missing

Type: Monograph Paper
Jul 26, 11:00
  • Zahoor Ahmad - University of Southampton
  • Li-Chun Zhang - University of Southampton

Longitudinal studies are common in medicine, psychology, sociology and other fields. A key strength is the ability to measure change in an outcome over time. However, for various reasons, one or more of the measurements in the sequence taken from the same individual are likely to be missing. When the probability that a value is missing depends on the unobserved value of the outcome, the missing-data mechanism is said to be nonignorable or informative. A similar definition also applies to sampling, viewed as an additional initial step of the ‘missing’ process. The analysis of longitudinal data with informative missing values has received serious attention in the last thirty years. Typically, fully parametric models are developed for both the missing-data mechanism and the outcome variable of interest. This requires specific functional assumptions that may be cumbersome to formulate and prone to misspecification. Estimation may be numerically complicated, or even infeasible with large datasets.

In this work we develop a new non-parametric estimating equation approach to estimation based on longitudinal data. To accommodate potentially informative missing data, each unit is allowed its own unknown observation propensity. This includes the case where the units are initially selected under informative sampling or under other complex sampling designs. The outcome values are also treated non-parametrically as constants, just as in the design-based approach to survey sampling. Under this set-up, the observation propensity is estimated using the individual-specific observation history. The estimating equation based on these estimated observation propensities can then be used to estimate cross-sectional parameters, or parameters that are defined over time, such as the change between two successive time points or the regression coefficients involving outcomes over time. Compared to alternative fully parametric or semi-parametric approaches, our approach is simple in construction and easy to compute. We prove that the estimator is consistent under suitable regularity conditions and develop suitable methods of variance estimation. The theoretical properties of such a non-parametric estimating equation approach have not previously been established in the literature, nor is the approach known to have been applied in practice.
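The abstract does not give the form of the estimator, so the following is only a minimal sketch of the general idea under strong simplifying assumptions, not the authors' method. It assumes each unit has a single propensity that is constant over waves and estimates it by the fraction of waves in which the unit was observed. It then solves the weighted estimating equation sum_i (r_it / p_i)(y_it - theta) = 0 for a wave mean theta. The function names and the array layout are hypothetical.

```python
import numpy as np


def estimate_propensities(r):
    """Per-unit observation propensity from the unit's own response history.

    r: (n_units, n_waves) array of 0/1 response indicators.
    ASSUMPTION: the propensity is constant over waves, so the observed
    fraction of waves is used as its estimate.
    """
    r = np.asarray(r, dtype=float)
    return r.mean(axis=1)


def weighted_wave_mean(y, r, t):
    """Solve sum_i (r_it / p_i) * (y_it - theta) = 0 for theta at wave t.

    y: (n_units, n_waves) outcomes; entries where r == 0 are ignored
       and may be NaN.
    """
    y = np.asarray(y, dtype=float)
    r = np.asarray(r, dtype=float)
    p_hat = estimate_propensities(r)
    observed = r[:, t] == 1
    if not observed.any():
        raise ValueError("no unit was observed at wave t")
    # A unit observed at wave t has p_hat >= 1 / n_waves, so this is safe.
    w = 1.0 / p_hat[observed]
    return float(np.sum(w * y[observed, t]) / np.sum(w))


def weighted_change(y, r, t0, t1):
    """Estimated change in the wave mean between waves t0 and t1."""
    return weighted_wave_mean(y, r, t1) - weighted_wave_mean(y, r, t0)


if __name__ == "__main__":
    # Toy data: 3 units, 2 waves; NaN marks a missing measurement.
    y = np.array([[10.0, 12.0], [20.0, np.nan], [np.nan, 30.0]])
    r = np.array([[1, 1], [1, 0], [0, 1]])
    print(weighted_wave_mean(y, r, 0), weighted_change(y, r, 0, 1))
```

Units that are never observed contribute nothing here, and the sketch does not address variance estimation or the regularity conditions mentioned above.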

We will further extend this approach to multiple variables of interest at each wave, where item nonresponse can be dealt with variable by variable. We will also investigate the possibility of a two-phase extension of our approach, where the response probability of an item is given as the product of the unit response probability and the conditional item response probability. We are also looking to apply the approach to real data, such as the Labour Force Survey (LFS) with its rotating panel design or short-term business panel surveys.
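The two-phase factorisation can be written compactly as below. The notation is an assumption introduced here for illustration and does not come from the paper: i indexes units, j items and t waves.

```latex
% Assumed notation: \pi_{ijt} is the probability that item j of unit i
% is observed at wave t.
\pi_{ijt} \;=\; \pi_{it}^{\text{unit}} \, \pi_{j \mid it}^{\text{item}}
```

Here the first factor is the probability that unit i responds at wave t, and the second is the probability that item j is answered given that the unit responds.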

