Composite regression estimation for repeated business surveys with zero-inflated variables
Jul 25, 09:55
In longitudinal surveys, composite regression estimation is a common extension of the GREG (generalized regression) estimator that exploits the partial overlap of samples between successive survey waves. The basic idea is to use the estimated totals of key survey variables from previous waves as additional calibration totals for the current wave. Since most survey variables are positively auto-correlated over time, calibrating to previous estimated totals has the potential to reduce the standard error of estimates of change. For survey units in the “overlap” sample, the values of key variables in previous waves are known, whereas this information is missing by design for new survey units (the “birth” sample). For these units, the previous values have to be imputed before this “composite” auxiliary variable can be added to the calibration variables. Usually, some kind of mean imputation is applied in this setting.
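The calibration step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes linear (chi-square distance) calibration, an auxiliary vector (1, z_i) in which z_i is the previous-wave value, and imputation of missing z_i by the design-weighted mean of the overlap sample. The function names `solve_2x2` and `composite_weights` and the toy data are hypothetical.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a * lam = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < 1e-12:
        raise ValueError("singular calibration system")
    return ((b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - b[0] * a[1][0]) / det)


def composite_weights(design_w, prev_values, pop_size, prev_total):
    """Linear calibration to (population size, previous estimated total).

    prev_values holds the previous-wave value for overlap units and None
    for birth units. Missing values are mean-imputed (an assumption of
    this sketch) before the composite variable z enters the calibration.
    """
    # Design-weighted mean of the previous-wave values in the overlap sample.
    overlap = [(d, v) for d, v in zip(design_w, prev_values) if v is not None]
    mean_prev = sum(d * v for d, v in overlap) / sum(d for d, _ in overlap)
    z = [mean_prev if v is None else v for v in prev_values]

    # Weighted cross-products of the auxiliary vector x_i = (1, z_i).
    s00 = sum(design_w)
    s01 = sum(d * zi for d, zi in zip(design_w, z))
    s11 = sum(d * zi * zi for d, zi in zip(design_w, z))

    # Solve for lam so that the calibrated weights reproduce both totals.
    lam = solve_2x2([[s00, s01], [s01, s11]],
                    [pop_size - s00, prev_total - s01])
    weights = [d * (1.0 + lam[0] + lam[1] * zi) for d, zi in zip(design_w, z)]
    return weights, z


# Toy data: four overlap units (two with zero vacancies) and two births.
design_w = [10.0] * 6
prev_values = [0, 0, 3, 7, None, None]
weights, z = composite_weights(design_w, prev_values,
                               pop_size=62.0, prev_total=140.0)
```

By construction, the calibrated weights sum to `pop_size` and the weighted sum of `z` equals `prev_total`. In this toy example every birth unit receives the same positive imputed value even though half of the overlap units report zero, which is the kind of mismatch that arises when mean imputation is applied to a zero-inflated variable.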
We investigate composite regression estimation for job vacancy surveys, where the key survey variable (the number of job vacancies) is zero-inflated. We argue that mean imputation is suboptimal for a zero-inflated variable, and we compare different imputation strategies in terms of their effect on the variances of the estimates of level and change produced by composite regression estimation.