Analysing the longitudinal aspects of the UK Structural Earnings Survey
Jul 27, 09:00
Although most UK business surveys have only an element of longitudinal design, the UK’s structural earnings survey, ASHE (Annual Survey of Hours and Earnings), has a true longitudinal design. The design has remained largely unchanged since its inception as the New Earnings Survey in the 1970s, though methodological improvements, such as calibration, were introduced when it was launched in its current form in 2004.
The survey design is based on an un-clustered sample of employees drawn directly from administrative tax records, which are matched with employers through the business register. The sample is a one per cent sample of all employees who are within a Pay As You Earn (PAYE) scheme registered with Her Majesty’s Revenue and Customs (HMRC). The longitudinal aspect arises from using the same unique reference numbers within the schemes each year. Information on employees’ pay, hours worked, pension contributions, job descriptions etc. is obtained directly from employers via the survey.
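As a rough illustration of how reusing the same reference numbers each year yields a panel, the sketch below selects records whose reference number ends in a fixed two-digit suffix, which gives roughly a one per cent sample. The suffix "14", the `REF000000` identifier format and the function names are all hypothetical and chosen for illustration only; they are not the actual ASHE selection rule.

```python
def in_sample(ref, suffixes=frozenset({"14"})):
    """Return True if the last two digits of the reference match a fixed suffix.

    One suffix out of 100 possible two-digit endings gives a ~1% sample.
    The suffix "14" is an arbitrary, hypothetical choice.
    """
    digits = "".join(ch for ch in ref if ch.isdigit())
    return digits[-2:] in suffixes


def draw_sample(refs):
    """Apply the same selection rule to any year's list of reference numbers."""
    return {r for r in refs if in_sample(r)}


# Hypothetical reference numbers present in the tax records in two years.
year1 = [f"REF{n:06d}" for n in range(0, 1000)]
year2 = [f"REF{n:06d}" for n in range(500, 1500)]

sample1 = draw_sample(year1)
sample2 = draw_sample(year2)

# Employees selected in both years form the longitudinal panel.
panel = sample1 & sample2
print(len(sample1), len(sample2), len(panel))
```

Because the rule depends only on the reference number, anyone selected in one year is selected again in every later year in which they appear in the records, while new entrants with a matching suffix join the sample automatically.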
Pressure to use administrative data to replace some survey data and improve survey design, and the expected improvement in access to regular employee tax data, have led to extensive analysis of over thirteen years of longitudinal ASHE data. Although administrative data could improve the quality and frequency of median pay estimates, they are unlikely to provide good-quality replacements for many of the important variables (such as hours worked) used to estimate pay rates and to provide detailed breakdowns by job type. Because the administrative data should contain high-quality pay information that corresponds well with the survey information, exploratory regression analysis has used the survey pay variable and other key variables with appropriate covariates.
Further analysis on a matched set of administrative data for the latest three years has explored response bias issues, since non-responders’ pay, industry of business and some demographic characteristics are available through the linked administrative data. Regression analysis between the survey and administrative data pay values for survey responders has also been completed.
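The two checks described above can be sketched as follows: compare administrative pay for responders and non-responders, then regress survey pay on administrative pay for responders. This is a minimal sketch with invented toy values, not ASHE or HMRC figures; the record layout and the `ols` helper are assumptions, and a real analysis would also include covariates and survey weights.

```python
from statistics import mean, median

# Toy linked records: (admin_pay, survey_pay); survey_pay is None for
# non-responders. All values are invented for illustration.
records = [
    (20000, 20500),
    (25000, 25500),
    (30000, 30500),
    (35000, None),
    (40000, 40500),
    (15000, None),
]


def ols(x, y):
    """Simple least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


responders = [(a, s) for a, s in records if s is not None]
resp_admin = [a for a, _ in responders]
resp_survey = [s for _, s in responders]
nonresp_admin = [a for a, s in records if s is None]

# Response bias check: administrative pay is known for everyone, so the
# median can be compared between responders and non-responders.
median_gap = median(resp_admin) - median(nonresp_admin)

# Agreement check: regress survey pay on administrative pay for responders.
slope, intercept = ols(resp_admin, resp_survey)
print(median_gap, slope, intercept)
```

A slope near 1 and an intercept near 0 would suggest that the two sources measure pay consistently, while a large median gap would point to non-response that is related to pay.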
The full analysis of variables not yet used in official statistical outputs has revealed new insights into how commuting distances and demographic factors relate to employee remuneration, and how these relationships have changed over time.
We will present the lessons learnt and our recommendations for a future design that fully integrates administrative and survey data to produce better statistics to inform labour market policies. We will also present the interesting by-products of this work.