The motivation for lfsclean is to develop a set of standard functions for processing raw data from the quarterly Labour Force Survey (LFS), an ongoing survey of households that is representative of the UK population and primarily collects demographic and labour market information such as employment status, earnings, and hours of work.
The lfsclean package contains functions which read in the raw data files, process them into clean output variables, and combine all data files into a single output data table. The functions also create real-terms values for nominally valued monetary variables (earnings and wages), allowing the user to select either the CPIH or the RPI as the price index used for deflation.
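As a rough illustration of what converting nominal earnings to real terms involves, the base R sketch below rescales nominal values by a price index. It does not use the lfsclean functions, and it is not necessarily how the package implements the step. The data frames, column names (`nominal_weekly`, `index`, `real_weekly`) and index values are all made up for the example and are not actual CPIH or RPI figures.

```r
# Hypothetical price index with base year 2020 = 100 (not real CPIH or RPI data)
price_index <- data.frame(
  year  = c(2018, 2019, 2020),
  index = c(96, 98, 100)
)

# Hypothetical nominal weekly earnings, one record per year
earnings <- data.frame(
  year           = c(2018, 2019, 2020),
  nominal_weekly = c(480, 490, 500)
)

# Attach the index value for each year, then express earnings in base-year prices
merged <- merge(earnings, price_index, by = "year")
merged$real_weekly <- merged$nominal_weekly * 100 / merged$index

print(merged)
```

Switching between price indices in a sketch like this only means replacing the `index` column; the rescaling step itself is unchanged.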
Note that due to data quality concerns (see below), the LFS is being replaced by the Transformed Labour Force Survey (TLFS). The TLFS will run concurrently with the Annual Population Survey (APS) and the LFS during 2023, after which the LFS and APS will be discontinued (see ONS updates). This R package cleans only the original LFS data, covering 1993 to 2023.
The Labour Force Survey (LFS) is a survey representative of the UK population, with the aim of collecting detailed labour market information. It is used to produce official statistics on unemployment. The survey has been conducted on a quarterly basis since 1992, with data collected in waves. Each sampled cohort of participants remains in the survey for five consecutive quarters (waves 1 to 5). Five cohorts participate at any one time, staggered so that each quarter the cohort that has completed its fifth quarter is replaced by a new cohort of participants.
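The rotating design described above can be sketched in a few lines of base R. This is only an illustration of the staggering logic. The names `cohort`, `quarter` and `wave` are labels chosen for the example, not lfsclean or LFS variable names, and quarters are indexed by arbitrary integers rather than calendar dates.

```r
# Quarters are indexed 1 to 8; each cohort is labelled by the quarter in which it enters.
# Cohorts labelled 0 or below entered before the first quarter shown.
panel <- expand.grid(cohort = -3:8, quarter = 1:8)

# Wave number = quarters elapsed since entry, plus one
panel$wave <- panel$quarter - panel$cohort + 1L

# A cohort is in the survey only while it is in waves 1 to 5
panel <- panel[panel$wave >= 1 & panel$wave <= 5, ]

# Count the cohorts present in each quarter
cohorts_per_quarter <- table(panel$quarter)
print(cohorts_per_quarter)
```

Under these assumptions every quarter contains five cohorts, one in each of waves 1 to 5, and the cohort in wave 5 drops out the following quarter as a new wave 1 cohort enters.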
Note that there has been a long-running trend in the LFS data towards increasing non-response to the survey, particularly after wave 1. Between 2014 and 2020 the overall response rate in a given quarter declined from around 50% to nearer 40%. During 2020 quarter 2, the first full quarter during which Covid-19 restrictions were in force in the UK, the response rate fell dramatically to below 30%. While some attempt was made to improve response by increasing the wave 1 sample, the overall response rate has continued to decline since then, falling to almost 15% in 2023 quarter 2. Analysis of data from the LFS, particularly from 2020 onwards, should consider the potential impact of these very low response rates, especially when conducting subgroup analysis.
You can install the latest version of lfsclean from GitHub with:
#install.packages("devtools")
#install.packages("getPass")
#install.packages("git2r")
devtools::install_git(
"https://github.com/STAPM/lfsclean.git",
build_vignettes = FALSE
)
Cite this package as:
Morris, D (2023). lfsclean: An R package for cleaning UK Quarterly Labour Force Survey data. version [x.x.x]. University of Sheffield. https://doi.org/10.17605/OSF.IO/AHDNY
Some examples of projects making use of the lfsclean package are:
- Input-Output modelling. Here the package is used to derive full-time equivalent employment by sector and year, which is then used to estimate the impacts of changing demand in different sectors on total employment.