Linking skills to occupations – Using big data to build a new occupational taxonomy for the UK

By Dr Cath Sleeman and Jyl Djumalieva

In our ESCoE discussion paper, published this week, we develop the first data-driven, skills-based taxonomy, or classification, of UK occupations. By linking skills to occupations, we hope the taxonomy will be of direct use to policymakers and employers.

The skills we need for work are changing

From automation to climate change, and from globalisation to our ageing population, a myriad of factors are changing the nature of work in the UK. These factors mean that the majority of workers are in occupations with highly uncertain futures (Bakhshi, Downing, Osborne and Schneider, 2017).

Amidst this changing landscape, policymakers, educators, businesses and individuals need timely information on how occupations are changing and how they can help workers to transition out of at-risk occupations where skills are becoming redundant. To generate these insights we need a framework that links skills to occupations.

The current occupational taxonomy does not map cleanly to skills

The main grouping (or hierarchical taxonomy) of occupations in the UK is called the Standard Occupational Classification (SOC). The SOC taxonomy assigns jobs to 9 major occupation groups, and then repeatedly splits each group three further times. The fourth layer contains 369 different occupation groups.

While the stability of SOC makes it ideal for reporting labour market statistics, the grouping is not particularly well suited to understanding skills. That’s because the initial split of jobs into 9 major groups is based on differences in education and training levels (i.e. skill level), rather than on differences in the types of skills that these jobs require (i.e. skill specialisation).

The initial emphasis of SOC on skill level means that jobs that require very similar skills can appear in completely different major groups. In turn it can be difficult to map skill domains onto SOC and to understand how changes in skill demands are affecting occupations. The structure of SOC also means that workers may jump between different major groups over their careers as their skill levels rise.

Expert-curated taxonomies can be slow to adapt

Most taxonomies of occupations, like SOC (for the UK), ESCO (Europe) and O*NET (USA), are created through a process of consultation with experts. Keeping these taxonomies up to date can be resource-intensive and, as a result, they are often only updated periodically. At present, SOC is revised once every 10 years. Over such a period, the landscape for some occupations may change significantly, as it did for IT professionals between 2000 and 2010, when new occupations had to be added to the UK SOC. We need a more timely way of capturing information on occupational dynamics.

Big data, in the form of job adverts, can help

Online job adverts, and the skills mentioned within these, can help us to develop an alternative taxonomy of occupations. Adverts provide detailed information on the skills required in different jobs. We can then group jobs into occupations that require similar skills.
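To make the idea of grouping jobs by skill concrete, here is a minimal sketch. It is not the method used in our paper: it swaps in a much simpler technique (Jaccard overlap between skill sets, followed by single-linkage grouping), and the adverts, the function names and the 0.5 threshold are all made up for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of skills two adverts have in common (0 = none, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_adverts(adverts, threshold=0.5):
    """Group adverts whose skill sets overlap by at least `threshold`.

    Adverts are linked pairwise, and every connected set of links becomes
    one group (single-linkage grouping, tracked with a union-find structure).
    """
    parent = {title: title for title in adverts}

    def find(x):
        # Follow parent pointers to the group's representative,
        # shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(adverts, 2):
        if jaccard(adverts[a], adverts[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for title in adverts:
        groups.setdefault(find(title), set()).add(title)
    # Sort for a stable, readable result.
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical adverts: job title -> skills mentioned in the advert.
ADVERTS = {
    "Data analyst": {"sql", "excel", "statistics"},
    "BI analyst": {"sql", "excel", "dashboards"},
    "Staff nurse": {"patient care", "medication"},
    "Care assistant": {"patient care", "medication", "first aid"},
}

for group in group_adverts(ADVERTS):
    print(sorted(group))
```

With these toy adverts, the two analyst roles end up in one group and the two care roles in another, even though none of the job titles match. Real adverts are far messier, which is why richer text methods are needed in practice.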

The strength of job adverts is that they provide a near real-time source of information on skills. Compared with skill surveys, the skills in job adverts are also typically more granular, because adverts give employers the freedom to describe their skill needs directly. That said, adverts do have limitations, such as imperfect representativeness of the underlying occupations and a bias towards high-skilled professional occupations.

An emphasis on openness

We’re not the first to investigate the potential of online vacancy data. But to date, efforts have been concentrated largely in the private sector, by the likes of labour analytics companies, job search engines and recruitment agencies. While their research provides useful insights on methodology, the resulting occupational classifications remain proprietary. We are committed to sharing our methodology and, once finalised, the resulting taxonomy will also be shared publicly along with the algorithm used to generate it.

Our new data-driven taxonomy of UK occupations

The taxonomy that we have built is based on 37 million UK online job adverts provided by Burning Glass Technologies. To cluster the job adverts into groups we used a range of machine learning methods such as document clustering and word embeddings.

Like SOC, our taxonomy contains four hierarchical layers. But unlike SOC, our first three layers group jobs that require similar types of skills. This allows us to automatically recommend occupations to individuals based on their skill capabilities. The fourth layer of the hierarchy distinguishes between jobs based on the offered salary and indicates skill level. Incorporating skill level allows us to measure an individual’s career progression within the same skill domain.
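As a rough illustration of how skill-based groups could support recommendations, the sketch below ranks occupation groups by the share of their required skills that a person already has. The occupation groups, the skill lists and the `recommend` function are hypothetical; this is an assumption about one simple way to do it, not the recommendation approach of the taxonomy itself.

```python
def recommend(person_skills, occupations, top_n=2):
    """Rank occupation groups by the share of required skills a person holds."""
    scored = []
    for name, required in occupations.items():
        if not required:
            continue  # skip groups with no recorded skills
        coverage = len(person_skills & required) / len(required)
        scored.append((coverage, name))
    # Highest coverage first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for coverage, name in scored[:top_n] if coverage > 0]


# Hypothetical skill-based occupation groups.
OCCUPATIONS = {
    "Data analysis": {"sql", "excel", "statistics"},
    "Web development": {"javascript", "html", "css"},
    "Nursing": {"patient care", "medication"},
}

print(recommend({"sql", "statistics", "html"}, OCCUPATIONS))
```

Here the person covers two of the three data analysis skills and one of the three web development skills, so those two groups are suggested in that order, and nursing is left out.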

Using the taxonomy

Over the next six months we’ll be working to show how the taxonomy, which links occupations to skills, can be applied to learn more about skill needs in the UK. As one example, we’ll be showing how the methodology can help us to identify new sets of skills and new occupations.

At the same time we’re also building a skills taxonomy, based on skill co-occurrence in job adverts. The UK doesn’t currently have a taxonomy of skills, and the new skills taxonomy could be used to produce timely information on the demand for, and the return on (i.e. salary), different groups of skills. These insights can then be used by policymakers to prioritise investment in skill development.
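Skill co-occurrence simply means counting how often two skills are mentioned in the same advert. A minimal sketch of that counting step is below; the adverts and the `cooccurrence` function name are made up, and turning the counts into a taxonomy would need further steps that are not shown here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(adverts):
    """Count how many adverts mention each pair of skills together."""
    counts = Counter()
    for skills in adverts:
        # De-duplicate and sort so each pair is counted once per advert
        # and always appears in the same order.
        for pair in combinations(sorted(set(skills)), 2):
            counts[pair] += 1
    return counts


# Hypothetical adverts, each reduced to the list of skills it mentions.
ADVERT_SKILLS = [
    ["python", "sql", "statistics"],
    ["python", "sql"],
    ["excel", "sql"],
    ["python", "statistics"],
]

for pair, n in sorted(cooccurrence(ADVERT_SKILLS).items()):
    print(pair, n)
```

Pairs that appear together often, such as python and sql in this toy data, are candidates for sitting in the same group of skills.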

More broadly, we hope our work shows how naturally occurring big data, such as online job adverts, can be used to build a smarter labour market.

Dr Cath Sleeman is the Quantitative Research Fellow at Nesta. Jyl Djumalieva is a Data Science Research Fellow at Nesta.

Thursday, March 29, 2018
