Making sense of skills

Summary

There is no publicly available data on the skills that are commonly required in UK job adverts. As a result, there is very little understanding of skill shortages, skill specialities that exist in different regions in the UK or the skills required for given occupations.

Over recent years, we have worked on several projects that are all focussed on improving our understanding of skills. These include building data-driven skills taxonomies, extracting skills, analysing skill shortages, and describing the skill profiles of regions and occupations.

Overview

Despite having well-established taxonomies for defining occupations and industries, the UK does not have an accepted skills taxonomy. Furthermore, many groups, including policy makers, local authorities and career advisers, do not have access to up-to-date information on the latest skills required by employers. This is because there is no publicly available data on the skills mentioned within UK job advertisements. This data gap leaves these groups with an incomplete evidence base on which to build labour market policies, address regional skill shortages or advise job seekers.

To address these gaps we have created two versions of a data-driven skills taxonomy: the first built from proprietary job advert and skills data, and the second from a job advert and skills dataset curated by us. These can be used to spot regional skill clusters, and for rapid assessments of skill changes following shocks such as the COVID-19 pandemic. We have also built an open-source Python library to extract skills from job adverts and match them to a standardised skills taxonomy. This library is incorporated into a fully open demonstrator tool that allows anyone to extract the skills mentioned in a single advert and match them to a standardised skills taxonomy. Finally, by applying our open-source Python library to a sample of 100,000 online job adverts, we have explored differences in regional and occupational skill demands.

Methods

Our skills taxonomies and skill extraction methods have used several machine learning techniques such as word and sentence embeddings, named entity recognition, dimensionality reduction, network community detection algorithms and semantic clustering.

In the first version of our skills taxonomy, we modelled the skills mentioned in job adverts as a network (graph) and then hierarchically grouped the skills in this network into clusters. Our second version of the skills taxonomy involved extracting skill sentences from job adverts using a supervised machine learning classifier, then clustering these skill sentences by their semantic similarity. Different numbers of clusters were used to create three taxonomy levels plus individual skills.
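The graph-based grouping used in the first taxonomy can be illustrated with a minimal sketch. It assumes that two skills are linked when they appear in the same advert, and it uses connected components at increasing edge-weight thresholds as a simple stand-in for the network community detection algorithms actually used. The adverts, function names and thresholds are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy data: each advert is the set of skills it mentions.
adverts = [
    {"python", "sql", "machine learning"},
    {"python", "sql"},
    {"sql", "machine learning"},
    {"caregiving", "first aid"},
    {"caregiving", "first aid", "driving"},
    {"python", "driving"},
]


def cooccurrence_graph(adverts):
    """Count how often each pair of skills appears in the same advert."""
    weights = Counter()
    for skills in adverts:
        for a, b in combinations(sorted(skills), 2):
            weights[(a, b)] += 1
    return weights


def clusters(weights, min_weight):
    """Group skills connected by edges of weight >= min_weight.

    Connected components are a simplified stand-in for community detection.
    """
    neighbours = defaultdict(set)
    for (a, b), weight in weights.items():
        neighbours[a]  # register both nodes even if the edge is too weak
        neighbours[b]
        if weight >= min_weight:
            neighbours[a].add(b)
            neighbours[b].add(a)
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


weights = cooccurrence_graph(adverts)
coarse = clusters(weights, min_weight=1)  # broad level: any co-occurrence
fine = clusters(weights, min_weight=2)    # finer level: repeated co-occurrence
print(coarse)
print(fine)
```

Raising the threshold splits broad clusters into nested, narrower ones, which gives a rough picture of how a hierarchy of skill groups can emerge from a single network.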

In our most recent project, our approach to extracting standardised skills from job adverts has two steps:

  1. The first step is extracting skills using a model that predicts the parts (“entities”) of a job advert that are skills. This model was trained using an existing Named Entity Recognition (NER) neural network architecture.
  2. The second step involves mapping the extracted skill entities from job adverts to an existing taxonomy of skills (we use the European Commission’s European Skills, Competences, Qualifications and Occupations (ESCO) taxonomy and Lightcast’s Open Skills). To do this, we find the semantically closest taxonomy skill to each extracted skill entity; for example, “Excel” might be mapped to the ESCO skill “use spreadsheets software”. Semantic closeness is found by numerically representing all skill entities and taxonomy skills using natural language processing techniques and then calculating the similarity between these numerical representations.
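The mapping step can be sketched as follows. This is an illustration only, not the library's implementation: character trigram counts stand in for the learned embeddings, cosine similarity stands in for the similarity measure, and the `min_similarity` cut-off, the function names and the three-skill taxonomy are hypothetical. A surface-level representation like this could not link “Excel” to “use spreadsheets software”; that kind of match needs genuine semantic embeddings.

```python
import math
from collections import Counter


def embed(text, n=3):
    """Toy numerical representation: counts of character n-grams.

    A stand-in for the sentence embeddings a real pipeline would use.
    """
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def map_to_taxonomy(entity, taxonomy_skills, min_similarity=0.2):
    """Return (closest taxonomy skill, score), or (None, score) if too dissimilar."""
    entity_vec = embed(entity)
    scored = [(cosine(entity_vec, embed(skill)), skill) for skill in taxonomy_skills]
    score, best = max(scored)
    return (best, score) if score >= min_similarity else (None, score)


# Hypothetical miniature taxonomy and extracted skill entities.
taxonomy = ["use spreadsheets software", "communication", "manage budgets"]
for entity in ["spreadsheet skills", "budget management", "forklift"]:
    print(entity, "->", map_to_taxonomy(entity, taxonomy))
```

The cut-off reflects one possible design choice: an extracted entity with no sufficiently close taxonomy skill is left unmapped rather than forced onto a poor match.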

Findings

Our first version of the skills taxonomy provided estimates of the demand for each skill cluster, based on the number of mentions within adverts. Users can search the taxonomy by job title and discover the skills needed for a wide range of jobs. We found that the five clusters containing the most frequently demanded skills are social work and caregiving, general sales, software development, office administration, and driving and automotive maintenance. The five skill clusters with the highest median annual salaries are data engineering, securities trading, IT security operations, IT security standards and mainframe programming.

In the second version of the skills taxonomy we explore two applications: identifying regional skill differences, and examining changes in skill demands following an exogenous shock, namely the COVID-19 pandemic. For the former we found, amongst other differences, that London has a high demand for skills in languages, management, IT and finance. For the latter application we found an increase in demand for health care skills, and a decrease in demand for service industry skills, which is consistent with the multiple lockdowns experienced during the COVID-19 pandemic.

In our most recent project, an evaluation of our skills extraction algorithm found that only 6% of the extracted skill entities were inappropriate to the job advert. Of the 94% that were appropriately extracted, 88% were also judged to be appropriately mapped to ESCO skills. As a result, we are confident in the skills our algorithm extracts and believe its results are competitive with those of other skill extraction algorithms.
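Because the 88% figure applies only to the appropriately extracted entities, the two rates can be combined into a single end-to-end figure. The short calculation below is a reading of the reported numbers rather than a separately published result, and the variable names are illustrative.

```python
# Rates reported in the evaluation.
appropriate_extraction_rate = 0.94  # extracted entities that were appropriate
appropriate_mapping_rate = 0.88     # of those, share also mapped appropriately

# Share of all extracted entities that were both appropriately extracted
# and appropriately mapped to an ESCO skill.
end_to_end_rate = appropriate_extraction_rate * appropriate_mapping_rate
print(f"{end_to_end_rate:.1%}")  # 82.7%
```

On this reading, roughly 83% of all extracted entities were both appropriate to the advert and appropriately mapped.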

In our analysis of a random sample of 100,000 job adverts, posted online between January 2021 and August 2022, we produced several interactive visualisations. One of these shows the most commonly requested ESCO skills; the top three are “communication”, “show positive attitude” and “show organisational abilities”. The visualisations also allow users to pick an occupation and find the occupations with the most similar sets of requested skills. For example, the three occupations most similar to an Account Manager are Business Development, Sales Executive and IT Sales. Another visualisation shows regional skill intensities, which allow one region’s skill demands to be compared with another’s. For example, London has a greater than average demand for “software and applications development and analysis” and “developing financial, business or marketing plans”.
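The text does not state how occupation similarity or regional skill intensity is calculated. As one plausible sketch, the example below assumes that intensity is a region's share of skill mentions divided by the national share, and that occupation similarity is the Jaccard overlap of requested skill sets. All counts, skill sets and function names are hypothetical.

```python
def skill_shares(counts):
    """Convert raw skill mention counts into shares of all mentions."""
    total = sum(counts.values())
    return {skill: n / total for skill, n in counts.items()}


def relative_intensity(region_counts, national_counts):
    """Ratio of regional to national skill share; above 1 means over-represented."""
    region = skill_shares(region_counts)
    national = skill_shares(national_counts)
    return {
        skill: region.get(skill, 0.0) / share
        for skill, share in national.items()
        if share > 0
    }


def jaccard(a, b):
    """Overlap between two skill sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(occupation, skills_by_occupation):
    """Rank the other occupations by skill-set overlap with the chosen one."""
    target = skills_by_occupation[occupation]
    others = [o for o in skills_by_occupation if o != occupation]
    return sorted(others, key=lambda o: jaccard(target, skills_by_occupation[o]), reverse=True)


# Hypothetical mention counts.
national = {"software development": 200, "caregiving": 300, "sales": 500}
london = {"software development": 60, "caregiving": 40, "sales": 100}
intensity = relative_intensity(london, national)

# Hypothetical skill sets per occupation.
occupations = {
    "Account Manager": {"communication", "sales", "negotiation"},
    "Sales Executive": {"communication", "sales", "negotiation", "cold calling"},
    "Data Engineer": {"python", "sql", "communication"},
}
ranking = most_similar("Account Manager", occupations)
print(intensity)
print(ranking)
```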

Impact

Our first version of the skills taxonomy received lots of interest. The project’s data visualisations, viewable here, have been shortlisted for an Information is Beautiful award in the Science and Technology category. The World Economic Forum report on ‘Strategies for the New Economy: Skills as the Currency of the Labour Market’ cited this skills taxonomy as one of the emerging initiatives for creating a skills-based labour market. Our work on occupations is being used by the ONS’s Classifications Team to inform their update of occupation classification codes. Google Digital Garage used our skills taxonomy to build their recently launched Profile Builder.

Our research team continues to receive many queries from stakeholders who wish to share lessons learnt and to explore opportunities for applying both taxonomies and replicating the research.

Both our skills extraction library and the tools we built on top of it are open source for others to use. Our analysis demonstrates how our algorithm can be used to look at commonly requested skills in different occupations and regions. This could help to enrich careers advice and improve understanding of regional skill demand.

Our skills extraction algorithm has several strengths:

  1. It can extract skills that have not been seen before. For example, although the ESCO taxonomy does not contain the programming skill “React”, the model was able to detect from the sentence “You have Vanilla JavaScript skills (including React, Node and TypeScript)” that “React” was a skill, and it also mapped it to the ESCO skill “use scripting programming”.
  2. The library can be adapted to your chosen taxonomy. We have coded the library in such a way that you can map skills to a custom taxonomy if desired.

  3. You can match to different levels of the taxonomy. This can be handy when a job advert mentions a broad skill group (e.g. computer programming) rather than a specific skill (e.g. Python).
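Matching at different taxonomy levels can be pictured with a small sketch. It is not the library's API: the two-level taxonomy, the word-overlap score and the `threshold` value are hypothetical, and a real system would compare embeddings rather than shared words. The idea shown is only the fallback from specific skills to broader skill groups.

```python
# Hypothetical two-level taxonomy: broad skill group -> specific skills.
TAXONOMY = {
    "computer programming": ["python", "javascript"],
    "office administration": ["use spreadsheets software", "manage diaries"],
}


def token_overlap(a, b):
    """Share of words two labels have in common (toy similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def match(entity, threshold=0.5):
    """Return (level, label): try specific skills first, then skill groups."""
    levels = {
        "skill": [s for skills in TAXONOMY.values() for s in skills],
        "group": list(TAXONOMY),
    }
    for level in ("skill", "group"):
        score, label = max((token_overlap(entity, lab), lab) for lab in levels[level])
        if score >= threshold:
            return level, label
    return None, None


print(match("Python"))
print(match("computer programming skills"))
print(match("welding"))
```

Trying the most specific level first keeps precise matches where they exist, while the group-level fallback still captures adverts that only name a broad area.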

Outputs

Data and Code

Skills Extractor Library Code is available here.
Skills Extractor demo tool is available here.
Skills Extractor Library Documentation is available here.

The new skills taxonomy using TextKernel data is available here.

The prototype skills taxonomy is available to access here.

Publications and Presentations

Gallagher, E. ‘Skills in Online Job Adverts’ GEOINNO 2024, SS30. ‘The Library of Babel: Uncovering Knowledge in Unstructured Textual Data to Better Understand the Geography of Innovation 1’, Manchester, 10-12 January 2024

Kerle, I., Gallagher, E. and Vines, J. ‘Building a skills extraction library using NLP tools’ PyData London 2023, London, 2-4 June 2023

Sleeman, C. ‘Extracting drivers of job quality from online employee reviews’ ESCoE Conference on Economic Measurement 2023, Contributed Session F: Job quality and wages, King’s College London, 17-19 May 2023

Gallagher, E., Kerle, I., Sleeman, C. and Vines, J. ‘Extracting skills from online job advertisements’ ESCoE Conference on Economic Measurement 2023, Special Session A: Estimating industry and occupation exposure to emerging automation technologies, King’s College London, 17-19 May 2023

Gallagher, E., Kerle, I. and Sleeman, C. “The Skills Extractor Library” ESCoE Blog, 13 March 2023

Gallagher, E., Kerle, I. and Sleeman, C. “Exploring UK Skills Demand” 9 March 2023

Gallagher, E., Kerle, I., Sleeman, C. and Vines, J. “Extracting Skills from Online Job Advertisements in the Open Jobs Observatory” Economic Statistics Centre of Excellence: The Next Five Years, poster exhibition, 12 December 2022, One Birdcage Walk, London.

Gallagher, E., Kerle, I., Sleeman, C. and Richardson, G. (2022) “A New Approach to Building a Skills Taxonomy” ESCoE Technical Report Series, TR-16

Sleeman, C., Barnett, G., Kanders, K., Kerle, I., Klinger, J., Otubusen, A. and Vines, J. “Open Jobs Observatory” ESCoE Conference on Economic Measurement 2022, University of Strathclyde, Poster Exhibition 26 May 2022

Gallagher, E., Kerle, I., Sleeman, C. and Richardson, G. “Building a skills taxonomy for the UK” ESCoE Blog, 11 May 2022

Gallagher, E. and Kerle, I. “A New Approach to Building a Skills Taxonomy” Human Capital Workshop: Exploring Skills and Education, 22 February 2022 (video at 29:06)

Garasto, S., Djumalieva, J., Kanders, K., Wilcock, R. and Sleeman, C. (2021) “Developing experimental estimates of regional skill demand” ESCoE Discussion Paper Series, ESCoE DP 2021-02

Garasto, S. “Boosting labour market intelligence for local decision-makers” ESCoE Blog, 8 March 2021

Garasto, S. “Boosting labour market intelligence for local decision-makers” Nesta Blog, 8 March 2021

O’Mahony, M., Rosenfeld, D., Sleeman, C., Turrell, A. and Vassilev, G. “Exploring Online Job Vacancy Data” ESCoE Blog, 28 January 2021

Kanders, K. “Mapping career causeways for workers displaced by automation and COVID-19” ESCoE COVID-19 Webinar Series, 10 December 2020

Sleeman, C. “Expanding and Enriching LMI” ESCoE Workshop on Online Job Vacancy Data, 9 Dec 2020

Djumalieva, J., Sleeman, C. and Garasto, S. “Evaluating a New Earnings Indicator. Can we Improve the Timeliness of Existing Statistics on Earnings by Using Salary Information from Online Job Adverts?” ESCoE Conference on Economic Measurement 2020, Contributed Session B: Nowcasting, 16-18 Sep 2020

Sleeman, C. “Lessons Learnt in Analysing Job Adverts” ESCoE Conference on Economic Measurement 2020, Special Session G: Using On-line Vacancy Data for Policy Research, 16-18 Sep 2020

Djumalieva, J., Garasto, S. and Sleeman, C. (2020) “Evaluating a new earnings indicator. Can we improve the timeliness of existing statistics on earnings by using salary information from online job adverts?” ESCoE Discussion Paper Series, ESCoE DP 2020-19

Djumalieva, J. and Sleeman, C. “An open and data-driven taxonomy of skills extracted from online job adverts” Economic Measurement with Big Data Special Session, Royal Economic Society Annual Conference 2019, University of Warwick, 16 April 2019

Djumalieva, J. and Sleeman, C. (2018) “An Open and Data-driven Taxonomy of Skills Extracted from Online Job Adverts” ESCoE Discussion Paper Series, ESCoE DP 2018-13

Djumalieva, J. and Sleeman, C. “The first publicly available data-driven skills taxonomy for the UK” ESCoE Blog, 22 August 2018

Lima, A. and Bakhshi, H. (2018) “Classifying Occupations Using Web-Based Job Advertisements: an Application to STEM and Creative Occupations” ESCoE Discussion Paper Series, ESCoE DP 2018-08

Lima, A. and Bakhshi, H. “Classifying STEM and Creative Occupations Using Online Job Ads” ESCoE Blog, 2 July 2018

Djumalieva, J., Lima, A. and Sleeman, C. (2018) “Classifying Occupations According to their Skill Requirements in Job Advertisements” ESCoE Discussion Paper Series, ESCoE DP 2018-04

Sleeman, C. and Djumalieva, J. “Linking skills to occupations – Using big data to build a new occupational taxonomy for the UK” ESCoE Blog, 29 March 2018

Sleeman, C. “The UK Needs a Skills Map” ESCoE Blog, 15 September 2017

People

Jyldyz Djumalieva

Stef Garasto

George Richardson
