Exploring Online Job Vacancy Data

Edited by Mary O’Mahony, David Rosenfeld, Cath Sleeman, Arthur Turrell and Gueorguie Vassilev

In partnership with the Department for Business, Energy and Industrial Strategy (BEIS), ESCoE held a virtual workshop on online job vacancy data on 9 December 2020.

This workshop brought together representatives from the growing community of those who are using online vacancy data for economic measurement (academics, ONS, government and practitioners) and those who provide the data. The workshop aimed to share knowledge and experience around using online job vacancy data for economic measurement and analysis; to identify common issues and opportunities for collaboration; and to understand current policy needs. The day included four sessions. The first three each featured a series of presentations followed by a discussion. In the final session, a panel discussion brought together the different strands from earlier in the day and explored the emerging opportunities and challenges for working with this kind of data.

This blog outlines the content of each of the sessions, and, together with the recordings and slides made available on the ESCoE website here, is intended to serve as a resource for those wishing to engage with online job vacancy data going forward.

Session 1. Understanding and applying vacancy data

James Neave (Adzuna) discussed the wealth of insights that can be extracted from job adverts. These insights can be used to evidence issues ranging from working patterns to regional development. James noted that particular value lies in the descriptions of roles within adverts. By using natural language processing, the descriptions can provide insights on areas such as company culture, educational requirements and gender bias.

Elodie Andrieu (King’s College London) discussed a new project that is exploring whether technological readiness played a role in determining the resilience of companies to the initial impact of COVID-19. Elodie and co-authors are combining job adverts from Burning Glass with business activity data and novel indicators such as the prevalence of cloud computing. Perhaps surprisingly, initial results showed that, on average, small firms increased the number of adverts that they posted after the pandemic began. Elodie explained that this result is likely driven by the pandemic forcing already-struggling firms out of business, which in turn caused them to drop out of the dataset.

Pawel Adrjan (Indeed) demonstrated the potential of adverts to provide timely and granular insights into labour market developments. Based on Indeed’s own data, the number of job postings is still around 40% lower than would be expected at this time of the year. Pawel also showed the large differences between sectors, both in pay trends and in the number of clicks that an advert receives. For example, medical technicians have experienced a 12% increase in average advertised pay (year-on-year), while those in food preparation and service have experienced a 7% decline.

Grant Telfer and Bauke Visser (Textkernel) demonstrated their Skills Explorer tool. This is a 3D interactive data visualisation that shows the relationship between skills. Grant and Bauke have measured skill similarity by using the co-occurrence of skills within job adverts. At a time when many displaced workers are looking to enter new sectors, tools such as this could help to show job seekers the transferability of their skillset.
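To make the idea of co-occurrence-based skill similarity concrete, here is a minimal sketch in Python. It is not Textkernel’s method, which was not described in detail. It assumes that each advert has already been reduced to a set of skill names, and it uses Jaccard similarity as one simple, commonly used choice of measure. The function name and the example adverts are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def skill_similarity(adverts):
    """Return the Jaccard similarity for every pair of skills that co-occur.

    `adverts` is an iterable of skill collections, one per job advert.
    Jaccard similarity is an assumption made for this sketch: the number of
    adverts mentioning both skills divided by the number mentioning either.
    """
    skill_counts = Counter()  # adverts mentioning each skill
    pair_counts = Counter()   # adverts mentioning both skills in a pair
    for skills in adverts:
        unique = sorted(set(skills))  # sort so each pair has one canonical order
        skill_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return {
        (a, b): both / (skill_counts[a] + skill_counts[b] - both)
        for (a, b), both in pair_counts.items()
    }


# Made-up adverts, for illustration only.
adverts = [
    {"python", "sql", "statistics"},
    {"python", "sql"},
    {"customer service", "sales"},
    {"sql", "statistics"},
]
similarity = skill_similarity(adverts)
```

Pairs of skills that never appear in the same advert are simply absent from the result, which is one way of treating them as unrelated. A real application would also need to standardise skill names before counting.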

View video recording here.

Session 2. Job vacancy data for the public good

Cath Sleeman (Nesta) told us about plans to create an ‘Open Jobs Observatory’, in partnership with the Department for Education, to deliver open, up-to-date and actionable insights on UK skill demands. The code to clean and extract insights from the adverts will be open-sourced. Cath also presented Nesta’s latest report – Mapping Career Causeways – in which the team used machine learning to create a map that captures the similarities between over 1,600 jobs, based on the skills and work activities that make up each role. The map can be used to inform careers advice by showing how a worker’s skills and experiences can be transferred to another role. The code for this project has recently been released and is available here.

David Rosenfeld (BEIS) told us about how BEIS used job vacancy data to monitor the labour market in near real-time as coronavirus spread through the UK. Interestingly, he noted that the extent of the recovery in job vacancies suggested by Burning Glass data differed from estimates from other sources such as Indeed and Adzuna. David’s team had also examined the extent to which phrases related to remote working had become more common in job vacancies: up 5 percentage points since the start of the March 2020 lockdown. In another study, they used job vacancy data to track mentions of a large-scale nuclear plant, Hinkley Point C, as a proxy for direct and indirect job creation from the project. The marginal skill requirement in these jobs was for nuclear energy expertise, but there was a range of ‘spillover’ vacancies that appeared across the country. In a third project, David and his colleagues tried to track the growth of new technologies, such as cloud computing and deep learning, through vacancies, and were able to benchmark results for AI skills by searching for these skills in the readme files of repositories on GitHub (a publicly accessible website where programmers can lodge information about their projects).
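As an illustration of how a change in the prevalence of remote-working phrases might be measured, the following Python sketch computes the share of advert descriptions that mention such a phrase in two periods and takes the difference in percentage points. It is not the BEIS team’s code: the phrase list, the function name and the example adverts are all assumptions made for this sketch.

```python
import re

# Hypothetical phrase list; a real analysis would need a carefully chosen one.
REMOTE_PATTERN = re.compile(
    r"\b(remote working|work(ing)? from home|home[- ]based)\b",
    re.IGNORECASE,
)


def remote_share(descriptions):
    """Percentage of advert descriptions mentioning a remote-working phrase."""
    descriptions = list(descriptions)
    if not descriptions:
        return 0.0
    hits = sum(1 for text in descriptions if REMOTE_PATTERN.search(text))
    return 100.0 * hits / len(descriptions)


# Made-up advert descriptions, for illustration only.
before = [
    "Chef needed for busy kitchen",
    "Analyst role, remote working possible",
    "Warehouse operative, night shifts",
    "Receptionist for city centre practice",
]
after = [
    "Software developer. Working from home available",
    "Home-based customer adviser",
    "Delivery driver, full licence required",
    "Care assistant, weekend shifts",
]

# Change in the share of adverts, in percentage points.
change_pp = remote_share(after) - remote_share(before)
```

Reporting the change in percentage points, rather than as a percentage change, keeps the figure easy to interpret when the starting share is small.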

Khloe Evans (ONS) set out the ONS’ ambitions for online job vacancy data, building on their current releases of Adzuna data. She explained that they want to establish consistent cross-government methods to address the well-known bias and coverage issues of job vacancy data; to develop coherent methodologies for processing job ad data into useful statistics; and to facilitate the linking of skills demand to other datasets that cover the labour market.

Lea Samek (OECD) presented on her and her colleagues’ project that measures AI jobs and AI-related skills using Burning Glass data. She told us that they had undertaken work on the representativeness of Burning Glass data and recommended re-weighting it for some applications. Lea showed us how AI jobs had diffused around the UK, beginning in London and Cambridge and more recently spreading to other urban areas too. Through judicious use of network diagrams, she explained that neural networks were the topic most central to AI skills, according to job ad data.

View video recording here.

Session 3. Job vacancy data for academic use

Sally-Anne Barnes and Jeisson Cárdenas-Rubio (Institute for Employment Research, University of Warwick) presented on an initiative being developed at the IER to construct labour market information based on online vacancy data, as part of a wider project on creating an online data portal which brings together high quality data – LMI for All. They are currently web-scraping various job portals, such as Reed and jobs.ac.uk. Jeisson discussed their approach and the issues arising at occupation, industry and regional levels, and in defining skills.

Dafni Papoutsaki (Institute for Employment Studies) discussed their use of Adzuna data to derive accurate and timely data for policy in dealing with the consequences of the COVID-19 pandemic. The presentation focused on the time pattern of flows and stocks of vacancies and on regional disparities. This showed significant variation in the impact of the crisis on labour markets across local authorities. They also discussed the need for systematic approaches to retrieve good quality information on job vacancies, taking account of occupation, earnings and types of contracts and how these relate to who posted the advert.

Bledi Taska (Burning Glass Technologies) presented on job opportunities in the UK during the crisis, using their online platform data. The focus was on skill adjacency of jobs based on tasks performed and overlap in terms of education, experience, and skills required. They looked at possible pathways from jobs whose demand has declined to those experiencing growth and issues related to longer term career mobility. This analysis provided useful insights for policy on re-skilling of the workforce.

Matthias Qian (University of Oxford) presented on work being undertaken at Oxford on flexible job arrangements. The focus of his presentation was on low-wage jobs, using job vacancy data from Burning Glass Technologies. He argued that machine learning techniques such as text analysis provide important new sources of information beyond those provided by traditional surveys. The analysis showed that flexible-hours jobs were increasing in the UK and presented evidence that this was due to employers’ desire to reduce labour costs for low-skilled workers. The presentation also showed a post-COVID-19 surge in advertisements for flexible jobs for low-paid workers.

View video recording here.

Session 4. Job vacancy data and policy needs: Looking at the future

Chaired by Gueorguie Vassilev (ONS), the panel included Dan Mawson (BEIS), Mariagrazia Squicciarini (OECD) and David Freeman (ONS). Gueorguie opened the discussion by asking the panel what opportunities online job vacancy data offer, who benefits from using these data, and what obstacles may exist to using them.

Opportunities

Three key opportunities emerged from the discussion: timeliness, granularity and richness. The speed with which users can access these data is exceptional. A criticism of traditional job vacancy survey data, and indeed much other survey data, is that it can take a long time to be made available to end users. Dan Mawson described online job vacancy data as being almost incomparably fast compared to other mainstream economic statistics that we work with. That speed has been especially useful this past year, with the challenges of the COVID-19 pandemic meaning that governments need to know about emerging trends and what is happening on a day-by-day, week-by-week basis.

These data are not just fast, but also extremely detailed. They are available almost in real time, at specific points in time, and with a great deal of granularity. There is a wealth of detail that is very difficult to obtain through surveys such as the monthly ONS Vacancy Survey or other survey-based statistics. The significantly improved granularity can allow us to look into the individual vacancies themselves, post by post. Although the ONS Vacancy Survey can provide quite a good indication of the total number of vacancies, it only allows users to segment that data by industry and size of business. Online job vacancy data can be segmented by many more parameters, such as geography, occupation, skills level, sector and salary band. As such, this type of data, with its inherent flexibility, enables users to reach easily into an extensive range of associated topics for cross-cutting analysis, including, but not limited to, job vacancies.

For Mariagrazia, the ‘goldmine’ in these data is in the linkages to other datasets. Bringing new methodologies to bear, we can make the best use of online job vacancy data by linking to data such as those from firms, international property rights, technological development and trade to better understand global value chains, for example. Doing so can help us look in a much deeper way at the nature of jobs, skills and labour market dynamics. We can ascertain which skills employers are demanding, as well as the types of jobs and tasks they are matched with. The demand side of the labour market is not a strength of other statistical coverage. It’s rare to have information on what firms really want and need at a point in time. Typically, this is very firm-specific, specific to trade conditions, and so more granularity is key. We know for example that human capital is a necessary complement to any form of technological deployment. These data can help us to understand whether occupational profiles are changing with that deployment.

Beneficiaries

The panel was unanimous in their identification of a wide range of beneficiaries of online job vacancy data. The data has relevance and applications across all government, whether at local, regional or national levels, and across different departments. Referencing the ONS Vacancy Survey, David highlighted that, as one of the few leading indicators published on a monthly basis showing changes in demand in the labour market, job vacancy data are always in demand. The online job vacancy data provides policy makers with the opportunity to readily examine the labour market at a more local level. This information can then help better determine what policy interventions are needed, improving a range of metrics from service delivery outcomes to the targeting of resources and cost effectiveness.

Beyond governments and policy makers, further key identified beneficiaries include companies, education providers and labour market analysts. Companies are able to utilise the data to quickly and easily identify the latest trends in the jobs market. They can see what jobs are being offered by their market competitors, identify new and emerging skills, and determine whether the overall profiles of the employees they are seeking are changing over time, as well as understand what kind of talent they can tap into. Vacancies are a literal sign of what firms are doing and what employees want, or at least what firms think employees want. For education providers, especially those with a focus on their local geographic area, understanding what the demand is for skills locally means that they are better able to tailor their courses to meet that demand. Students then benefit from training in the skills that they need to secure employment where they live. Dan identified online job vacancy data as having particular value for labour market analysts because it ‘makes us think differently about the world’. It encourages and requires users to learn new analytical techniques to obtain new understanding and it forces us to think more about how labour markets work. In fact, there are benefits to policy makers beyond labour market and skills demand.

Obstacles

A number of obstacles or challenges to using these data exist, but our panel were of the opinion that these are not insurmountable. Key to dealing with such obstacles is paying careful attention to how we perceive and use the data. Mariagrazia emphasised particularly strongly what is perhaps one of the most critical of these obstacles, representativeness. Despite the opportunities outlined above, at least at this moment, online job vacancy data cannot substitute for the ‘conventional’ job vacancy data collected by statistical offices because there are some job categories that are not currently advertised online, so the online data should instead be seen as a complement to conventional data. The scope of the online job vacancy data is consequently narrowed and there is not the level of representativeness that many users might want to have. We have to be very aware of what these data can tell us and what they cannot. A frustration for Mariagrazia is ‘not knowing how the story ends’, and the difficulty of matching the job adverts with employees. Job adverts do not tell us how many people applied for a position, or who was hired. For users investigating salaries for instance, we don’t know if what is being offered in the job advert corresponds with what is actually being paid to workers. We might need to pay more attention to perks and benefits beyond the salary figure, as Adzuna alluded to in their talk. Gaps in data sources make matching difficult.

A key question, then, is how to deal with this representativeness problem. Echoing the sentiments of the other panellists, Dan stated that the real long-term value of online job vacancy data lies in triangulating it with other data to see how it all knits together. Unlike VAT data, for example, there is no single definitive source for job vacancy data, and we need to use a combination of data sources to get the fullest, most accurate picture. Even within the online job vacancy data context, there is no one definitive data source. A number of key data hubs exist: Adzuna, Indeed and Burning Glass Technologies. Which one is giving you the ‘right’ answer? When we have different versions of the ‘truth’, with differing gaps (e.g. in coverage), how do we link this type of data into national statistics in the long term so that it becomes a core data repository? A previously considered suggestion is to weight the microdata to national sources. David also noted that representativeness is not always a problem. At the macroeconomic level, there is a bigger onus on comparability with other national sources, while at skill level, this is less problematic. As Mariagrazia said, “free text is not a fantastic thing”: it makes the extraction of some insights problematic, but does not necessarily cause representativeness issues.
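The suggestion of weighting the microdata to national sources can be illustrated with a short Python sketch. It shows simple post-stratification by industry, which is only one possible approach and not a method endorsed by the panel: each online advert receives a weight equal to the official vacancy total for its industry divided by the number of online adverts in that industry. The industry labels, the function name and all of the figures are made up for illustration.

```python
from collections import Counter


def poststratification_weights(advert_groups, official_totals):
    """Return one weight per group: official total / number of online adverts.

    `advert_groups` lists the group (here, industry) of each online advert.
    `official_totals` maps each group to a benchmark vacancy count, for
    example from a national survey. Every group seen in the adverts is
    assumed to appear in `official_totals`.
    """
    online_counts = Counter(advert_groups)
    return {
        group: official_totals[group] / count
        for group, count in online_counts.items()
    }


# Made-up data: ten online adverts and hypothetical official totals.
advert_groups = ["IT"] * 6 + ["Hospitality"] * 2 + ["Construction"] * 2
official_totals = {"IT": 300, "Hospitality": 400, "Construction": 300}

weights = poststratification_weights(advert_groups, official_totals)

# After weighting, the adverts sum to the official total across all groups.
weighted_total = sum(weights[group] for group in advert_groups)
```

In this made-up example, IT adverts are over-represented online and so receive the smallest weight. Weighting of this kind cannot help with job categories that are never advertised online, because there are no adverts to weight.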

Adzuna, Indeed and Burning Glass Technologies are private companies. This poses certain issues for government users. Firms selling job vacancy data may want to return a profit. If costs are involved with accessing data, can public bodies sustain those costs? Additionally, there is vulnerability to the changes in the market value of the data, and in the underlying data that users do not have control over. Costs are not limited to merely accessing the data. For all users, public and private, there are additional costs for suitable IT infrastructure. In order to process and analyse the data, especially across countries, significant data infrastructure and support is required. Human capital is of course also needed, and, despite the profession growing, there is currently a shortage of data scientists.

The panel also considered whether such commercial data have impacts on official sources. David suggested that the commercial data could make us think about the coverage of certain industries and could be used for cross-validating developments during the pandemic. One needs to consider what the data is capturing during the pandemic: for example, if it is the same companies asking for different skills, that has quite different implications from the case where it is new companies that are becoming more digital. The data could also signal a more fundamental shift, such as a change in the types of firms.

View video recording here.

(Please note that the content of this blog represents the views of the editors and not necessarily the views of their institutions.)

Mary O’Mahony is Professor of Applied Economics at King’s Business School, King’s College London and a Research Associate at ESCoE.
David Rosenfeld is an Economic Adviser at the Department for Business, Energy and Industrial Strategy (BEIS).
Cath Sleeman is Head of Data Visualisation at Nesta and a Research Associate at ESCoE.
Arthur Turrell is a Senior Research Economist at the Bank of England.
Gueorguie Vassilev is Head of Economic Well-being at the Office for National Statistics (ONS).

ESCoE blogs are published to further debate. Any views expressed are solely those of the author(s) and so cannot be taken to represent those of the ESCoE, its partner institutions or the Office for National Statistics.

