Using Text Data to Improve Industrial Statistics in the UK (ESCoE DP 2022-01)

By Alex Bishop, Juan Mateos-Garcia, George Richardson

We use business website data to explore the limitations of the Standard Industrial
Classification (SIC) taxonomy and develop a prototype for a bottom-up industrial taxonomy
based on semantic similarities between company descriptions. This prototype makes it
possible to decompose uninformative SIC codes into granular industries, to build user-driven
industry groups that might be of interest to policymakers (e.g. ‘green economy’), and to construct
indices of local economic composition that are more strongly associated with local
economic performance than those based on the SIC taxonomy. We consider potential
avenues to combine official and bottom-up taxonomies in order to improve our
understanding of the economy and inform economic policy.
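
To make the idea of grouping companies by the similarity of their descriptions concrete, here is a minimal sketch. It is not the paper's method, which the abstract does not specify: it swaps in plain bag-of-words cosine similarity and a greedy grouping rule as stand-ins. The company descriptions, the function names and the 0.2 threshold are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical company descriptions (illustrative only, not the paper's data).
descriptions = {
    "A": "solar panel installation and renewable energy consulting",
    "B": "renewable energy solar farm developer",
    "C": "artisan bakery selling bread and cakes",
    "D": "bakery and cafe selling fresh bread",
}


def vectorise(text):
    """Turn a description into a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def group_by_similarity(descriptions, threshold=0.2):
    """Greedy grouping: each company joins the first group whose first
    member (the seed) is at least `threshold` similar, else starts a new group.
    The threshold is an assumed value, not one taken from the paper."""
    vectors = {name: vectorise(text) for name, text in descriptions.items()}
    groups = []
    for name, vec in vectors.items():
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


groups = group_by_similarity(descriptions)
print(groups)
```

On this toy input the two energy companies fall into one group and the two bakeries into another, without reference to any predefined industry code. A realistic pipeline would likely use richer text representations and a proper clustering or community-detection step in place of the greedy rule used here.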