Alternative data sources such as web-scraped and point-of-sale scanner price datasets are becoming increasingly available to National Statistical Institutes (NSIs). They offer large, timely and granular sources of price data which can be used to calculate consumer price indices more accurately. NSIs around the world are accordingly making ever more use of these data sources. The adoption of these new datasets does however present a number of inherent measurement issues that need to be addressed to facilitate optimal calculation of consumer price inflation. This project investigated and evaluated the different methods for addressing these issues.
The use of scanner and web-scraped data has many advantages, allowing for larger sample sizes and potentially allowing NSIs to publish indices at higher frequency. However, the use of this data also creates new challenges. Scanner data potentially allows many more products to be incorporated in the calculation of price indices, but NSIs must then contend with more product entry and exit and with volatile movements in prices and quantities. Rapid product churn means that fixed-base indices can quickly become unrepresentative of consumer spending patterns, while chained indices often report excessive and unrealistic price changes, a phenomenon known as ‘chain drift’.
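Chain drift can be illustrated with a small numerical sketch. The example below is not taken from the project's data: the two products, their prices and their quantities are hypothetical, and a bilateral Törnqvist index is assumed purely for illustration. Product A goes on sale in period 1, shoppers stock up and then buy little in period 2, and period 3 repeats period 0 exactly. A direct comparison of periods 0 and 3 shows no price change, whereas the chained index does not return to 1.

```python
import math

def tornqvist(p0, q0, p1, q1):
    """Bilateral Törnqvist price index between two periods (matched products)."""
    e0 = [p * q for p, q in zip(p0, q0)]          # expenditures, first period
    e1 = [p * q for p, q in zip(p1, q1)]          # expenditures, second period
    s0 = [e / sum(e0) for e in e0]                # expenditure shares
    s1 = [e / sum(e1) for e in e1]
    log_index = sum(0.5 * (a + b) * math.log(pb / pa)
                    for a, b, pa, pb in zip(s0, s1, p0, p1))
    return math.exp(log_index)

# Hypothetical data for products [A, B] over four periods.
prices = [[1.0, 1.0], [0.5, 1.0], [1.0, 1.0], [1.0, 1.0]]
quantities = [[10, 10], [50, 10], [2, 10], [10, 10]]

# Direct comparison of the last period with the first.
direct = tornqvist(prices[0], quantities[0], prices[3], quantities[3])

# Chained index: multiply the period-on-period links together.
chained = 1.0
for t in range(1, len(prices)):
    chained *= tornqvist(prices[t - 1], quantities[t - 1],
                         prices[t], quantities[t])

print(f"direct index 0->3:  {direct:.3f}")
print(f"chained index 0->3: {chained:.3f}")
```

With these made-up numbers the direct index is 1 while the chained index is roughly 0.89, even though prices and quantities end exactly where they started. The gap arises because the sale price receives a large expenditure weight on the way down and the return to the normal price receives a small one on the way up.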
To address these issues, a wide range of possible index number methods have been developed. National Statistical Institutes must choose these methods, and the means of implementing them, with care as they can have very different properties. An appropriate framework needs to be produced for assessing indices and their performance, so that new indices can be incorporated as they are developed. This research project reviewed the literature on available index number methods, their properties, current use by national statistical agencies and different means of benchmarking them; and it empirically assessed different approaches and index number methods using UK scanner data.
The ONS had previously documented the properties of different multilateral methods using a scoring system that weighted different properties in the light of stakeholder feedback on their importance. We reviewed this framework in light of the literature on multilateral index number methods. A second strand of the work empirically assessed different methods using long-running UK scanner data. This data enabled us to assess the extent of chain drift for different indices, window lengths and splicing methods, for a large number of goods and over a long time period.
We argue that with only two quite reasonable changes to the ONS scoring system, Geary-Khamis loses its top ranking to Caves-Christensen-Diewert-Inklaar (“CCDI” or GEKS-Törnqvist) and GEKS-Fisher, both using the mean splice. The selection of properties and the chosen weights can therefore be contested. We found that the Geary-Khamis method is very sensitive to the extension method, a finding that is consistent with the emerging empirical literature. Empirical evidence also suggests that either the regular mean splice or the mean splice on the published series is to be preferred when splicing indices. Indices calculated using the mean splice produced more stable results across index number methods than alternative splicing methods. Long window lengths, of 25 months or more, are required to significantly reduce the degree of chain drift when splicing. The CCDI index performs well empirically against the benchmarks we considered.
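The following is a minimal sketch of how a CCDI (GEKS-Törnqvist) index and a regular mean splice might be computed. It is not the ONS or project implementation: it assumes a balanced panel of matched products, the function names `tornqvist`, `ccdi` and `mean_splice` are hypothetical, the data are made up, and the splice follows one common formulation in which the new window is linked to the previous one through every overlapping period and the links are averaged geometrically. The mean splice on the published series is a different variant and is not shown.

```python
import math

def tornqvist(p0, q0, p1, q1):
    """Bilateral Törnqvist price index between two periods (matched products)."""
    e0 = [p * q for p, q in zip(p0, q0)]
    e1 = [p * q for p, q in zip(p1, q1)]
    s0 = [e / sum(e0) for e in e0]
    s1 = [e / sum(e1) for e in e1]
    return math.exp(sum(0.5 * (a + b) * math.log(pb / pa)
                        for a, b, pa, pb in zip(s0, s1, p0, p1)))

def ccdi(prices, quantities, window):
    """GEKS-Törnqvist index for each period in `window`, relative to window[0]."""
    def geks(a, b):
        # Geometric mean, over every link period l in the window,
        # of the indirect comparison a -> l -> b.
        logs = [math.log(tornqvist(prices[a], quantities[a], prices[l], quantities[l])
                         * tornqvist(prices[l], quantities[l], prices[b], quantities[b]))
                for l in window]
        return math.exp(sum(logs) / len(logs))
    return {t: geks(window[0], t) for t in window}

def mean_splice(prices, quantities, width):
    """Rolling-window CCDI extended with a regular mean splice (width >= 2)."""
    first = list(range(width))
    initial = ccdi(prices, quantities, first)
    series = [initial[t] for t in first]
    for t in range(width, len(prices)):
        old = ccdi(prices, quantities, list(range(t - width, t)))
        new = ccdi(prices, quantities, list(range(t - width + 1, t + 1)))
        overlap = range(t - width + 1, t)   # periods present in both windows
        # For each overlapping period l: new-window change l -> t divided by
        # old-window change l -> t-1, averaged geometrically.
        log_links = [math.log((new[t] / new[l]) / (old[t - 1] / old[l]))
                     for l in overlap]
        series.append(series[t - 1] * math.exp(sum(log_links) / len(log_links)))
    return series

# Hypothetical data for products [A, B]: a sale in period 1, then a return
# to the period-0 prices and quantities in period 3.
prices = [[1.0, 1.0], [0.5, 1.0], [1.0, 1.0], [1.0, 1.0]]
quantities = [[10, 10], [50, 10], [2, 10], [10, 10]]

full = ccdi(prices, quantities, [0, 1, 2, 3])
print("CCDI over the full window:", [round(full[t], 3) for t in range(4)])
print("mean-spliced, width 3:   ", [round(x, 3) for x in mean_splice(prices, quantities, 3)])
```

Within a single window the CCDI index returns to 1 (up to floating-point rounding) when prices and quantities return to their starting values, which is the sense in which multilateral methods are free of chain drift inside the window. Once the window rolls forward and the series is spliced, that guarantee no longer holds exactly, which is why the choice of splicing method and window length matters.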
This research is aiding ONS in choosing the best index number methods for the use of alternative data sources in CPIH and CPI as part of the consumer prices transformation process. The research has also led ONS to reflect on its consumer prices index number method framework and the way its index number methods will be selected for various data sources in future. ONS plans to integrate these new sources of data into the production of its aggregate measures of consumer price statistics by Q1 2023.
Kevin Fox ‘New index number methods in Consumer Price Statistics’ ESCoE Research Seminar, 10 Feb 2018.