Scanner and web-scraped data offer many advantages: they allow larger sample sizes and may allow National Statistical Institutes (NSIs) to publish indices at higher frequency. However, these data also create new challenges. Scanner data potentially allow many more products to be incorporated in the calculation of price indices, but compilers must then contend with more product entry and exit and with volatile movements in prices and quantities. Rapid product churn means that fixed-base indices can quickly become unrepresentative of consumer spending patterns, while chained indices often report excessive and unrealistic price changes, a phenomenon known as ‘chain drift’.
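Chain drift can be illustrated with a small numerical sketch. The example below is hypothetical: the two products, their prices and quantities are invented for illustration, and the Törnqvist formula is used only as one common superlative index, not as the method assessed in this project. One product goes on sale in the middle period, shoppers stock up, and prices then return exactly to their starting levels. A direct comparison of the first and last periods shows no price change, whereas the chained index does not return to 1.

```python
import math


def expenditure_shares(prices, quantities):
    """Return each product's share of total expenditure in one period."""
    spend = [p * q for p, q in zip(prices, quantities)]
    total = sum(spend)
    return [s / total for s in spend]


def tornqvist(p_base, q_base, p_cur, q_cur):
    """Bilateral Tornqvist price index between a base and a current period."""
    s_base = expenditure_shares(p_base, q_base)
    s_cur = expenditure_shares(p_cur, q_cur)
    log_index = sum(
        0.5 * (sb + sc) * math.log(pc / pb)
        for sb, sc, pb, pc in zip(s_base, s_cur, p_base, p_cur)
    )
    return math.exp(log_index)


# Hypothetical data: product A is discounted in period 1 and shoppers stock up,
# so its sales are unusually low in period 2 when the price returns to normal.
prices = [[1.0, 1.0], [0.5, 1.0], [1.0, 1.0]]
quantities = [[10, 10], [50, 10], [2, 10]]

# Direct index: compare period 2 with period 0 in a single step.
direct = tornqvist(prices[0], quantities[0], prices[2], quantities[2])

# Chained index: multiply the period-on-period links together.
chained = 1.0
for t in range(1, len(prices)):
    chained *= tornqvist(prices[t - 1], quantities[t - 1], prices[t], quantities[t])

print(f"direct index:  {direct:.4f}")
print(f"chained index: {chained:.4f}")
```

In this sketch the chained index falls below 1 even though every price ends where it started, because the sale period carries a much larger expenditure weight on the way down than the post-sale period does on the way back up.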
To address these issues, a wide range of index number methods has been developed. National Statistical Institutes must choose these methods, and the means of implementing them, with care, as they can have very different properties. An appropriate framework is needed for assessing indices and their performance, so that new indices can be incorporated as they are developed. This research project reviewed the literature on available index number methods, their properties, their current use by national statistical agencies, and different means of benchmarking them. It also empirically assessed different approaches and index number methods using UK scanner data.