08/04/2021 | Press release | Distributed by Public on 08/04/2021 02:11
Prepared by Dominik Hirschbühl, Luca Onorante and Lorena Saiz
Policymakers take decisions in real time based on incomplete information about current economic conditions. Central banks and economic analysts largely rely on official statistics, together with soft data and surveys, to assess the state of the economy. Although a wide range of high-quality conventional data is available, the datasets are released with lags ranging from a few days or weeks to several months after the reference period. For these reasons, central banks have been looking at ways to exploit timelier data and employ more sophisticated methods to enhance accuracy when forecasting metrics that are relevant for policymaking.
Over recent years, policy institutions have started to explore new sources of data and alternative statistical methods for the real-time assessment of economic activity. Since the financial crisis, they have stepped up their efforts to systematically use micro and survey data to better gauge changes in aggregate consumption, investment and the labour market. In parallel, technological advances have allowed users to start examining unconventional sources such as text data and images from newspaper articles, social media and the internet together with numerical data from payments. Also now available are alternative statistical methods such as regression trees, neural networks and support-vector machines that may help to fully exploit the potential insights offered by these data sources.
The coronavirus (COVID-19) pandemic has accelerated this trend. The crisis associated with the pandemic has shown that 'big data' can provide timely signals on the state of the economy and help to track economic activity alongside more traditional data. Big data are commonly characterised as having three Vs: high volume, high velocity and high variety.
This article reviews how policy institutions - international organisations and central banks - use big data and/or machine learning methods to analyse the business cycle. Specifically, these new data sources and tools are used to improve nowcasting and short-term forecasting of real GDP. They are also employed to gain useful insights for assessing cyclical developments and building narratives. A number of illustrative examples are provided.
The article is organised as follows. Section 2 reviews the main sources of big data that central banks and other policy institutions have been exploring for business cycle analysis over recent years. It provides an overview of existing literature and also includes two examples of how big data have been used to monitor economic activity and labour market developments during the pandemic. Section 3 discusses the main advantages of ML methods in dealing with big data and analysing the business cycle. This section includes two examples using newspaper articles to build measures of economic sentiment and economic policy uncertainty. Section 4 presents the main conclusions and discusses opportunities and challenges faced by central banks when using machine learning and big data.
2 How do big data help to gauge the current state of the economy?
Policy institutions have recently started to incorporate structured and unstructured big data in their economic analysis. Big data can be structured - such as those collected in large financial datasets that can be matched to firm-level financial statements - or unstructured. Unstructured data range from large and near-real-time data gleaned from the internet (e.g. internet search volumes, data from social networks such as Twitter and Facebook, newspaper articles) to large-volume data obtained from non-official sources (e.g. trading platforms and payment systems or GPS-based technologies).
Structured data, such as those from financial and payment transactions, can provide critical real-time information for assessing aggregate consumption and economic activity. As the use of credit and debit cards to purchase goods and services has increased, the underlying financial transaction data have provided useful information to track consumption and economic activity. At the same time, payments data are available promptly and subject to few revisions since they are financial records. Central banks had already started to regard these data as a valuable source of information before the pandemic emerged. Analysis based on data for the Netherlands, Norway, Portugal and Spain, among others, finds that retail payment systems data (i.e. credit and debit card payments at the point of sale and ATM withdrawals) helped forecast retail sales, private consumption (especially of non-durables) and even real GDP in the previous expansionary phase. For Italy, some gains in forecast accuracy have been reported when information from highly aggregated but large value payments (i.e. TARGET2) has been included in GDP nowcasting models.
Turning to unstructured big data, the use of text data from newspapers to understand and forecast the business cycle has increased significantly in recent years. In business cycle analysis, text data from newspapers and social media have been used to construct proxy measures for unobservable variables such as 'sentiment' or 'uncertainty' which are likely to be associated with macroeconomic fluctuations. These proxies can be obtained at relatively low cost (in contrast to expensive survey-based measures) and on a timely basis (e.g. daily) by means of automated natural language processing methods. For instance, news-based sentiment indicators can serve as early warning indicators of financial crises. Newspaper-based sentiment and economic policy uncertainty indexes for Italy and Spain have proved helpful in monitoring economic activity in real time and nowcasting GDP. Similarly, in Belgium daily average economic media news sentiment is found to be useful for nowcasting survey-based consumer confidence. At the ECB, newspaper-based daily sentiment indicators have been estimated for the four largest euro area countries and the euro area as a whole. These indicators demonstrate a high correlation with survey-based sentiment indicators and real GDP; they are also found to be useful for nowcasting GDP, particularly at the beginning of the quarter when other more traditional indicators (e.g. surveys) referring to the current quarter have not been released yet (see Box 3 in Section 3). In addition, economic policy uncertainty indexes have been estimated for the same set of countries. The ML methods employed also allow uncertainty to be decomposed into sub-components that point towards the main sources (see Box 4 in Section 3).
Similarly, the use of internet searches has also started to feature in short-term forecasting models. Several Eurosystem studies show that internet searches can provide information about future consumption decisions. Recent examples include analysis linking Google search data to euro area car sales, the use of Google search data to enhance a German GDP nowcasting model and the analysis exploiting synthetic indicators based on Google searches for forecasting private consumption in Spain. For the euro area as a whole, Google data provide useful information for GDP nowcasting when macroeconomic information is lacking (i.e. in the first four weeks of the quarter), but as soon as official data relating to the current quarter become available, their relative nowcasting power diminishes.
Internet-based data can also help when assessing the tightness of the labour and housing markets. Analysis for the US labour market shows that including Google-based job-search indicators improves the accuracy of unemployment forecasts, particularly over the medium-term horizon (i.e. three to 12 months ahead). In the euro area, a measure of labour market tightness based on the number of clicks on job postings has recently been built for the Irish economy. For the housing market, analysis for Italy found that metrics based on web-scraped data from an online portal for real estate services can be a leading indicator of housing prices. During the pandemic, Google searches on topics related to job retention schemes and layoffs provided early insight into the strong impact of the pandemic and related policy measures. Moreover, online data on job postings and hiring in the euro area have complemented official statistics (see Box 1).
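As a stylised illustration of how such search indicators enter a forecasting model, the sketch below augments a simple autoregressive unemployment equation with a hypothetical job-search volume index. All data are synthetic and the coefficients are invented; this is not the specification used in the studies cited above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic monthly data: an unemployment series and a hypothetical
# job-search volume index that leads unemployment by one month.
T = 120
search = rng.normal(0.0, 1.0, T)
unemp = np.zeros(T)
for t in range(1, T):
    unemp[t] = 0.8 * unemp[t - 1] + 0.5 * search[t - 1] + rng.normal(0.0, 0.2)

# Benchmark AR(1) versus an AR(1) augmented with the lagged search
# indicator, both fitted by ordinary least squares.
y = unemp[1:]
X_ar = np.column_stack([np.ones(T - 1), unemp[:-1]])
X_aug = np.column_stack([np.ones(T - 1), unemp[:-1], search[:-1]])

beta_ar, *_ = np.linalg.lstsq(X_ar, y, rcond=None)
beta_aug, *_ = np.linalg.lstsq(X_aug, y, rcond=None)

rmse_ar = np.sqrt(np.mean((y - X_ar @ beta_ar) ** 2))
rmse_aug = np.sqrt(np.mean((y - X_aug @ beta_aug) ** 2))
```

Because the search series genuinely leads unemployment in this simulated setting, the augmented model fits better; in practice any such gain must be confirmed out of sample.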
Monitoring labour market developments during the pandemic
Prepared by Vasco Botelho and Agostino Consolo
This box shows how high-frequency data on hiring was helpful for monitoring labour market developments in the euro area during the pandemic. The COVID-19 crisis had a large downward impact on the number of hires in the euro area labour market. Lockdowns and other containment measures suppressed labour demand and discouraged the search efforts of some workers who lost their jobs and transitioned into inactivity.
The LinkedIn hiring rate complements the information that can be retrieved from the official statistical data, providing a timely, high-frequency indicator on gross hires in the euro area during the pandemic. Hires in the euro area can only be observed imperfectly in the official statistical data, by analysing transitions between employment and non-employment. Two main caveats arise when using official data to assess hiring behaviour in the euro area. First, official data are not very timely, generally only becoming available around two quarters later. Second, these data only allow quantification of net flows into (or out of) employment and do not provide any information on job-to-job transitions. The LinkedIn hiring rate provides a more timely, high-frequency signal that can provide information on the number of hires in the euro area. It comprises high-frequency data on gross hires, identifying both movements from non-employment into employment and job-to-job transitions.
The standardised LinkedIn hiring rate is first calculated for each of the four largest euro area countries (France, Germany, Italy and Spain - the EA-4) by filtering out seasonal patterns and country-specific artificial trends related to the market performance of LinkedIn. The EA-4 country information is aggregated as a weighted average of the country-specific standardised hiring rates using employment as weights. The EA-4 hiring rate declined significantly at the start of the pandemic before recovering during the second half of 2020 (Chart A, panel (a)). After standing at around 6% above average during the first two months of 2020, it fell suddenly to 63% below average in April 2020 following the onset of the COVID-19 crisis and slowly rebounded to surpass its average level in November 2020. It then returned to below average in January 2021, when more stringent lockdowns were imposed, and recovered again thereafter. Interestingly, the decline in the number of hires paralleled the increase in job retention schemes during the pandemic. In April 2021 the standardised hiring rate stood at 14% above average in the EA-4 aggregate.
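The standardisation and weighting steps described above can be sketched as follows. The hiring rates and employment weights below are invented for illustration; the actual LinkedIn series are proprietary and adjusted more carefully for seasonality and platform-specific trends.

```python
import numpy as np

# Hypothetical monthly hiring rates for the EA-4 (hires per member,
# five months), with a sharp drop in the third month.
hiring = {
    "DE": np.array([2.1, 2.2, 0.9, 1.5, 2.3]),
    "FR": np.array([1.8, 1.9, 0.7, 1.2, 2.0]),
    "IT": np.array([1.5, 1.6, 0.5, 1.0, 1.7]),
    "ES": np.array([1.7, 1.8, 0.6, 1.1, 1.9]),
}
employment = {"DE": 45.0, "FR": 28.0, "IT": 23.0, "ES": 20.0}  # millions

# Standardise each country as the percentage deviation from its own
# average, removing country-specific level differences (a simple
# stand-in for the seasonal and trend filtering described in the box).
standardised = {c: 100.0 * (x - x.mean()) / x.mean() for c, x in hiring.items()}

# Aggregate with employment weights.
w_total = sum(employment.values())
ea4 = sum(employment[c] / w_total * standardised[c] for c in hiring)
```

The resulting series is expressed, as in the box, in percentage deviations from its average level.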
Monitoring the EA-4 labour market using high-frequency data
The high-frequency information provided by the hiring rate can also be used to assess fluctuations in the unemployment rate during the pandemic. Following the box entitled 'High-frequency data developments in the euro area labour market' in Issue 5/2020 of the ECB's Economic Bulletin, we conduct a forecasting exercise linking the high-frequency information of the LinkedIn hiring rate to the job finding rate and using the implied path of the aggregate job finding rate as a proxy for the point-in-time, steady-state unemployment rate. This is then used to forecast the fluctuations in the unemployment rate during the pandemic. We thus compare the observed fluctuations in the unemployment rate from March 2020 onwards with those implied by the high-frequency information within the standardised hiring rate for the EA-4 aggregate.
The forecast for the unemployment rate using the high-frequency hiring rate provides an early signal of the increase in the unemployment rate for the EA-4 aggregate. Chart A (panel (b)) compares the actual unemployment rate with the ex ante conditional forecast of the unemployment rate using the high-frequency hiring rate and based on the unemployment rate in December 2019. The early signal peak in the unemployment rate forecast in April 2020 at 8.8% is comparable in magnitude with the later August 2020 peak in the actual unemployment rate at 9.1%. More recently, in March 2021 the actual unemployment rate of the EA-4 aggregate was 8.5%, within the plausible range of between 7.8% and 8.7% forecast using the high-frequency hiring rate. The early peak for the forecast unemployment rate was driven by the contraction in the high-frequency hiring rate, which reflected the hiring freezes that followed the widespread use of job retention schemes and allowed separations to remain broadly constant over the initial period of the pandemic. By contrast, most of the recent variation in the unemployment rate (including its stabilisation) has stemmed from an increase in the separation rate.
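The mapping from flow rates to a point-in-time, steady-state unemployment rate used in this kind of exercise follows the standard flow identity u* = s / (s + f), where s is the separation rate and f the job finding rate. A minimal sketch with illustrative (made-up) numbers:

```python
import numpy as np

def steady_state_unemployment(s, f):
    # Flow steady state: inflows s * (1 - u) equal outflows f * u,
    # which solves to u* = s / (s + f).
    return s / (s + f)

# Illustrative monthly path: the job finding rate collapses in a
# downturn and then recovers, while separations stay broadly constant
# (consistent with hiring freezes under job retention schemes).
f_path = np.array([0.30, 0.12, 0.15, 0.20, 0.25])
s = 0.02
u_star = steady_state_unemployment(s, f_path)  # rises as f falls
```

With these numbers the implied steady-state rate roughly doubles when the finding rate collapses, mirroring the mechanism described in the box.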
The experience gained with structured and unstructured data prior to the pandemic made it easier to deploy models quickly to facilitate the real-time assessment of the economic situation during the pandemic. In particular, these data have been used to assess the degree of slack in the labour market and to measure the decline in economic activity, seen from both the supply and the demand side. During this period of sudden economic disruption, high-frequency alternative data such as electricity consumption, card payments, job postings, air quality and mobility statistics have been crucial for gaining a timely picture of the economic impact of the pandemic and the associated containment measures, weeks before hard and survey data were released. Payment data have been key to understanding the developments in private consumption, one of the demand components most severely affected by the crisis. Consumption of key inputs such as electricity, gas and fuel was used as a proxy for production in some sectors. A timely understanding of developments in the services sector, with a special focus on small businesses in certain service activities such as tourism which have borne the brunt of the crisis, was also very important. High-frequency information available for these sectors related mostly to sales (e.g. sales in tax returns, card payments), online bookings and Google searches. Other indicators such as freight movements, numbers of flights and air quality were informative as rough proxies for economic activity.
One effective way of summarising information from a set of high-frequency indicators is to use economic activity trackers. Box 2 provides an example of a weekly economic activity tracker for the euro area devised by the ECB. Similarly, the European Commission's Joint Research Centre and Directorate-General for Economic and Financial Affairs have been tracking the COVID-19 crisis by combining traditional macroeconomic indicators with a high number of non-conventional, real-time and extremely heterogeneous indicators for the four largest economies in the euro area. They have developed a toolbox with a suite of diverse models, including linear and non-linear models and several ML methods, to exploit the large number of indicators in the dataset for nowcasting GDP. The GDP forecasts are produced by first estimating the whole set (thousands) of models and then applying automatic model selection to average out the forecasts and produce the final forecast.
A weekly economic activity tracker for the euro area
Prepared by Gabriel Pérez-Quirós and Lorena Saiz
Since the onset of the pandemic, several central banks and international institutions have developed experimental daily or weekly economic activity trackers by combining several high-frequency indicators.
Although these indicators are appealing, their development presents three key technical issues. First, the short time span available for high-frequency data makes them less reliable for establishing econometric relations which prove stable over time, compared to long time series of monthly economic indicators. Second, high-frequency indicators are extremely noisy, exhibit complex seasonal patterns and, in some cases, may be subject to frequent data revisions. In the special circumstances associated with the COVID-19 crisis, these indicators were very informative (i.e. the signal-to-noise ratio was high), but it remains an open question whether, in normal times, they will merely add noise to the already reliable signal obtained from the standard monthly indicators. Third, the procedure to select indicators has not been standardised. Up to now, most work has used high-frequency indicators that are readily available for each economy. The lack of harmonised selection procedures reduces the scope to 'learn from the cross-section' and accentuates the representativeness problem mentioned above.
The weekly economic activity tracker for the euro area proposed in this box addresses these issues by combining reliable monthly indicators that have a long history of good predictive performance with timely high-frequency (non-standard) indicators. The indicators have been selected according to several criteria: (i) availability of a long enough history (at least three years), (ii) not too noisy, and (iii) the weight of the indicator in the aggregate that combines all of them (a principal component in the case of the indicator discussed here) is statistically significant and economically meaningful.
The design of the tracker is based on principal component analysis (PCA) with unbalanced data, as described by Stock and Watson. First, a tracker using only weekly series is computed by PCA to fill the missing observations at the beginning and, if necessary, the end of the sample. The weekly series are transformed into month-on-month growth rates. If necessary, seasonal adjustment methods are used to eliminate any seasonal effects. Second, the monthly variables are transformed into weekly frequency by imputing the same monthly level for all weeks of the month. Then, the month-on-month growth rates are computed for every week. With all this information, the PCA is run again including all the indicators which were originally available at weekly and monthly frequency. The first principal component is the tracker, which represents the evolution of monthly activity at a weekly frequency (Chart A, panel (a)). Visualising the tracker in levels and at a monthly frequency gives an idea of the magnitude of the output loss associated with the pandemic compared with pre-pandemic levels. Most importantly, the evolution of the tracker in levels over 2020 mirrors the evolution of GDP very well (Chart A, panel (b)). Overall, the relatively good performance of the tracker, which strikes a good balance between timely and reliable indicators, makes it a useful tool for tracking economic activity in real time.
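The iterative PCA imputation for an unbalanced panel can be sketched as follows, using synthetic weekly data with a single common factor and series that start late. This is a simplified stand-in for the Stock-Watson procedure; the actual tracker additionally handles growth-rate transformations and seasonal adjustment.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic weekly panel: one common factor drives six indicators;
# three series are unavailable for the first 30 weeks.
T, N = 104, 6
factor = np.cumsum(rng.normal(0.0, 1.0, T))
loadings = rng.uniform(0.5, 1.5, N)
X = np.outer(factor, loadings) + rng.normal(0.0, 0.5, (T, N))
X[:30, 3:] = np.nan

# Standardise on observed values, fill the gaps with an initial guess,
# then iterate: extract the first principal component, re-impute the
# missing cells from its rank-one fit, and repeat until convergence.
mask = np.isnan(X)
mu, sd = np.nanmean(X, axis=0), np.nanstd(X, axis=0)
Z = (X - mu) / sd
Z[mask] = 0.0  # initial guess: the (standardised) mean

for _ in range(100):
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    fitted = S[0] * np.outer(U[:, 0], Vt[0])  # rank-one reconstruction
    delta = np.max(np.abs(fitted[mask] - Z[mask]))
    Z[mask] = fitted[mask]
    if delta < 1e-8:
        break

tracker = U[:, 0] * S[0]  # the first principal component is the tracker
```

In this simulation the extracted component closely tracks the true common factor despite the missing block at the start of the sample.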
Euro area economic activity tracker
3 What makes machine learning algorithms useful tools for analysing big data?
While big data can help improve the forecasts of GDP and other macroeconomic aggregates, their full potential can be exploited by employing ML algorithms. Section 2 shows that in many cases, the improvement in forecasting performance relates to specific situations, such as when traditional monthly indicators for the reference quarter are not yet available. This section focuses on the modelling framework, arguing that ML methods help to reap the benefits of using big data. The main goal of ML techniques is to find patterns in data or to predict a target variable. While ML algorithms estimate and validate predictive models on a subset of the data (the training sample), the ultimate aim is to obtain the best forecasting performance on a different subset (the test sample). The distinction between machine learning and traditional methods is not clear-cut since some traditional methods (e.g. linear regression, principal components) are also quite popular in the ML literature. However, the literature on machine learning has developed a host of new and sophisticated models that promise to strongly enrich the toolbox of applied economists. Moreover, it also seems fair to say that, so far, machine learning has been mostly focused on prediction, while more traditional econometric and statistical analysis is also interested in uncovering the causal relationships between economic variables. This is changing fast, as more and more researchers in the ML field address the issue of inference and causality, although this frontier research is not yet widely applied in the policy context. The aim of this section is to discuss how machine learning can usefully complement traditional econometric methods, in particular to leverage the opportunities for analysing the business cycle offered by big data.
It also reviews several contributions to forecasting/nowcasting GDP (see Box 3) and provides examples of how ML algorithms can provide interesting insights for policy, such as pointing towards the sources of economic policy uncertainty (see Box 4).
The sheer size of the newly available databases often constitutes an obstacle to the use of traditional econometrics. Techniques have been adopted to reduce the dimensionality of the data, including traditional methods such as factor models and principal component analysis, but increasingly also newer machine learning techniques. While a description of specific methods is beyond the scope of this article, it is important to note that ML methods have several desirable features for summarising the data, allowing high-dimensional data to be reduced to a manageable number of indicators.
The first key advantage of ML methods is their ability to extract and select the relevant information from large volumes of unstructured data. When dealing with big data, the presence of a large amount of mostly irrelevant information engenders the problem of data selection. This issue is magnified by the presence of large, unstructured datasets. In some simple cases, the forecaster can pick variables manually; this is normally possible when forecasting very specific quantities. The seminal work of Choi and Varian with Google Trends, for instance, focuses on car sales, unemployment claims, travel destination planning and consumer confidence. Where macroeconomic aggregates are involved, choosing relevant variables quickly becomes intractable. ML methods offer very useful tools for selecting the most informative variables and exploiting their information potential. Several techniques derived from the model-averaging literature have also proved popular and successful in improving forecasting accuracy. In these methods, a large number of econometric models are first estimated, their forecasting performance is then evaluated, and the final forecast is obtained by averaging the forecasts of the best models, thus retaining those models and explanatory variables that provide useful information. Similarly, what are known as ensemble methods such as random forests and bagging combine different 'views' of the data given by competing models, adding flexibility and robustness to the predictions.
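The model-averaging idea just described can be sketched in a few lines: estimate many small models, score each on a validation window, and average the forecasts of the best performers. All data below are simulated, with only three of fifty candidate indicators truly informative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated nowcasting problem: the target depends on 3 of 50
# candidate indicators; the remaining 47 are pure noise.
T, K = 100, 50
X = rng.normal(0.0, 1.0, (T, K))
y = 0.6 * X[:, 0] + 0.4 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(0.0, 0.3, T)

train, val, test = slice(0, 60), slice(60, 85), slice(85, 100)

# One small OLS model per indicator: fit on the training window,
# score on the validation window, forecast the test window.
scores, preds = [], []
for k in range(K):
    Xtr = np.column_stack([np.ones(60), X[train, k]])
    Xva = np.column_stack([np.ones(25), X[val, k]])
    Xte = np.column_stack([np.ones(15), X[test, k]])
    beta, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
    scores.append(np.mean((y[val] - Xva @ beta) ** 2))
    preds.append(Xte @ beta)

# Keep the five best-scoring models and average their forecasts.
best = np.argsort(scores)[:5]
combined = np.mean([preds[k] for k in best], axis=0)
rmse_combined = np.sqrt(np.mean((y[test] - combined) ** 2))
```

Ensemble methods such as random forests follow the same logic, but grow and combine many models on resampled data rather than ranking a fixed candidate set.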
The second key advantage of ML methods is their ability to capture quite general forms of non-linearities. This is a general advantage of ML methods, regardless of the volume of data concerned; however, by their very nature, big data may be particularly prone to non-linearities. Data from social networks, for instance, illustrate these inherent non-linearities well: a specific topic can generate cascade or snowball effects within the network which cannot be captured by linear regression models. Other examples include Google Trends and Google search categories, which are compiled using ML algorithms that determine the category to which an internet search belongs. Text data are also obtained by applying highly non-linear ML algorithms to news items, for example. More generally, non-linearities and interactions between variables are common in macroeconomics owing to the presence of financial frictions and uncertainty. Several works have found that ML methods can be useful for macroeconomic forecasting, since they better capture non-linearities (e.g. Coulombe et al.). These methods can, for instance, capture the non-linear relationship between financial conditions and economic activity, among others, and hence more accurately predict activity and recessions in particular (see Box 3). Also, ML methods can outperform standard methods (e.g. credit scoring models, logistic regression) when predicting consumer and corporate defaults, since they capture non-linear relationships between the incidence of default and the characteristics of the individuals.
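The threshold effect between financial conditions and activity mentioned above can be illustrated with a single regression-tree split - the building block of random forests - against a linear fit, on stylised simulated data (all values are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Stylised data: activity falls sharply only once financial stress
# crosses a threshold of 0.7.
n = 300
stress = rng.uniform(0.0, 1.0, n)
activity = np.where(stress > 0.7, -3.0, 0.5) + rng.normal(0.0, 0.3, n)

# Linear benchmark fitted by ordinary least squares.
A = np.column_stack([np.ones(n), stress])
beta, *_ = np.linalg.lstsq(A, activity, rcond=None)
mse_linear = np.mean((activity - A @ beta) ** 2)

# A single regression-tree split: exhaustive search over thresholds,
# predicting the mean of each side of the split.
best_mse, best_split = np.inf, None
for s in np.linspace(0.05, 0.95, 91):
    left = stress <= s
    if left.sum() < 10 or (~left).sum() < 10:
        continue  # skip splits leaving too few observations on one side
    pred = np.where(left, activity[left].mean(), activity[~left].mean())
    mse = np.mean((activity - pred) ** 2)
    if mse < best_mse:
        best_mse, best_split = mse, s
```

The single split recovers the true threshold and fits far better than the line; a random forest averages many such trees grown on resampled data.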
The COVID-19 pandemic is an important source of non-linearities. During the pandemic, many macroeconomic variables have recorded extreme values that are far from the range of past values. Econometric methods such as linear time series analysis seek to find average patterns in past data. If current data are very different, linearly extrapolating from past patterns may lead to biased results. Central banks, the European Commission and other institutions have adapted their nowcasting frameworks to capture non-standard data and non-linearities.
Finally, ML techniques are the main tool used to capture a wide set of phenomena that would otherwise remain unquantified. The most prominent example in recent years is the dramatic surge of text data analysis. Today, broad corpora of text are analysed and converted into numbers that forecasters can use. For instance, a wide range of timely, yet noisy confidence indicators based on text currently complement the traditional surveys, which are available with considerable lags and where agents do not necessarily 'vote with their behaviour', as well as market-based indicators, where expectations and other factors such as risk aversion compound in the data. A first generation of work built on word counts has been followed by more sophisticated approaches. Second-generation techniques based on unsupervised learning are also used in public institutions, and in particular in central banks, to assess the effect of their communication. Finally, following Baker et al., concepts such as economic policy uncertainty which were previously difficult to quantify are now assessed on the basis of their economic consequences and used in forecasting. See Box 3 and Box 4 for examples.
Nowcasting euro area real GDP growth with newspaper-based sentiment
Prepared by Julian Ashwin, Eleni Kalamara and Lorena Saiz
This box presents economic sentiment indicators for the euro area derived from newspaper articles in the four largest euro area countries in their main national languages.
In the literature, two approaches are typically followed for building sentiment metrics from textual data. The most popular is to use simple word counts based on predetermined sets of words, known as dictionaries or lexicons. However, most of the dictionaries have been developed for the English language. For the euro area, the multilingual environment makes it necessary to either develop new dictionaries for other languages or translate texts into English. Alternatively, more computationally demanding model-based methods such as semantic clustering or topic modelling can extract topics that can serve as proxies for sentiment and its drivers. In this box, the sentiment metrics are based on counts of words in the news articles translated into English, relying on several well-known English language dictionaries. For the sake of space, only the sentiment metrics based on the financial stability-based dictionary and the general-purpose dictionary VADER are reported.
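A dictionary-based word count of the kind used here can be sketched in a few lines. The word lists below are tiny illustrative stand-ins, not the actual financial stability or VADER dictionaries:

```python
import re

# Illustrative word lists; real dictionaries contain thousands of terms.
POSITIVE = {"growth", "recovery", "improve", "strong", "gains"}
NEGATIVE = {"recession", "crisis", "decline", "weak", "losses"}

def sentiment(article: str) -> float:
    # Score = (positive count - negative count) / total tokens.
    tokens = re.findall(r"[a-z]+", article.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

# A daily index is then the average score over that day's articles.
articles = [
    "Strong recovery in industrial output signals growth.",
    "Crisis deepens as orders decline and exports remain weak.",
]
daily_index = sum(sentiment(a) for a in articles) / len(articles)
```

Normalising by article length, as above, prevents longer articles from mechanically dominating the index.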
Regardless of the dictionary used, and despite some noisiness, the newspaper-based sentiment metrics are highly correlated with the PMI composite index in the period from 2000 to 2019 (Chart A, panel (a)). This confirms that these measures are actually capturing sentiment. However, the choice of dictionary matters when it comes to detecting turning points. The first sentiment metric captures the Great Recession very well, unsurprisingly given the financial nature of this crisis. But this metric fails to encapsulate the COVID-19 crisis (Chart A, panel (b)), although its evolution is consistent with the behaviour of the financial markets and the financing conditions, which remained favourable in the context of a very strong policy response. By contrast, the general-purpose dictionary is more consistent and robust across time. Therefore, it appears that the nature of economic shocks may play a significant role in identifying the most appropriate text dictionary to be used.
PMI and newspaper-based sentiment indexes for the euro area
Various studies have found that text analysis can significantly improve forecasts of key macroeconomic variables. Some forecast accuracy gains (not shown) are found for real-time GDP nowcasts derived using the PMI composite index and the text-based sentiment indicators as key predictors. They are typically concentrated in the nowcasts produced in the first half of the quarter (i.e. first six weeks), when most other indicators used to nowcast GDP are not yet available. This result is in line with other works in the literature. However, an important point is that the type of model matters if the benefits of the timeliness of text-based information are to be fully reaped. Standard linear methods (e.g. ordinary least squares linear regression) work well in calm times when there are no big shifts in the economic outlook. When extreme economic shocks occur, however, ML models can capture non-linearities and filter out the noise (Chart B). Ridge regressions captured the financial crisis better, as shown by the fact that they have the lowest Root Mean Squared Forecast Error (RMSFE), particularly when including the sentiment metric based on the financial stability dictionary. However, the best-performing models during the pandemic have been the neural networks, which were the worst-performing models during the financial crisis. This could be explained by the fact that before the financial crisis, there were no other similar crises in the training sample from which the model could learn. Indeed, one of the criticisms of the more complex ML models is that they need large amounts of data to learn (i.e. they are 'data hungry').
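The role of shrinkage can be seen in a small simulation with many predictors relative to the sample size, the setting in which ridge regression outperforms ordinary least squares. All data below are synthetic, and the penalty is fixed rather than cross-validated for brevity:

```python
import numpy as np

rng = np.random.default_rng(4)

# Many noisy predictors, few observations: OLS overfits badly here.
T, K = 60, 38
X = rng.normal(0.0, 1.0, (T, K))
beta_true = np.zeros(K)
beta_true[:5] = 0.5          # only five predictors actually matter
y = X @ beta_true + rng.normal(0.0, 1.0, T)

train, test = slice(0, 40), slice(40, 60)
Xtr, ytr, Xte, yte = X[train], y[train], X[test], y[test]

def ridge(X, y, lam):
    # Closed-form ridge estimator: (X'X + lam * I)^(-1) X'y.
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

b_ols = ridge(Xtr, ytr, 0.0)     # lam = 0 reduces to OLS
b_ridge = ridge(Xtr, ytr, 10.0)  # shrinkage towards zero

rmsfe_ols = np.sqrt(np.mean((yte - Xte @ b_ols) ** 2))
rmsfe_ridge = np.sqrt(np.mean((yte - Xte @ b_ridge) ** 2))
```

Shrinking the coefficients towards zero trades a little bias for a large reduction in variance, which is exactly what pays off out of sample when predictors are plentiful and noisy.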
Sources of economic policy uncertainty in the euro area and their impact on demand components
Prepared by Andrés Azqueta-Gavaldón, Dominik Hirschbühl, Luca Onorante and Lorena Saiz
This box describes how big data and machine learning (ML) analysis can be applied to the measurement of uncertainty using textual data. Similarly to 'economic sentiment', uncertainty is not directly observable and can only be measured using proxies. Recent developments in the literature have shown that textual data can provide good proxies for this latent variable. For instance, the seminal work by Baker, Bloom and Davis proposed building an economic policy uncertainty (EPU) index using a pre-specified set of keywords in newspaper articles.
Economic policy uncertainty stems from different sources which affect consumers' and firms' decisions differently. For instance, increases in uncertainty regarding future tariffs can have an impact on a firm's determination to build a new production plant or to start exporting to a new market. This is because the role of future conditions is particularly relevant for costly, irreversible decisions. By contrast, uncertainty about the future monetary policy stance can be important for both firms' and consumers' spending decisions, since it will influence their expectations about future economic developments and financing conditions.
A simple structural vector autoregression (SVAR) analysis confirms that increases in (ML-based) EPU have a significant negative impact on private consumption and business investment, proxied by investment in machinery and equipment, in the euro area. The impact on investment is greater than on consumption, suggesting that uncertainty may have more of an impact on the supply side. As regards sources of economic policy uncertainty, the focus is only on energy, trade and monetary policy uncertainty for the sake of space. As expected, monetary policy uncertainty shocks have a clear negative impact on both investment and consumption. By contrast, the impact of increases in trade policy uncertainty is insignificant in both cases. Moreover, increases in energy policy uncertainty depress consumption to a greater extent than other sources, while their effect on investment, albeit weaker, is more persistent over time. While these are aggregate results, EPU is likely to play a more relevant role for firm-level capital investment than at aggregate level.
Impulse responses of consumption (panel (a)) and investment (panel (b)) to economic policy uncertainty shocks
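The mechanics of an impulse response of this kind can be shown with a stylised example. In a reduced-form bivariate VAR(1), y_t = A y_{t-1} + e_t, the response h periods after a unit shock to the first variable is simply A^h applied to the shock vector. The coefficient matrix below is invented (it is not the article's estimate): uncertainty is assumed to be persistent and to feed negatively into investment growth.

```python
# Stylised VAR(1) impulse response (synthetic coefficients, for illustration).
import numpy as np

# Hypothetical dynamics for (uncertainty, investment growth):
# uncertainty is persistent (0.8) and depresses investment growth (-0.3).
A = np.array([[0.8, 0.0],
              [-0.3, 0.5]])

shock = np.array([1.0, 0.0])  # one-unit uncertainty shock at horizon 0
horizons = 8

response = shock.copy()
investment_irf = []
for h in range(horizons + 1):
    investment_irf.append(float(response[1]))
    response = A @ response  # propagate the shock one period forward

# Investment growth falls after the shock and the effect gradually dies out.
print([round(r, 3) for r in investment_irf])
```

A structural analysis as in the box would additionally require identifying the shocks (e.g. via a Cholesky ordering of the reduced-form innovations); the propagation through powers of the coefficient matrix shown here is the same.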
4 Conclusions, challenges and opportunities
This article has described how big data and ML methods can complement standard analysis of the business cycle. A case in point is the coronavirus pandemic, which represents an extraordinary shock. This crisis has accelerated the adoption and refinement of ML techniques and big data at an unprecedented speed. In particular, it has shown that alternative sources of data can provide more timely signals on the state of the economy and help to track economic activity. It has also highlighted non-linearities in the economy, requiring existing models to be adapted or new approaches to be developed. In this respect, ML methods can deal with non-linearities more easily than traditional methods. Besides new opportunities, these new data sources and methods also pose some challenges.
Big data allow a wider range of timely indicators to be used for forecasting (e.g. text-based or internet-based indicators), although in some cases this can entail replicability and accountability issues. Text-based sentiment indicators are particularly useful, for instance, given that they can be produced automatically, at higher frequency and at lower cost than survey-based indicators. While the construction of conventional economic data, such as industrial production, follows harmonised procedures to ensure high quality, continuity and comparability over time and across countries, alternative data are neither collected primarily for economic analysis, nor sourced and validated by independent statistical offices. Their use in decision-making processes therefore exposes central banks to various risks, given that the replicability of results and accountability could be impaired. Since alternative data are collected for other purposes (e.g. credit card transactions) or arise as the by-product of another service (e.g. news articles from the digitisation of newspapers), they are often very noisy and require careful treatment. Moreover, significant data accessibility issues and limitations on data sharing could impair the replicability of results in some cases. All these risks require careful consideration when investing scarce resources in software development, legal work and customised IT infrastructure.
Although useful as complements, at the moment these tools cannot be considered substitutes for standard data and methods, owing to issues of interpretability and statistical inference. ML methods can help overcome the shortcomings of big data and exploit their full potential. When combined with big data, ML methods are capable of outperforming traditional statistical methods and providing an accurate picture of economic developments. Despite this good forecasting performance, the complexity of the methods often makes it difficult to interpret revisions to the forecasts and, most importantly, to communicate them. However, rapid advances are being made in enhancing the interpretability of ML techniques (most recently based on Shapley values). In addition, ML techniques are not originally designed to identify causal relationships, which are of critical importance to policymakers. Enhancing the ability of ML methods to capture causality is currently the biggest challenge; meeting it could make ML techniques not only promising complements but also viable alternatives to established methods.
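The Shapley-value idea mentioned above can be illustrated exactly on a toy model. A feature's Shapley value is its average marginal contribution to the prediction across all coalitions of the other features; for a linear model with features fixed at baseline values when absent, it reduces to the coefficient times the feature's deviation from its baseline. The model, weights and inputs below are invented for illustration (this is the principle behind tools such as SHAP, not any production explainer).

```python
# Exact Shapley values for a toy two-feature linear nowcasting model.
from itertools import combinations
from math import factorial

weights = {"pmi": 0.05, "sentiment": 0.4}     # hypothetical coefficients
baseline = {"pmi": 50.0, "sentiment": 0.0}    # "average" input values
x = {"pmi": 54.0, "sentiment": -1.5}          # the observation to explain

def model(inputs):
    return sum(weights[f] * inputs[f] for f in weights)

def value(coalition):
    # Features outside the coalition are fixed at their baseline values.
    inputs = {f: (x[f] if f in coalition else baseline[f]) for f in weights}
    return model(inputs)

features = list(weights)
n = len(features)
shapley = {}
for f in features:
    others = [g for g in features if g != f]
    phi = 0.0
    for k in range(n):
        for S in combinations(others, k):
            # Shapley weight for a coalition of size k.
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            phi += w * (value(set(S) | {f}) - value(set(S)))
    shapley[f] = phi

print({f: round(v, 3) for f, v in shapley.items()})
# → {'pmi': 0.2, 'sentiment': -0.6}
```

The attributions sum to the difference between the prediction for this observation and the baseline prediction, which is what makes them attractive for explaining why a nowcast was revised between two data releases.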