Board of Governors of the Federal Reserve System

07/01/2022 | Press release | Distributed by Public on 07/01/2022 10:07

Integrating Prediction and Attribution to Classify News

July 2022

Nelson P. Rayl and Nitish R. Sinha

Abstract:

Recent modeling developments have created tradeoffs between attribution-based models, which rely on causal relationships, and "pure prediction models" such as neural networks. While forecasters have historically favored one technology or the other based on comfort or loyalty to a particular paradigm, in domains with many observations and predictors, such as textual analysis, the tradeoffs between attribution and prediction have become too large to ignore. We document these tradeoffs in the context of relabeling 27 million Thomson Reuters news articles published between 1996 and 2021 as debt-related or non-debt-related. Articles in our dataset were labeled by journalists at the time of publication, but these labels may be inconsistent because labeling standards and the relation between text and label have changed over time. We propose a method for identifying and correcting inconsistent labeling that combines attribution and pure prediction methods and is applicable to any domain with human-labeled data. Implementing our proposed labeling solution returns a debt-related news dataset with 54% more observations than if the original journalist labels had been used and 31% more observations than if our solution had been implemented using attribution-based methods only.

Keywords: News, Text Analysis, Debt, Labeling, Supervised Learning, DMR

DOI: https://doi.org/10.17016/FEDS.2022.042

PDF: Full Paper