Using machine learning to detect misstatements

Jeremy Bertomeu1 · Edwige Cheynel1 · Eric Floyd2 · Wenqiang Pan3

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Abstract Machine learning offers empirical methods to sift through accounting datasets with a large number of variables and limited a priori knowledge about functional forms. In this study, we show that these methods help detect and interpret patterns present in ongoing accounting misstatements. We use a wide set of variables from accounting, capital markets, governance, and auditing datasets to detect material misstatements. A primary insight of our analysis is that accounting variables, while they do not detect misstatements well on their own, become important with suitable interactions with audit and market variables. We also analyze differences between misstatements and irregularities, compare algorithms, examine one-year- and two-year-ahead predictions and interpret groups at greater risk of misstatements.

Keywords Restatement · Manipulation · Earnings management · Machine learning · Data analytics · Regression tree · Misstatement · Irregularity · Fraud · Prediction · SEC · Enforcement · Gradient boosted regression tree · Data mining · Accounting · Detection · AAERs

JEL Classification C63 · D83 · G38 · K22 · K42 · M41

We gratefully thank B. Cadman, P. Dechow, C. Lennox, S.X. Li, D. Macciocchi, M. Plumlee, X. Peng, and seminar participants at LSE, University of Utah, the USC-UCLA-UCSD-UCI conference, MIT, and the CMU Accounting Mini Conference for valuable feedback. We also thank J. Engelberg for the many suggestions that were central in seeding the project.

Extended author information available on the last page of the article.

Machine learning is a broad discipline that has produced learning algorithms that can drive cars, recognize spoken language, and discover hidden regularities in growing volumes of data. Archival financial research relies on data streams capturing firm characteristics, governance attributes, audit reports, market data, and accounting variables. Machine learning algorithms detect complex patterns in these data, select the best variables to explain an outcome variable, and uncover suitable combinations of variables to make accurate out-of-sample predictions. These algorithms are key to unlocking large and growing financial data sources to make better predictions and smarter decisions. This paper offers preliminary steps toward applying this technology in accounting. We motivate the method by answering a practical question: How do we detect ongoing accounting misstatements? We focus on Item 4.02(a) restatements (“Non-Reliance on Previously Issued Financial Statements or a Related Audit Report or Completed Interim Review”), which are, in principle, restatements that materially affect the interpretation of accounting numbers by investors. These misstatements differ from irregularities because they need not be fraudulent nor carry evidence of manageri