Introduction to Data Analysis for Auditors and Accountants
By Alexander Kogan, PhD, Miklos A. Vasarhelyi, PhD, and Deniz Appelbaum
February 2017
In Brief
The audit world is changing. Technology
has transformed business processes (企業流程) and created a wealth of data
that can be leveraged by accountants and auditors with the requisite mindset. Data analysis can enable auditors to
focus on outliers and exceptions, identifying the riskiest areas of the audit. The authors introduce the process,
review some emerging approaches, and compile useful resources
for auditors new to the topic.
* * *
The advent of inexpensive computational power and storage, as well as the
progressive computerization of organizational systems, is creating a new
environment in which accountants and auditors must practice. This article
introduces basic data analysis concepts to help accounting
professionals navigate this new environment.
Specifically, the focus will not be on auditing and accounting standards and
their current required procedures, but rather on what the profession can
progressively achieve with data analytics. Most analytical procedures, in the right circumstances, may be
applicable to the entire audit process, from risk assessment to test of
details. What follows is a step-by-step overview (Exhibit 1) of best practices
for applying analytics, with an emphasis on audit by exception (ABE).
The Steps in the Process:
• Flowcharting the process.
• Choosing and extracting the data.
• Understanding the population.
• Understanding the fields with descriptive statistics.
• Exploratory data analysis (探索式資料分析).
• Choice of analytic methods and alternative approaches.
• Confirmatory data analysis (驗證性資料分析) and finding outliers.
• Evaluating results and integrating with traditional findings.
Flowcharting the process.
Understanding the elements of a certain
cycle or application is essential for selecting data and understanding risk.
Many tools are available for flowcharting (流程圖製作), such as Tableau Public,
Qlik Sense, and RapidMiner, all of which are free. Flowcharting is also possible in Microsoft Excel or PowerPoint. Exhibit 2 shows a sample flowcharting process taken
from an insurance company.
Choosing and extracting the data.
With the risks in mind, the next step is to choose the data fields (資料欄位) to be extracted and examined. This
type of analysis is not very different from what would be done on a traditional audit. A growing
number of audit apps that can simplify the audit task are being sold or shared (e.g.,
http://www.capterra.com/audit-software/). Unfortunately, providers have not yet standardized around the
AICPA’s Audit Data Standards (ADS) or any other common standard.
Nevertheless, many audit software
providers (e.g., ACL and
CaseWare) have extensive libraries of scripts (腳本) that can be adapted
to various data formats, as well as extraction
software that allows for access to traditional data and enterprise resource
planning (ERP) systems (e.g., SAP and Oracle).
Understanding the population.
It is very important for the sake of
completeness to understand the nature, distribution,
and limitations of the population to be tested. Understanding the scope and
limitations of the data is imperative, as it enables an accountant to choose
the most appropriate and effective analytical technique.
Understanding the fields with descriptive statistics.
The examination of key fields for their
characteristics and statistical parameters (e.g., maximum, minimum, median,
variance) and data availability (e.g., missing values) is probably the most
important initial task, but one that is often underappreciated or even
neglected.
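
As a minimal sketch of this step, the following Python snippet computes the parameters named above (maximum, minimum, median, variance) and counts missing values for a single field. The field name, the sample amounts, and the use of None to mark a missing value are assumptions made for illustration only; they do not come from any particular audit package.

```python
import statistics


def describe_field(values):
    """Summarize one numeric field; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "median": statistics.median(present),
        "variance": statistics.variance(present),  # sample variance
    }


# Hypothetical invoice amounts extracted from a ledger (assumed data).
invoice_amounts = [120.0, 95.5, None, 130.25, 5000.0, 110.0, None, 101.75]

summary = describe_field(invoice_amounts)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Even on this toy field, the gap between the median and the maximum and the two missing values are the kind of signals that should shape the later choice of analytic method.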
Exploratory data analysis.
Modern tools of visualization (e.g., Tableau or Excel) allow
for data exploration that helps auditors carefully choose where to place their
analytic efforts and which assertions to test. Auditors can focus more
extensive testing on the areas highlighted as highest risk.
Choice of analytic methods and alternative approaches.
A great number of analytic methods have been applied to audits in
a research mode (Deniz Appelbaum, Alexander Kogan, and Miklos Vasarhelyi,
Analytics for External Auditing: A Literature Review, Rutgers CARLab, 2016) and
are being progressively adopted by CPA firms. Exhibit 3 provides examples of
several analytic methods. Given this variety
of choices, auditors need to know
the data as intimately as possible, as well as understand the specific
analytic task, in order to reduce the pool of potential analytical methods.
Confirmatory data analysis and finding outliers.
Having identified the riskiest areas of the audit,
an auditor should next use some of the techniques discussed above to evaluate
the data. These techniques are used first to infer analytic models to provide
audit benchmarks or expectations; the actual
values are then compared with the benchmarks. Any significant deviations should be investigated by auditors. For
example, regression analysis
can be used to derive a model
for the revenue account based on archival data. The values calculated by this model should be
compared against the actual revenue amounts, and any significant
differences investigated.
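
The regression example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: revenue is modeled as a straight-line function of a single driver (units shipped), the figures are invented, and "significant" is taken to mean a gap of more than 10% of the expected value. In practice the drivers, the model, and the tolerance would all be matters of auditor judgment.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_deviations(drivers, recorded, slope, intercept, tolerance=0.10):
    """Return (index, recorded, expected) where the relative gap exceeds tolerance."""
    flagged = []
    for i, (x, actual) in enumerate(zip(drivers, recorded)):
        expected = slope * x + intercept  # benchmark from the fitted model
        if abs(actual - expected) > tolerance * abs(expected):
            flagged.append((i, actual, expected))
    return flagged


# Hypothetical archival data: units shipped and revenue for prior periods.
hist_units = [100, 120, 140, 160, 180]
hist_revenue = [1000.0, 1200.0, 1400.0, 1600.0, 1800.0]
slope, intercept = fit_line(hist_units, hist_revenue)

# Hypothetical current-period figures to be tested against the model.
current_units = [150, 170, 190]
current_revenue = [1510.0, 2300.0, 1895.0]

exceptions = flag_deviations(current_units, current_revenue, slope, intercept)
print(exceptions)
```

Only the second current-period record departs from its expectation by more than the assumed tolerance, so it alone would be passed on for investigation.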
Evaluating results and integrating with traditional findings.
Ideally, the outliers should be segregated
from the population for more detailed audit examination, as discussed above. In
such an audit by exception (ABE) approach (Exhibit 4), an auditor’s attention
is more focused on the problematic
transactions rather than a traditional sample pool (which may or may not
identify problematic transactions). Theoretically, ABE provides a more efficient and effective
approach for identifying questionable numbers.
Because this examination process is not sample-based
but exception-based, it
represents a significant departure from the currently prevalent audit practice
of statistical sampling. The main difference
between the ABE and a sample-driven audit is how the subset to be examined is obtained.
Both approaches start with the entire population, but an ABE tests every
transaction and ultimately focuses only on those transactions that present problems (Exhibit 5),
whereas a statistical sample does not
test every transaction, as the sample purportedly represents the diversity and
content of the entire population. If, however, the error-prone transactions as
determined by the ABE tests represent, for example, less than 0.15% of the
population, a sample of 60 transactions may or may not include even one data
point that is significantly deviant, whereas every one of these outlier transactions would be flagged for detailed testing
by an ABE.
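
The arithmetic behind this comparison can be checked with a short calculation. The sketch below assumes that the 60 items are drawn independently from a population large enough that each draw has the same chance of hitting an exception (a binomial approximation), and it uses the 0.15% rate as an upper bound. The function name is illustrative.

```python
def prob_sample_hits_exception(exception_rate, sample_size):
    """Chance that a random sample contains at least one exception,
    assuming independent draws with a constant exception rate."""
    return 1.0 - (1.0 - exception_rate) ** sample_size


# Exception rate of 0.15% and a sample of 60 transactions, as in the text.
p_hit = prob_sample_hits_exception(0.0015, 60)
print(f"Chance the sample contains any exception: {p_hit:.1%}")
```

Under these assumptions the chance is roughly 9%, so more than nine samples in ten would contain no exception at all, while an ABE would flag every one of them.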
Nevertheless, many auditors and accountants
may not initially feel comfortable with conducting an ABE of 100% of the
population, unless this ABE examination were to be accompanied by a traditional
statistical sample. The results of the ABE would then be examined in detail,
just as currently the samples pulled are tested, with the findings compared and
reported.
It is worth remembering that sampling became an accepted audit practice during a time when datasets were expanding
in size but auditors were still examining transactions manually. Detailed examinations of entire datasets
were infeasible at that time.
Now that automated audit software can test datasets rapidly with
minimal manual involvement from the auditor, this obstacle no longer
exists.
Emerging Approaches
Although many of them have not yet been
included in auditors’ daily repertoire or codified in audit standards, there are
many emerging data analytics
approaches that could assist with the audit process. Some of these are shown in
Exhibit 3. The most promising of these approaches are described below.
Predictive analytics (預測分析).
Carefully validated and highly accurate
predictive analytic models for aggregated
accounting numbers can be used by auditors to reduce the time-consuming effort
of disaggregated
testing if the predicted values and
the values of management assertions are sufficiently close.
Deep learning (深度學習).
The large audit firms are investing
significant resources into the use of artificial intelligence to take advantage
of their past experiences and industry knowledge. For example, data from
working papers can be used to create automatic protocols for certain audit judgments, such as bad debt estimation,
lease classification, and identification of abnormal contracts. Deep learning uses this knowledge in
tandem with more advanced methods, such as neural
networks (類神經網路), to represent the deeper structure of events
and conditions in multiple layers of the neural network. Another term
associated with deep learning is “cognitive computing” (認知運算), a blend of automation and human interpretation. Deep learning requires tremendous computing power and storage,
however, since the learning occurs by combining human expertise with enormous
amounts of data. Many businesses outsource deep learning projects to contractors and research
centers, such as IBM
Watson. It is conceivable that in the near future an “Auditor Watson” could exist that would assist
accounting firms with financial and operational audits.
Blockchain/Smart contracts (智能合約).
The recent development of the virtual
currency Bitcoin has been facilitated by a technology known as blockchain, which can keep data public and replicate many transactions
across a network using encryption
methods. This methodology may presage a fundamental change in methods of data storage
and validation. Smart contracts associated with blockchain might be able to automatically execute contract
features without human intervention. For example, the contract between
the auditor and the firm may dictate that if
an outlier is larger than 100% of the median value of the transactions, it must
be stopped and examined by human eyes; blockchain could theoretically flag such outliers and refer them to an
auditor.
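
The rule in this example can be written down as a simple check. The sketch below is ordinary Python, not smart contract code for any blockchain platform, and it rests on one reading of the rule: a transaction is referred to an auditor when its amount exceeds the median by more than 100%, that is, when it is more than twice the median. The multiplier is exposed as a parameter because the contract terms could define the threshold differently; all names and amounts are hypothetical.

```python
import statistics


def flag_for_review(amounts, multiple_of_median=2.0):
    """Return (index, amount) pairs whose amount exceeds the agreed
    multiple of the median transaction value."""
    threshold = multiple_of_median * statistics.median(amounts)
    return [(i, amount) for i, amount in enumerate(amounts) if amount > threshold]


# Hypothetical transaction amounts (assumed data).
transactions = [100.0, 110.0, 95.0, 105.0, 450.0, 98.0, 102.0]

referred = flag_for_review(transactions)
print(referred)
```

Here the median is 102.0, so only the 450.0 transaction crosses the assumed threshold and would be held for human examination.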
Text mining (文件探勘;文本探勘).
The emergence of big data and the mixing
of large corporate datasets with
external, unstructured data allow
for promising machine understanding of text that may one day provide
strong validation for
management-supplied numbers and support new audit products, such as continuous auditing and monitoring from
external data. Notably, three of the largest audit firms have
employed legal discovery tools or developed methods to text mine information from converted PDF
documents to create deep learning inputs.
Tools and Information Sources
More than 700 firms audit public companies,
and many more audit or examine other entities. Smaller firms do not have the extensive financial and human resources that larger ones have, and
thus may not be able to leverage
data analytics technology to the same extent. There are, however, many sources
of free software and
educational materials that are currently available. A selection of these
resources, in addition to commercially available tools, is listed below.
The open source R software has one of the largest libraries of
applications available. Free
software such as R and
Weka is used nationwide in university courses and by some research and
technology firms, but is somewhat frowned upon by
accounting firms because it is not validated. These concerns are not without merit, since open source
software can be clumsier and less user friendly than proprietary software, but its utility should not be ignored.
In addition, while a basic knowledge of
statistics and information technology is becoming essential for all accountants,
more specialized functions can be contracted out to experts, perhaps online.
Proprietary tools such as Audit Command Language (ACL)
and Interactive Data Extraction and Analysis (IDEA), as well as generic statistical software such as
Statistical Analysis System (SAS)
and Statistical Package for the Social Sciences (SPSS), are frequently used by large businesses and large firms. Furthermore, the capabilities and
scope of these packages are constantly evolving, requiring that accountants and auditors have sufficient knowledge of
analytics.
Large firms typically retrain
their professionals through internal
courses about their own approaches to auditing and are progressively trying
to introduce audit analytics
into this process. Four decades ago, each one of the then-Big Eight had its own
IT audit packages, but today the Big Four use vendor-provided
software such as ACL and IDEA.
This convergence will
likely also take place with the emerging statistical and visualization toolsets
being developed.
A major difference in today’s environment is the power of group sourcing
and the diffusion of the Internet. Powerful education mechanisms are emerging,
ranging from free
public resources to online Masters of Accountancy programs in audit analytics,
some of which are financed by major firms (“KPMG, Villanova, Ohio State Launch
First-Of-Its-Kind Data and Analytics Master’s Degree to Prep Data-Age
Auditors,” KPMG, Aug. 4, 2016, http://bit.ly/2jWihzN).
A Growing Phenomenon
The advent of data analytics and big data is not a fad; it is a
real phenomenon driven by new technologies being adopted by many businesses.
Accountants and auditors are currently very far behind
the curve. The profession will inevitably be forced to modernize audit approaches by corporate
processes that are not auditable by traditional methods, accounting packages
that can perform without manual intervention, and pressure from clients for
more value in the audit engagement.
This article provides a general
introduction to modern analytic methods and sources of information and
education for accountants. Further resources can be found at
http://raw.rutgers.edu/CPAjrefs.html.