This vignette accompanies the `jfa`

R package and aims to show how it facilitates auditors in their standard audit sampling workflow (hereafter “audit workflow”). In this example of the audit workflow, we will consider the case of BuildIt. BuildIt is a fictional construction company in the United States that is being audited by Laura, an external auditor for a fictional audit firm. Throughout the year, BuildIt has recorded every transaction they have made in their financial statements. Laura’s job as an auditor is to make a judgment about the fairness of these financial statements. In other words, Laura needs to either approve or not approve BuildIt’s financial statements. To not approve the financial statements would mean that, as a whole, the financial statements contain errors that are considered *material*. This means that the errors in the financial statements are large enough that they might influence the decision of someone relying on the financial statements. Since BuildIt is a small company, their financial statements only consist of 3500 transactions that each have a corresponding recorded book value. Before assessing the details in the financial statements, Laura already tested BuildIt’s computer systems that processed these transactions and found that they were quite reliable.

In order to draw a conclusion about the fairness of BuildIt’s recorded transactions, Laura separates her audit workflow into four stages. First, she will plan the size of the subset she needs to inspect from the financial statements to make a well-substantiated inference about them as a whole. Second, she will select the required subset from the financial statements. Third, she will inspect the selected subset and determines the audit value (true value) of the transactions it contains. Fourth, she will use the information from her inspected subset to make an inference about the financial statements as a whole. To start off this workflow, Laura first loads BuildIt’s financial statements in R.

```
library(jfa)
data("BuildIt")
```

In statistical terms, Laura wants to make a statement that, with 95% confidence, the maximum error in the financial statements is lower than what is considered *material*. She therefore determines that *materiality*, the maximum tolerable error in the financial statements, as 5%. Based on last year’s audit at BuildIt, where the maximum error turned out to be 2.5%, she expects at most 2.5% errors in the sample that she will inspect. Laura can therefore re-formulate her statistical statement as that she wants to conclude that, when 2.5% errors are found in her sample, she can conclude with 95% confidence, that the misstatement in the population is lower than the materiality of 5%. Below, Laura defines the materiality, confidence, and expected errors.

```
# Specify the confidence, materiality, and expected errors.
<- 0.95 # 95%
confidence <- 0.05 # 5%
materiality <- 0.025 # 2.5% expected
```

Many audits are performed according to the *audit risk model (ARM)*, which determines that the uncertainty about Laura’s statement as a whole (1 - her confidence) is a factor of three terms: the inherent risk, the control risk, and the detection risk. Inherent risk is the risk posed by an error in BuildIt’s financial statement that could be material, before consideration of any related control systems (e.g., computer systems). Control risk is the risk that a material misstatement is not prevented or detected by BuildIt’s internal control systems. Detection risk is the risk that Laura will fail to find material misstatements that exist in an BuildIt’s financial statements. The *ARM* is practically useful because for a given level of audit risk, the tolerable detection risk bears an inverse relation to the other two risks. The *ARM* is useful for Laura because it enables her to incorporate prior knowledge on BuildIt’s organization to increase the required risk that she will fail to find material misstatements. According to the *ARM*, the audit risk will then be retained.

\[ \text{Audit risk} = \text{Inherent risk} \,\times\, \text{Control risk} \,\times\, \text{Detection risk}\]

Usually the auditor judges inherent risk and control risk on a three-point scale consisting of low, medium, and high. Different audit firms handle different standard percentages for these categories. Laura’s firm defines the probabilities of low, medium, and high respectively as 50%, 60%, and 100%. Because Laura performed testing of BuildIt’s computer systems, she assesses the control risk as medium (60%).

```
# Specify the inherent risk (ir) and control risk (cr).
<- 1 # 100%
ir <- 0.6 # 60% cr
```

Laura can choose to either perform a frequentist analysis, where she uses the increased detection risk as her level of uncertainty, or perform a Bayesian analysis, where she captures the information in the control risk in a prior distribution. For this example, we will show how Laura performs a frequentist analysis. In a frequentist audit, Laura immediately uses the adjusted confidence to calculate the sample size using the `planning()`

function.

```
# Adjust the required confidence for a frequentist analysis.
<- 1 - ((1 - confidence) / (ir * cr))
c.adj # Step 1: Calculate the required sample size.
<- planning(materiality = materiality, expected = expected, conf.level = c.adj) stage1
```

Laura can then inspect the result from her planning procedure by using the `summary()`

function. Her result tells her that, given her prior distribution she needs to audit a sample of 178 transactions so that, when at most 4.45 errors are found, she can conclude with 91.66% confidence that the maximum error in BuildIt’s financial statements is lower the materiality of 5%.

`summary(stage1)`

```
##
## Classical Audit Sample Planning Summary
##
## Options:
## Confidence level: 0.91667
## Materiality: 0.05
## Hypotheses: H0: T >= 0.05 vs. H1: T < 0.05
## Expected: 0.025
## Likelihood: poisson
##
## Results:
## Minimum sample size: 178
## Tolerable errors: 4.45
## Expected most likely error: 0.025
## Expected upper bound: 0.049986
## Expected precision: 0.024986
## Expected p-value: < 2.22e-16
```

Laura is now ready to select the required 178 transactions from the financial statements. She can choose to do this according to one of two statistical methods. In *record sampling* (`units = "items"`

), inclusion probabilities are assigned on the transaction level, treating transactions with a high value and a low value the same, a transaction of $5,000 is equally likely to be selected as a transaction of $1,000. In *monetary unit sampling* (`units = "values"`

), inclusion probabilities are assigned on the level of individual monetary units (e.g., a dollar). When a dollar is selected to be in the sample, the transaction that includes that dollar is selected. This favors higher transactions, as a transaction of $5,000 is five times more likely to be selected than a transaction of $1,000.

Laura chooses to use *monetary unit sampling*, as she wants to include more high-valued transactions. The `selection()`

function allows her to sample from the financial statements. She uses the `stage1`

object as an input for the `selection()`

function.

```
# Step 2: Draw a sample from the financial statements.
<- selection(data = BuildIt, size = stage1, units = "values", values = "bookValue", method = 'interval') stage2
```

Laura can inspect the outcomes of her sampling procedure by using the `summary()`

function.

`summary(stage2)`

```
##
## Audit Sample Selection Summary
##
## Options:
## Requested sample size: 178
## Sampling units: monetary units
## Method: fixed interval sampling
## Starting point: 1
##
## Data:
## Population size: 3500
## Population value: 1403221
## Selection interval: 7883.3
##
## Results:
## Selected sampling units: 178
## Proportion of value: 0.0001269
## Selected items: 178
## Proportion of size: 0.050857
```

The selected sample can be isolated by indexing the `sample`

object from the sampling result. Now Laura can execute her audit by annotating the sample with their audit value (for exampling by writing the sample to a *.csv* file using `write.csv()`

. She can then load her annotated sample back into R for further evaluation.

```
# Step 3: Isolate the sample for execution of the audit.
<- stage2$sample
sample
# To write the sample to a .csv file:
# write.csv(x = sample, file = "auditSample.csv", row.names = FALSE)
# To load annotated sample back into R:
# sample <- read.csv(file = "auditSample.csv")
```

For this example, the audit values of the sample are already included in the `auditValue`

column of the data set .

Using her annotated sample, Laura can perform her inference with the `evaluation()`

function.

```
# Step 4: Evaluate the sample
<- evaluation(materiality = materiality, conf.level = c.adj, data = sample,
stage4 values = 'bookValue', values.audit = 'auditValue')
```

Laura can inspect the outcomes of her inference by using the `summary()`

function. Her resulting 91.66% upper bound is 1.396%, which is lower than the materiality of 5%. The output tells Laura the correct conclusion immediately.

`summary(stage4)`

```
##
## Classical Audit Sample Evaluation Summary
##
## Options:
## Confidence level: 0.91667
## Materiality: 0.05
## Materiality: 0.05
## Hypotheses: H0: T >= 0.05 vs. H1: T < 0.05
## Method: poisson
##
## Data:
## Sample size: 178
## Number of errors: 0
## Sum of taints: 0
##
## Results:
## Most likely error: 0
## 91.66667 percent confidence interval: [0, 0.01396]
## Precision: 0.01396
## p-value: 0.00013639
```

Since the 91.66% upper confidence bound on the misstatement in population is lower than the performance materiality Laura has obtained sufficient evidence to conclude that there is less than 5% risk that the population contains material misstatement.