---
title: "Ada-Plot and Uda-Plot"
author: Uditha Amarananda Wijesuriya
bibliography: MyBib.bib
link-citations: true
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Ada-Plot and Uda-Plot}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Introduction

This vignette illustrates, with examples, two novel statistical plots, the Ada-plot and the Uda-plot, proposed as alternatives to the Ad-plot and Ud-plot. Both are derived from the empirical centralized cumulative average deviation function (eccadf), $C_n(t)$, introduced by the author [@Uditha1;-@Uditha2; -@Uditha3]. Suppose that $X_1,X_2,\ldots,X_n$ is a random sample from a unimodal distribution. For a real number $t$, the eccadf sums the deviations of the observations that lie within a distance $t$ of the sample average and divides that sum by the sample size $n$. The Ada-plot reveals key properties of the distribution, such as symmetry and skewness, and flags potential outliers in the data. The Uda-plot, derived from three theorems by the author, is designed for assessing normality [@Uditha1]. Extreme values can be removed with the $1.5\,IQR$ rule before plotting, which gives a robust version of the Uda-plot. The examples below require the _adaplots_ package, together with _stats_ and _ggplot2_, to be installed in R.

```{r setup}
library(ggplot2)
library(adaplots)
```

### Ada-Plot

The first plot, the Ada-plot, consists of the $n$ ordered pairs $(x_i,A_n(|x_i-\bar{X}|))$ for $i=1,2,\ldots,n$, where $\bar{X}$ is the sample average and

$$A_n(t)=\min_{s} C_n(s)+\max_{s} C_n(s)-C_n(t).$$
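The construction of the Ada-plot coordinates can be sketched in base R. This is an illustration under stated assumptions, not the implementation used by _adaplots_: it reads the verbal description of the eccadf as $C_n(t)=\frac{1}{n}\sum_{i}|x_i-\bar{X}|\,\mathbf{1}\{|x_i-\bar{X}|\le t\}$, and it takes the minimum and maximum of $C_n$ over the observed distances only. The helper name `ccad` is hypothetical; the exact definition is given in the cited papers.

```r
# Illustrative sketch only; the form of C_n(t) below is an assumed reading
# of the verbal description, not code taken from the adaplots package.
set.seed(2025)
x <- rnorm(100, mean = 2, sd = 5)

# Absolute deviations from the sample average
dev <- abs(x - mean(x))

# Assumed C_n(t): sum of absolute deviations not exceeding t, divided by n
ccad <- function(t) sum(dev[dev <= t]) / length(x)

# Evaluate C_n at each observed distance |x_i - mean(x)|
c_vals <- vapply(dev, ccad, numeric(1))

# A_n(t) = min C_n + max C_n - C_n(t), with min/max over observed distances
a_vals <- min(c_vals) + max(c_vals) - c_vals

# Points near the sample average get the largest A_n values
plot(x, a_vals, xlab = "x", ylab = "A_n(|x - mean(x)|)")
```

Under this reading, $A_n$ peaks at the observation closest to the sample average and decreases toward both tails, so asymmetry between the two sides of the plot reflects skewness.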

#### Example 1

```{r, eval=TRUE, fig.width=7.18, fig.height=4.5}
set.seed(2025)
X <- matrix(rnorm(100, mean = 2, sd = 5))
adaplot(X, title = "Ada-plot", xlab = "x", lcol = "black", rcol = "grey60")
```

Figure 1. The points in the Ada-plot are evenly distributed around the sample average, which indicates a symmetric distribution. In addition, the points at the lower and upper ends appear to be potential outliers.

#### Example 2

```{r, eval=TRUE, fig.width=7.18, fig.height=4.5}
set.seed(2025)
X <- matrix(rbeta(100, shape1 = 10, shape2 = 2))
adaplot(X, title = "Ada-plot", xlab = "x", lcol = "black", rcol = "grey60")
```

Figure 2. The points below the average in the Ada-plot are widely spread, so the distribution appears to be left-skewed.

#### Example 3

```{r, eval=TRUE, fig.width=7.18, fig.height=4.5}
set.seed(2025)
X <- matrix(rf(100, df1 = 10, df2 = 5))
adaplot(X, title = "Ada-plot", xlab = "x", lcol = "black", rcol = "grey60")
```

Figure 3. In contrast to Figure 2, the points above the average in the Ada-plot are widely dispersed, so the distribution appears to be right-skewed.

### Uda-Plot

Suppose that the random sample is from a normal distribution with mean $\mu$ and variance $\sigma^2$. The second plot, the Uda-plot, consists of the $n$ ordered pairs $(x_i,W_n(x_i))$ for $i=1,2,\ldots,n$, where $s^2$ denotes the sample variance and $W_n(t)$ is defined as in @Uditha1.

#### Example 4

```{r, eval=TRUE, fig.width=7.18, fig.height=4.5}
set.seed(2030)
X <- matrix(rnorm(30, mean = 2, sd = 5))
udaplot(X, npdf = FALSE, lcol = "black", rcol = "grey60", pdfcol = "red")
```

Figure 4. The points in the Uda-plot are evenly distributed about the sample average and trace a bell-shaped curve, which reflects the symmetry of the distribution.

#### Example 5

```{r, eval=TRUE, fig.width=7.18, fig.height=4.5}
set.seed(2030)
X <- matrix(rnorm(30, mean = 2, sd = 5))
udaplot(X, npdf = TRUE, lcol = "black", rcol = "grey60", pdfcol = "red")
```

Figure 5. The points in the Uda-plot lie close to the estimated normal density curve, indicating that the data follow a normal distribution whose mean $\mu$ and variance $\sigma^2$ are estimated by the sample average $\bar{X}$ and the sample variance $s^2$, respectively. The $d$-value measures how close the Uda-plot is to the estimated normal density curve.
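The estimated normal density curve mentioned in Figure 5 can be reproduced separately with base R. The sketch below is an illustration only: it assumes the curve is the $N(\bar{X}, s^2)$ density evaluated with `dnorm`, and it does not compute $W_n(t)$ or the $d$-value, whose definitions are in @Uditha1. The variable names are chosen for this example.

```r
# Illustrative sketch: normal density with mean and sd estimated from the sample.
# It does not reproduce W_n(t) or the d-value reported by udaplot().
set.seed(2030)
x <- rnorm(30, mean = 2, sd = 5)

x_bar <- mean(x)  # estimate of mu
s <- sd(x)        # square root of the sample variance s^2

# Evaluate the estimated density on a grid spanning the data
grid <- seq(min(x), max(x), length.out = 200)
est_density <- dnorm(grid, mean = x_bar, sd = s)

plot(grid, est_density, type = "l", col = "red",
     xlab = "x", ylab = "Estimated normal density")
rug(x)  # mark the observations along the x-axis
```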

#### Example 6

```{r, eval=TRUE, fig.width=7.18, fig.height=4.5}
set.seed(2030)
X <- matrix(rnorm(2050, mean = 2, sd = 5))
udaplot(X, npdf = TRUE, lcol = "black", rcol = "grey60", pdfcol = "red")
```

Figure 6. As the sample size increases, the Uda-plot becomes nearly indistinguishable from the estimated normal density curve, with a higher degree of proximity.

### Robust version of Uda-Plot

The empirical centralized cumulative average deviation function is built from the absolute deviations $|x_i-\bar{X}|$, and $W_n(t)$ depends on $C_n(t)$. Because the sample average is sensitive to outliers, removing extreme values before constructing the Uda-plot gives a robust version. Setting the _excld_ argument of the _udaplot_ function to _TRUE_ excludes extreme values using the $1.5\,IQR$ rule, where $IQR$ denotes the interquartile range.
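The $1.5\,IQR$ rule itself can be sketched in base R as follows. This is a minimal illustration, not the code inside _udaplot_: it assumes the usual fences at $Q_1-1.5\,IQR$ and $Q_3+1.5\,IQR$ with R's default quantile type, and the package may use a different quantile convention.

```r
# Illustrative sketch of the 1.5 * IQR rule; the quantile convention used by
# udaplot(excld = TRUE) is not confirmed here and may differ.
set.seed(2030)
x <- c(rnorm(50, mean = 2, sd = 5), runif(4, 17, 30))

# First and third quartiles (default quantile type) and the fence width
q <- unname(quantile(x, probs = c(0.25, 0.75)))
fence <- 1.5 * IQR(x)
lower <- q[1] - fence
upper <- q[2] + fence

# Keep only the observations inside the fences
x_trimmed <- x[x >= lower & x <= upper]

length(x) - length(x_trimmed)  # number of observations excluded
```

Because the quartiles are barely affected by a few extreme points, the fences stay anchored to the bulk of the data, which is why trimming with them stabilizes the sample average used by the eccadf.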

#### Example 7

```{r, eval=TRUE, fig.width=7.18, fig.height=4.5}
set.seed(2030)
X <- matrix(c(rnorm(50, mean = 2, sd = 5), runif(4, 17, 30)))
udaplot(X, npdf = TRUE, lcol = "black", rcol = "grey60", pdfcol = "red")
```

Figure 7. The Uda-plot for 50 observations simulated from a normal distribution with mean 2 and standard deviation 5, contaminated with four observations simulated from $Uniform(17,30)$, follows most of the expected normal density. Notably, the four points at the upper end overestimate the normal curve.

#### Example 8

```{r, eval=TRUE, fig.width=7.18, fig.height=4.5}
set.seed(2030)
X <- matrix(c(rnorm(50, mean = 2, sd = 5), runif(4, 17, 30)))
udaplot(X, excld = TRUE, npdf = TRUE, lcol = "black", rcol = "grey60", pdfcol = "red")
```

Figure 8. With the $1.5\,IQR$ rule applied to the data in Figure 7, the Uda-plot fits the estimated normal density closely, which improves robustness. The $d$-value increases from 0.90 to 0.93, supporting normality.

## References