---
title: "Correlations"
author: "Aaron R. Caldwell"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: TRUE
editor_options:
  chunk_output_type: console
bibliography: references.bib
vignette: >
  %\VignetteIndexEntry{Correlations}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

The TOSTER package provides several functions for calculating and analyzing correlations. These functions extend beyond traditional correlation tests by offering equivalence testing capabilities and robust correlation methods. The included functions are based on research by @goertzen2010 (`z_cor_test` & `compare_cor`), and @wilcox2011introduction (`boot_cor_test`)^[The bootstrapped functions were adapted from code posted by Rand Wilcox on his website, with modifications inspired by Guillaume Rousselet's `bootcorci` R package, available on GitHub: https://github.com/GRousselet].

# Simple Correlation Test

Basic tests of association can be performed with the `z_cor_test` function. This function is styled after R's built-in `cor.test` function but uses Fisher's z transformation as the basis for all significance tests (p-values). Despite this difference in methodology, the confidence intervals are typically very similar to those produced by `cor.test`.

```{r}
library(TOSTER)

# Base R correlation test
cor.test(mtcars$mpg,
         mtcars$qsec)

# TOSTER's z-transformed correlation test
z_cor_test(mtcars$mpg,
           mtcars$qsec)
```

Like `cor.test`, the `z_cor_test` function supports Spearman and Kendall correlation coefficients:

```{r}
# Spearman correlation
z_cor_test(mtcars$mpg,
           mtcars$qsec,
           method = "spear") # Short form accepted; "spearman" also works

# Kendall correlation
z_cor_test(mtcars$mpg,
           mtcars$qsec,
           method = "kendall")
```

# Advantages of z_cor_test

The main advantage of `z_cor_test` over the standard `cor.test` is its ability to perform equivalence testing (TOST) or any hypothesis test where the null hypothesis isn't zero.
This makes it particularly useful for research questions focused on demonstrating practical equivalence or testing against specific correlation thresholds.

```{r}
# Equivalence test with null boundary of 0.4
z_cor_test(mtcars$mpg,
           mtcars$qsec,
           alternative = "e", # e for equivalence
           null = .4)
```

In this example, we're testing whether the correlation is equivalent to zero within the boundaries of ±0.4.

# Using Summary Statistics

A key advantage of TOSTER is the ability to perform correlation tests using only summary statistics, which is particularly useful when reviewing published literature or working with limited data access. The `corsum_test` function enables this functionality:

```{r}
# Testing a correlation of 0.121 from a sample of 105 paired observations
corsum_test(r = .121,
            n = 105,
            alternative = "e",
            null = .4)
```

This example tests whether a correlation of 0.121 from a sample of 105 paired observations is equivalent to zero within the boundaries of ±0.4.

# Bootstrapped Correlation Test

For more robust analyses when raw data is available, TOSTER provides the `boot_cor_test` function. This bootstrapping approach generally produces more reliable results than Fisher's z-based tests, especially when outliers are present or distribution assumptions are violated.

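
To make the idea of a bootstrapped correlation concrete, here is a minimal base R sketch of a percentile bootstrap interval for Pearson's r. It is an illustration only: the helper `boot_r_ci` and its arguments are hypothetical, and `boot_cor_test` may use a different interval type and different resampling details.

```{r}
# Hypothetical helper (not part of TOSTER): percentile bootstrap
# confidence interval for Pearson's r
boot_r_ci <- function(x, y, R = 1999, conf = 0.90) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  # Resample row indices so that each (x, y) pair stays together
  r_star <- replicate(R, {
    idx <- sample.int(n, replace = TRUE)
    cor(x[idx], y[idx])
  })
  probs <- c((1 - conf) / 2, 1 - (1 - conf) / 2)
  ci <- unname(quantile(r_star, probs = probs, na.rm = TRUE))
  list(estimate = cor(x, y), conf.int = ci)
}

set.seed(993) # Setting seed for reproducibility
res <- boot_r_ci(mtcars$mpg, mtcars$qsec)
res

# Equivalence to zero within +/-0.4 is supported only if the
# 90% interval lies entirely inside (-0.4, 0.4)
res$conf.int[1] > -0.4 && res$conf.int[2] < 0.4
```

With `conf = 0.90`, checking whether the interval falls inside the bounds mirrors a two one-sided tests procedure at the 5% level.
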
```{r}
set.seed(993) # Setting seed for reproducibility

boot_cor_test(mtcars$mpg,
              mtcars$qsec,
              alternative = "e",
              null = .4)

# Bootstrapped Spearman correlation
boot_cor_test(mtcars$mpg,
              mtcars$qsec,
              method = "spear",
              alternative = "e",
              null = .4)

# Bootstrapped Kendall correlation
boot_cor_test(mtcars$mpg,
              mtcars$qsec,
              method = "ken", # Short form accepted
              alternative = "e",
              null = .4)
```

## Robust Correlation Methods

The `boot_cor_test` function also provides access to robust correlation methods that are less sensitive to outliers and violations of normality:

```{r}
# Winsorized correlation with 10% trimming
boot_cor_test(mtcars$mpg,
              mtcars$qsec,
              method = "win",
              alternative = "e",
              null = .4,
              tr = .1) # Set trim amount (default is 0.2)

# Percentage bend correlation
boot_cor_test(mtcars$mpg,
              mtcars$qsec,
              method = "bend",
              alternative = "e",
              null = .4,
              beta = .15) # Beta parameter controlling resistance to outliers
```

The Winsorized correlation reduces the impact of outliers by replacing extreme values with less extreme values. The percentage bend correlation is another robust method that downweights the influence of outliers in the calculation.

# Comparing Correlations

TOSTER provides tools for comparing correlations between independent groups or studies. This is useful for testing differences in relationships across populations or for evaluating replication studies.

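
As background for such comparisons, the textbook test for two independent Pearson correlations on Fisher's z scale can be sketched in base R. The helper `fisher_z_diff` below is hypothetical and only illustrates the classical calculation with made-up inputs; the TOSTER functions may differ in their details and offer more options.

```{r}
# Hypothetical helper (not part of TOSTER): textbook Fisher z test for
# the difference between two independent Pearson correlations
fisher_z_diff <- function(r1, n1, r2, n2) {
  z1 <- atanh(r1)                          # Fisher's z transformation
  z2 <- atanh(r2)
  se <- sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # SE of the difference in z
  z  <- (z1 - z2) / se
  p  <- 2 * pnorm(-abs(z))                 # Two-sided p-value
  c(z_diff = z1 - z2, se = se, statistic = z, p.value = p)
}

# Illustrative inputs: r = 0.8 from n = 40 versus r = 0.2 from n = 100
fisher_z_diff(r1 = 0.8, n1 = 40, r2 = 0.2, n2 = 100)
```

The sketch tests a difference of zero; an equivalence test would instead compare the difference against a pair of non-zero bounds.
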

## Summary Statistics Approach

When only summary statistics are available, the `compare_cor` function can be used:

```{r}
# Comparing correlation r1=0.8 from n=40 with r2=0.2 from n=100
compare_cor(r1 = .8,
            df1 = 38, # df = n-2
            r2 = .2,
            df2 = 98) # df = n-2
```

The `compare_cor` function supports different methods for comparing correlations:

```{r}
# Testing equivalence using Fisher's method
compare_cor(r1 = .8,
            df1 = 38,
            r2 = .2,
            df2 = 98,
            null = .2,
            method = "f", # Fisher (can also use "fisher")
            alternative = "e") # Equivalence test
```

Available methods include:

* **Fisher's z transformation** (`method = "fisher"` or `"f"`): Tests the difference between correlations on the z-transformed scale. This is generally recommended for most applications.
* **Kraatz's method** (`method = "kraatz"` or `"k"`): Directly measures the difference between correlation coefficients.

While both methods are appropriate for general significance testing, they may have limited statistical power in some scenarios [@counsell2015equ].

## Bootstrapped Comparison

When raw data is available for both correlations, the `boot_compare_cor` function offers a more robust approach through bootstrapping:

```{r}
set.seed(8922) # Setting seed for reproducibility

# Generating example data
x1 <- rnorm(40)
y1 <- rnorm(40)

x2 <- rnorm(100)
y2 <- rnorm(100)

# Bootstrap comparison with winsorized correlation
boot_compare_cor(
  x1 = x1,
  x2 = x2,
  y1 = y1,
  y2 = y2,
  null = .2,
  alternative = "e", # Equivalence test
  method = "win" # Winsorized correlation
)
```

This approach has several advantages:

- It does not rely on the Fisher's z-transformation approximation
- It can incorporate robust correlation methods
- It can provide more accurate confidence intervals, especially when typical assumptions are violated

# Practical Recommendations

When choosing which correlation method to use in TOSTER:

1. **If raw data is available:**
    - For most cases, use `boot_cor_test` with Pearson, Spearman, or Kendall methods
    - When outliers or distribution assumptions are concerns, consider the robust methods (winsorized or percentage bend)

2. **If only summary statistics are available:**
    - Use `corsum_test` for single correlation analysis
    - Use `compare_cor` with the Fisher method for comparing correlations

3. **For equivalence testing:**
    - Carefully select meaningful boundaries (null values) based on your research context
    - Consider what effect size would be practically insignificant in your field

# Advanced Usage

## Custom Bootstrap Methods

The bootstrapped functions in TOSTER allow customization of the bootstrap procedure:

```{r, eval=FALSE}
# Customizing the bootstrap procedure
boot_cor_test(
  x = mtcars$mpg,
  y = mtcars$qsec,
  method = "pearson",
  R = 2000, # Increasing number of bootstrap samples
  alpha = 0.01, # Using 99% confidence interval
  alternative = "t" # Two-sided test
)
```

## Working with Missing Data

By default, the correlation functions in TOSTER use pairwise complete observations:

```{r, eval=FALSE}
# Example with missing data
x_with_na <- c(mtcars$mpg, NA, NA)
y_with_na <- c(mtcars$qsec, 10, NA)

# Default behavior handles NAs with pairwise deletion
z_cor_test(x_with_na, y_with_na)
```

# References