The Sinh-Arcsinh (shash) distribution is highly flexible and can adapt to a wide range of distributional shapes. This makes it particularly useful for modeling psychometric data, which often exhibit non-normal characteristics such as skewness and heavy tails. Unlike discrete distributions (e.g., beta-binomial), shash naturally handles continuous scores, decimal values, and unbounded measures. It accommodates zero and negative scores without transformation, whereas Box-Cox approaches require strictly positive data. Finally, skewness (ε) and tail weight (δ) can be controlled independently, which makes it possible to model complex developmental patterns in which skewness and tail weight change with age.

Mathematical Foundation

The Sinh-Arcsinh (shash; Jones & Pewsey, 2009) distribution is defined by a transformation of a standard normal variable. If \(Y\) follows a standard normal distribution, \(Y \sim N(0,1)\), then the shash-distributed variable \(X\) is generated by:

\[X = \mu + \sigma \cdot \sinh\left(\frac{\text{arcsinh}(Y) - \epsilon}{\delta}\right)\]

This transformation allows the resulting variable \(X\) to have its location, scale, skewness, and tail weight controlled by the four parameters:

μ (mu): location parameter, shifts the distribution horizontally (similar to mean)
σ (sigma): scale parameter (\(\sigma > 0\)), controls the spread of the distribution (similar to standard deviation)
ε (epsilon): skewness parameter (\(\epsilon = 0\) for symmetry, \(\epsilon > 0\) for right skew, \(\epsilon < 0\) for left skew)
δ (delta): tail weight parameter (\(\delta = 1\) produces normal-like tails, \(\delta > 1\) produces heavier tails, \(\delta < 1\) produces lighter tails)
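
The transformation can be tried out in a few lines of base R. The following is a minimal sketch that transcribes the formula exactly as printed above; the helper name `r_shash_sketch` and the parameter values are hypothetical and not part of cNORM.

```r
# Hypothetical helper (not a cNORM function): draw n values by pushing
# standard normal draws through the transformation given above.
r_shash_sketch <- function(n, mu = 0, sigma = 1, epsilon = 0, delta = 1) {
  stopifnot(sigma > 0, delta > 0)
  y <- rnorm(n)                                    # Y ~ N(0, 1)
  mu + sigma * sinh((asinh(y) - epsilon) / delta)  # transformed variable X
}

set.seed(42)
x <- r_shash_sketch(1000, mu = 100, sigma = 15, epsilon = 0.3, delta = 1.2)
summary(x)
```

With epsilon = 0 and delta = 1, the expression reduces to mu + sigma * Y, so the sketch then returns ordinary normal draws.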

The probability density function is:

\[f(x|\mu,\sigma,\epsilon,\delta) = \frac{\delta}{\sigma\sqrt{2\pi}} \cdot \frac{\cosh(\delta \cdot \text{arcsinh}(z) + \epsilon)}{\sqrt{1 + z^2}} \cdot \exp\left(-\frac{1}{2}[\sinh(\delta \cdot \text{arcsinh}(z) + \epsilon)]^2\right)\]

where \(z = \frac{x - \mu}{\sigma}\). The cumulative distribution function (CDF) has no elementary closed form, but it can be expressed through the standard normal CDF, which is evaluated numerically. For a given value \(x\), the CDF is:

\[F(x|\mu,\sigma,\epsilon,\delta) = P(X \leq x) = \Phi[\sinh(\delta \cdot \text{arcsinh}(z) + \epsilon)]\]

where \(\Phi\) is the standard normal CDF and \(z = (x - \mu)/\sigma\). The quantile function (inverse CDF) can be expressed as:

\[Q(p|\mu,\sigma,\epsilon,\delta) = \mu + \sigma \cdot \sinh\left(\frac{\text{arcsinh}(\Phi^{-1}(p)) - \epsilon}{\delta}\right)\]
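
The three formulas can be checked against each other numerically. The sketch below transcribes them into base R as printed in this section; the function names `d_shash`, `p_shash` and `q_shash` are hypothetical helpers, not cNORM functions, and the parameter values are made up.

```r
# Hypothetical helpers, transcribed from the density, CDF and quantile
# formulas above (not taken from the cNORM source code).
d_shash <- function(x, mu = 0, sigma = 1, epsilon = 0, delta = 1) {
  z <- (x - mu) / sigma
  w <- delta * asinh(z) + epsilon
  delta / (sigma * sqrt(2 * pi)) * cosh(w) / sqrt(1 + z^2) * exp(-0.5 * sinh(w)^2)
}
p_shash <- function(x, mu = 0, sigma = 1, epsilon = 0, delta = 1) {
  z <- (x - mu) / sigma
  pnorm(sinh(delta * asinh(z) + epsilon))
}
q_shash <- function(p, mu = 0, sigma = 1, epsilon = 0, delta = 1) {
  mu + sigma * sinh((asinh(qnorm(p)) - epsilon) / delta)
}

# Round trip: the CDF evaluated at the quantiles should return the probabilities
p <- c(0.05, 0.25, 0.5, 0.75, 0.95)
q <- q_shash(p, mu = 100, sigma = 15, epsilon = 0.3, delta = 1.2)
all.equal(p_shash(q, mu = 100, sigma = 15, epsilon = 0.3, delta = 1.2), p)

# The density should integrate to approximately 1
integrate(d_shash, -Inf, Inf, epsilon = 0.3, delta = 1.2)
```

For epsilon = 0 and delta = 1 the density coincides with `dnorm()`, which is a convenient sanity check for any implementation of these formulas.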

Modeling the Sinh-Arcsinh over Age

In cNORM’s shash implementation, the four parameters are modeled as polynomial functions of standardized age, with mu_degree = 3, sigma_degree = 2, epsilon_degree = 2 and delta_degree = 1 as the default settings. It is advisable to keep delta_degree low, for example 1 or 2, to avoid overfitting. The tail weight parameter δ can also be held constant by setting ‘delta_degree’ to NULL. In this case, a fixed delta (default = 1) is used, which can be adjusted to reflect population characteristics, e.g. by increasing it to delta > 1 for heterogeneous samples or decreasing it to delta < 1 for homogeneous samples. For numerical stability during optimization, age is standardized as \(\text{age}_{std} = \frac{\text{age} - \overline{\text{age}}}{\text{SD}(\text{age})}\). The parameters are estimated by maximum likelihood.
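
To make the idea of age-dependent parameters concrete, the following base R sketch standardizes age and evaluates one polynomial per parameter, using the default degrees. All coefficients are made up, the helper `poly_eval` is hypothetical, and the exponential link that keeps sigma and delta positive is an assumption for illustration rather than a description of cNORM internals.

```r
# Sketch with made-up coefficients; helper name and the exponential link for
# sigma and delta are assumptions, not cNORM internals.
age <- c(6.5, 8.0, 9.5, 11.0, 12.5, 14.0)
age_std <- (age - mean(age)) / sd(age)             # standardized age

# Evaluate b0 + b1 * a + b2 * a^2 + ... at standardized age a
poly_eval <- function(a, coef) {
  as.vector(outer(a, seq_along(coef) - 1, `^`) %*% coef)
}

mu      <- poly_eval(age_std, c(150, 25, -4, 0.5))       # degree 3
sigma   <- exp(poly_eval(age_std, c(2.5, 0.1, -0.02)))   # degree 2, kept positive
epsilon <- poly_eval(age_std, c(0.1, -0.05, 0.01))       # degree 2
delta   <- exp(poly_eval(age_std, c(0, 0.05)))           # degree 1, kept positive

data.frame(age, age_std = round(age_std, 2), mu, sigma, epsilon, delta)
```

Each row of the resulting table describes the complete score distribution at one age, which is what allows location, spread, skewness and tail weight to change smoothly across the age range.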

Application and Prerequisites

The shash distribution is particularly well suited for continuous performance measures (reaction times, achievement scores with decimal precision); tests with floor or ceiling effects where skewness varies across age groups; heterogeneous populations with varying degrees of individual differences; change or difference scores that may include negative values; adaptive tests or measures with variable stopping rules; and large-scale assessments where distributional assumptions are complex. It is also well suited for developmental studies, because the shape of the distribution can change systematically with age.

When shash is applied to normative data, some prerequisites should be met to ensure optimal model performance and valid results. The test scores should exhibit systematic (though not necessarily monotonic) development across the predictor variable. The shash model can capture complex non-linear relationships through polynomial fitting of the distribution parameters. The norming sample should be sufficiently large, although the required size is usually much lower than in conventional norming. As a rule of thumb, we recommend a minimum of 100 cases per major age group, with larger samples needed for higher polynomial degrees or complex developmental patterns. Although empirical validation of this rule is ongoing, these sample sizes have proven effective in previous norming studies (Lenhard et al., 2019).

While shash handles continuous scores optimally, it can also model discrete scores effectively, especially when the number of possible values is large (e.g. value range > 20). As always, representativeness of the sample is important. If the sample deviates moderately from the target population, post-stratification weighting can help to correct for this.

Model Selection Considerations

It is advisable to explore the norm sample prior to modeling. Visual inspection of the fitted curves is essential to ensure realistic patterns, and overly complex models may overfit, especially with limited data. Choose the polynomial degrees for the four parameters based on:

Theoretical expectations about developmental trajectories
Sample size (higher degrees require more data to avoid overfitting)
Model comparison criteria (AIC, BIC, cross-validation)
Visual inspection of fitted curves for realistic patterns
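
The trade-off between polynomial degree and model complexity can be made explicit by counting coefficients and applying the standard definitions of AIC and BIC. The sketch below is an illustration in base R: the helper `count_params` is hypothetical, it assumes one intercept plus one coefficient per power for each parameter and no estimated coefficient for a fixed delta, and the log-likelihoods and sample size are made up.

```r
# Hypothetical helper: number of coefficients implied by the polynomial degrees.
# Assumes (degree + 1) coefficients per parameter and none for a fixed delta.
count_params <- function(mu_degree, sigma_degree, epsilon_degree, delta_degree = NULL) {
  k <- (mu_degree + 1) + (sigma_degree + 1) + (epsilon_degree + 1)
  if (!is.null(delta_degree)) k <- k + delta_degree + 1
  k
}

# Standard definitions of the information criteria
aic <- function(loglik, k) 2 * k - 2 * loglik
bic <- function(loglik, k, n) k * log(n) - 2 * loglik

k_default <- count_params(3, 2, 2, 1)      # default degrees 3, 2, 2, 1
k_simple  <- count_params(2, 1, 1, NULL)   # smaller model with fixed delta

# Made-up log-likelihoods and sample size, for illustration only
n <- 1000
c(AIC_default = aic(-4200, k_default), AIC_simple = aic(-4215, k_simple))
c(BIC_default = bic(-4200, k_default, n), BIC_simple = bic(-4215, k_simple, n))
```

Because BIC penalizes each additional coefficient by log(n) rather than 2, it tends to favor the smaller model more strongly than AIC once the sample is moderately large.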

When adjusting the δ parameter, you can either keep it constant or model it as a polynomial function of age via the delta_degree parameter. If delta_degree is used, keep it low (1 or 2, default = 1) to avoid overfitting; in that case, the fixed default δ is not used. To explicitly keep delta constant, set ‘delta_degree = NULL’. The fixed delta should reflect population characteristics:

δ = 0.7-0.9: Homogeneous populations, selected samples
δ = 1.0: General population samples with normal-like variability
δ = 1.2-2.0: Heterogeneous populations with high individual differences

Modeling Example

We demonstrate Sinh-Arcsinh (shash) modeling using the PPVT-4 vocabulary development dataset. Please have a look at the beta-binomial vignette for a comparison with discrete models.

Basic Model Fitting

# Fit shash model with default settings
# The function automatically displays percentile plots
model.shash <- cnorm.shash(age = ppvt$age, score = ppvt$raw)



# Use print(model.shash), diagnostics(model.shash) or summary(model.shash)
# to retrieve information on the data fit.

The function uses reasonable default polynomial degrees and provides immediate visual feedback through percentile plots. The output includes fitted parameters, convergence information, and basic model statistics. The model diagnostics (summary, diagnostics) provide:

Model fit statistics (log-likelihood, AIC, BIC, R²)
Parameter estimates with standard errors and significance tests
Convergence diagnostics
Separate tables for location, scale, and skewness parameters

Pay particular attention to convergence (it should be successful; if it is not, inspect the model visually), to R² (correlation between fitted and manifest percentiles; values > 0.95 are desirable) and to the general fit statistics.
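
The R² criterion can be illustrated on simulated data. The sketch below is a base R illustration rather than the computation cNORM performs internally: it compares rank-based percentiles in a sample with percentiles implied by a shash CDF. The CDF is transcribed from the mathematical section, and the parameter values and the rank-based percentile definition are assumptions.

```r
# Sketch on simulated data; parameter values are made up and the CDF is
# transcribed from the mathematical section (not a cNORM function).
p_shash <- function(x, mu, sigma, epsilon, delta) {
  pnorm(sinh(delta * asinh((x - mu) / sigma) + epsilon))
}

set.seed(1)
n <- 500
pars <- list(mu = 100, sigma = 15, epsilon = 0.2, delta = 1.1)

# Simulate scores by transforming standard normal draws
x <- pars$mu + pars$sigma * sinh((asinh(rnorm(n)) - pars$epsilon) / pars$delta)

manifest <- 100 * (rank(x) - 0.5) / n                       # percentiles observed in the sample
fitted   <- 100 * do.call(p_shash, c(list(x = x), pars))    # percentiles implied by the model

r2 <- cor(manifest, fitted)^2
r2
```

Here the model-implied percentiles use the parameters that generated the data, so the value is close to 1; a clearly lower value in a real analysis would point to a mismatch between the fitted distribution and the observed scores.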

Custom Model Specifications

For datasets with complex patterns, adjust polynomial degrees and distributional parameters:

# Conservative parameterization with fixed delta across age
model.simple <- cnorm.shash(age = ppvt$age, score = ppvt$raw,
                           mu_degree = 2,        # Quadratic location pattern
                           sigma_degree = 1,     # Linear variability change  
                           epsilon_degree = 1,   # Linear skewness change
                           delta_degree = NULL,  # deactivates polynomial fitting for delta
                           delta = 1.1)          # Slightly heavy tails,
                                                 # kept constant across age

# Example with more complex parameterization
model.complex <- cnorm.shash(age = ppvt$age, score = ppvt$raw,
                           mu_degree = 4,        # Quartic location pattern
                           sigma_degree = 3,     # Complex variability changes  
                           epsilon_degree = 2,   # Quadratic age-varying skewness
                           delta_degree = 2)     # Changing tail weights across age (quadratic)

# Compare models
compare(model.simple, model.complex, age = ppvt$age, score = ppvt$raw,
        title = "ShaSh Model Comparison")

Note: Higher polynomial degrees increase model flexibility but may lead to overfitting with insufficient data. Always validate complex models through visual inspection and, preferably, cross-validation.

Post-Stratification and Weighting

As with beta-binomial and distribution-free models, cNORM supports post-stratification for shash models. Weights are computed via iterative post-stratification to approximate representativeness:

# Calculate post-stratification weights
margins <- data.frame(variables = c("sex", "sex", "migration", "migration"),
                     levels = c(1, 2, 0, 1),
                     share = c(.52, .48, .7, .3))

weights <- computeWeights(ppvt, margins)

# Fit weighted ShaSh model
model.weighted <- cnorm.shash(ppvt$age, ppvt$raw, weights = weights)

# Compare weighted vs. unweighted
compare(model.shash, model.weighted, age = ppvt$age, score = ppvt$raw,
        title = "Unweighted vs. Weighted ShaSh Models")
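
The principle behind iterative post-stratification can be sketched in base R as a generic raking loop (iterative proportional fitting). This is an illustration on simulated data and not the code behind computeWeights(); the data frame `d` is made up, and only the variable names and target shares mirror the example above.

```r
# Generic raking sketch on simulated data (not the computeWeights() code).
set.seed(7)
d <- data.frame(sex = sample(1:2, 400, replace = TRUE, prob = c(.6, .4)),
                migration = sample(0:1, 400, replace = TRUE, prob = c(.8, .2)))
targets <- list(sex = c(`1` = .52, `2` = .48),
                migration = c(`0` = .70, `1` = .30))

w <- rep(1, nrow(d))
for (iter in 1:100) {
  w_old <- w
  for (v in names(targets)) {
    lev <- as.character(d[[v]])
    # current weighted share of each level of variable v
    share <- sapply(names(targets[[v]]), function(l) sum(w[lev == l])) / sum(w)
    # rescale the weights so that the weighted shares match the targets
    w <- w * as.numeric((targets[[v]] / share)[lev])
  }
  if (max(abs(w - w_old)) < 1e-10) break   # stop once the weights are stable
}
w <- w / mean(w)                           # normalize to a mean weight of 1

round(tapply(w, d$sex, sum) / sum(w), 3)
round(tapply(w, d$migration, sum) / sum(w), 3)
```

Each pass adjusts one margin at a time, which slightly disturbs the other margin; repeating the passes until the weights stop changing brings all margins in line with the target shares simultaneously.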

Norm Score Generation

Norm scores can be calculated for individual cases (or vectors of cases) or as comprehensive norm tables for specified ages.

# Individual Norm Score Prediction:
# Generate norm scores for specific age-score combinations
ages <- c(10.25, 10.75, 11.25, 11.75)
raw_scores <- c(180, 185, 190, 195)

norm_scores <- predict(model.shash, ages, raw_scores)
prediction_table <- data.frame(
  Age = ages, 
  Raw_Score = raw_scores, 
  Norm_Score = round(norm_scores, 1)
)
print(prediction_table)


# Norm Score tables:
# Generate detailed norm tables for multiple ages
tables <- normTable.shash(model.shash, 
                         ages = c(10.25, 10.75), 
                         start = 150, 
                         end = 220,
                         step = 1,
                         CI = 0.95, 
                         reliability = 0.94)

# Display head from first table
head(tables[[1]], 10)

The norm tables provide:

x: Raw scores
Px: Probability density values
Pcum: Cumulative probabilities
Percentile: Percentile ranks (0-100)
z: Standardized z-scores
norm: Norm scores in specified scale
Confidence intervals: When reliability is specified
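
How the columns relate to each other can be sketched in base R. The example starts from made-up cumulative probabilities and derives percentiles, z-scores and T scores. The confidence interval uses a common textbook formula with regression to the mean, which is an assumption made for illustration and not necessarily the formula used by normTable.shash().

```r
# Sketch only: the T scale constants and the confidence-interval formula are
# textbook conventions assumed here, not taken from normTable.shash().
pcum <- c(0.02, 0.16, 0.50, 0.84, 0.98)   # made-up cumulative probabilities
reliability <- 0.94
ci_level <- 0.95

percentile <- 100 * pcum                  # percentile ranks
z <- qnorm(pcum)                          # standardized z-scores
t_score <- 50 + 10 * z                    # T scale (M = 50, SD = 10)

# Assumed interval: true-score estimate with regression to the mean
est  <- 50 + reliability * (t_score - 50)
se   <- 10 * sqrt(reliability * (1 - reliability))
crit <- qnorm(1 - (1 - ci_level) / 2)

data.frame(percentile, z = round(z, 2), norm = round(t_score, 1),
           lowerCI = round(est - crit * se, 1),
           upperCI = round(est + crit * se, 1))
```

Under this assumed formula the interval is centered on the regressed true-score estimate rather than on the observed norm score, so it is shifted slightly toward the scale mean for extreme scores.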

Model Comparison and Selection

Comparing Distributional Approaches

# Compare shash with beta-binomial (BB) models. BB models should fit worse,
# since the test has stop rules, leading to non-binomial distributions.
model.bb <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228, plot = FALSE)

# Model comparisons
compare(model.shash, model.bb, age = ppvt$group, score = ppvt$raw,
        title = "SinH-ArcSinH vs. Beta-Binomial")
#> 
#> Model Comparison Summary:
#> ------------------------
#>  Metric     Model1     Model2 Difference
#>      R2     0.9831     0.9578    -0.0253
#>    Bias     0.0416     0.1190     0.0774
#>    RMSE     1.2987     2.0583     0.7596
#>     MAD     0.9968     1.5524     0.5556
#>     AIC 39275.1768 39856.6454   581.4686
#>     BIC 39352.2303 39908.0144   555.7841
#> 
#> Note: Difference = Model2 - Model1
#>       Fit indices are based on the manifest and fitted norm scores of both models.
#>       Scale metrics are T scores (scaleSD = 10)
#>       AIC and BIC should only be used when comparing models of the same type.



# Compare distribution free Taylor model
model.taylor <- cnorm(group = ppvt$group, raw = ppvt$raw, plot=FALSE)

# Model comparisons shash versus taylor
compare(model.shash, model.taylor, age = ppvt$group, score = ppvt$raw,
        title = "SinH-ArcSinH vs. Taylor")
#> Retrieving norm scores, please stand by ...
#> 
#> Model Comparison Summary:
#> ------------------------
#>  Metric     Model1      Model2  Difference
#>      R2     0.9831      0.9847      0.0016
#>    Bias     0.0416      0.0376     -0.0041
#>    RMSE     1.2987      1.2411     -0.0576
#>     MAD     0.9968      0.9633     -0.0335
#>     AIC 39275.1768  24320.8609 -14954.3159
#>     BIC 39352.2303 -21528.1994 -60880.4297
#> 
#> Note: Difference = Model2 - Model1
#>       Fit indices are based on the manifest and fitted norm scores of both models.
#>       Scale metrics are T scores (scaleSD = 10)
#>       AIC and BIC should only be used when comparing models of the same type.

Decision Framework

The different modeling approaches have advantages for specific use cases: the distribution-free approach (Taylor polynomials) is the most flexible, the beta-binomial approach is optimal for discrete item counts (e.g. 1PL IRT models), and shash is ideal for continuous scores with complex distributional shapes.

Use shash models for

continuous scores with decimal precision
complex skewness patterns that change across age
floor or ceiling effects requiring flexible shape modeling
data containing zero or negative scores

Use Beta-Binomial models for

discrete item counts from binary scoring
fixed maximum score (number of test items)
unspeeded tests
1PL IRT-based psychometric instruments
small to moderate number of possible scores

Use distribution-free (Taylor Polynomial) models when

maximum flexibility is paramount
no strong distributional assumptions can be made
quick implementation is needed and manual adjustment is desired

When selecting a model, compare the candidates using:

Information criteria: AIC, BIC (lower is better; only comparable between models of the same type)
Fit statistics: R², RMSE, bias
Visual inspection: Smoothness, realism of percentile curves
Cross-validation: Out-of-sample prediction accuracy
Theoretical appropriateness: Match between model assumptions and data characteristics

Modelling Norms with the Sinh-Arcsinh (shash) Distribution

Wolfgang Lenhard & Alexandra Lenhard

2025-10-14