Aggregate Correspondence Table

Overview

This vignette illustrates how to aggregate numeric values across classification systems using the aggregateCorrespondenceTable() function from the correspondenceTables package. The function aggregates numeric values expressed in a source classification (A) into a target classification (B), using a correspondence table that links A to B (denoted A → B). When correspondence weights are available, values are redistributed proportionally according to these weights. If no weights are provided, values are distributed equally across all corresponding target codes.
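The allocation rule described above can be written compactly. The notation is introduced here for illustration only and is not taken from the package: \(v_a\) is the value of source code \(a\), \(w_{ab}\) is the weight of the link from \(a\) to target code \(b\), \(n_a\) is the number of target codes linked to \(a\), and \(V_b\) is the aggregated value of \(b\).

\[
V_b = \sum_{a \,:\, a \to b} w_{ab}\, v_a ,
\qquad
w_{ab} = \frac{1}{n_a} \quad \text{when no weights are supplied.}
\]

Under this reading, if the weights attached to each source code sum to 1, the grand total is unchanged by the conversion: \(\sum_b V_b = \sum_a v_a\).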

This type of aggregation is commonly used to convert statistics between classification systems, for example:

  • NACE → CPA
  • CPA → CN
  • PRODCOM → CPA
  • CPC → HS

library(correspondenceTables)

Inputs

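The aggregateCorrespondenceTable() function expects the following inputs, as used in the examples below:

  • AB: the correspondence table linking source codes to target codes (columns from_code and to_code in the examples, plus an optional weight column).
  • A: the dataset of numeric values expressed in the source classification.
  • B: the target classification, which defines the domain of target codes.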

Application of aggregateCorrespondenceTable()

Example 1: Basic aggregation using a correspondence table

In this example, all inputs are read from sample datasets included in the package.

AB_path <- system.file("extdata/test", "ab_data.csv", package = "correspondenceTables")
A_path  <- system.file("extdata/test", "a_data.csv",  package = "correspondenceTables")
B_path  <- system.file("extdata/test", "b_data.csv",  package = "correspondenceTables")

stopifnot(nzchar(AB_path), nzchar(A_path), nzchar(B_path))

AB <- utils::read.csv(AB_path, stringsAsFactors = FALSE)
A  <- utils::read.csv(A_path,  stringsAsFactors = FALSE)
B  <- utils::read.csv(B_path,  stringsAsFactors = FALSE)

For clarity and consistency, the correspondence table columns are renamed to the expected identifiers, from_code and to_code:

names(AB)[names(AB) == "NACE.Rev..2.Code"]   <- "from_code"
names(AB)[names(AB) == "NACE.Rev..2.1.Code"] <- "to_code"


res <- aggregateCorrespondenceTable(AB = AB, A = A, B = B)


knitr::kable(
  head(res$result),
  caption = "Aggregation using a correspondence table",
  align = "c"
)
Aggregation using a correspondence table

code_B   Level   Superior   value
  12       2        C         2.0
  31       2        C        16.0
  36       2        E         2.0
  37       2        E         2.0
  39       2        E         2.0
  41       2        F         1.5

The function returns a list. The aggregated values are stored in the result element, which is a data frame structured according to the target classification B.

Interpretation of the output

In this example:

  • Dataset A contains numeric values expressed in the source classification.
  • The correspondence table AB specifies how each source code is linked to one or more target codes.
  • No weights are supplied in the correspondence table.

For each source code in A:

  • If it maps to a single target code, its full value is assigned to that target code.
  • If it maps to multiple target codes, its value is split equally among them.

All allocated contributions are then summed for each target code. The column containing numeric values in the output therefore represents the total value aggregated to each target classification code in B.
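The equal-split rule described above can be sketched by hand in base R. This is a minimal illustration on hypothetical toy data (AB_toy, A_toy and their values are invented here), not the package's internal implementation.

```r
# Hypothetical toy data; column names mirror the renamed AB table above.
AB_toy <- data.frame(
  from_code = c("A1", "A1", "A2"),
  to_code   = c("B1", "B2", "B2"),
  stringsAsFactors = FALSE
)
A_toy <- data.frame(
  code  = c("A1", "A2"),
  value = c(10, 4),
  stringsAsFactors = FALSE
)

# Number of target codes linked to each source code
n_targets <- table(AB_toy$from_code)

# Equal share per link: 1 / (number of targets of that source code)
AB_toy$share <- 1 / as.numeric(n_targets[AB_toy$from_code])

# Attach the source value to each link and compute its contribution
links <- merge(AB_toy, A_toy, by.x = "from_code", by.y = "code")
links$contribution <- links$value * links$share

# Sum the contributions per target code
equal_split <- aggregate(contribution ~ to_code, data = links, FUN = sum)
equal_split
#   to_code contribution
# 1      B1            5
# 2      B2            9
```

Here A1 (value 10) is split into 5 for B1 and 5 for B2, and A2 (value 4) goes entirely to B2, giving 5 and 9.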

Notes

  • The aggregation performed by aggregateCorrespondenceTable() is additive: values are redistributed and summed, not averaged or otherwise summarized.
  • Supplying the B argument ensures that the output covers the full target classification domain; target codes with no matching contributions receive a value of zero.
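The effect of covering the full target domain can be mimicked in base R with a left join. The sketch below uses hypothetical data (agg, B_toy) and only illustrates the idea of zero-filling unmatched target codes; it does not show how the package does this internally.

```r
# Hypothetical aggregated result covering only some target codes
agg <- data.frame(
  code  = c("B1", "B2"),
  value = c(5, 9),
  stringsAsFactors = FALSE
)

# Hypothetical full target domain, including a code with no contributions
B_toy <- data.frame(
  code = c("B1", "B2", "B3"),
  stringsAsFactors = FALSE
)

# A left join on the target domain keeps every code in B_toy
full <- merge(B_toy, agg, by = "code", all.x = TRUE)

# Codes without contributions come back as NA; set them to zero
full$value[is.na(full$value)] <- 0

# Additivity check: zero-filling does not change the total
stopifnot(sum(full$value) == sum(agg$value))
full
```

B3 appears in the output with a value of 0, while the overall total stays at 14.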

Example 2: Weighted correspondence (proportional allocation)

This example illustrates aggregation when the correspondence table includes explicit weights.

Here:

  • Source code A1 is linked to two target codes:
    • 70% of its value goes to B1
    • 30% goes to B2
  • Source code A2 is linked entirely to B2

The function multiplies each source value by the corresponding weight for each correspondence link and then sums all weighted contributions per target code.

# Correspondence table with weights  
AB <- data.frame(
  from_code = c("A1", "A1", "A2"),
  to_code   = c("B1", "B2", "B2"),
  weight    = c(0.7, 0.3, 1.0)
)

# Source classification with values  
A <- data.frame(
  code  = c("A1", "A2"), 
  value = c(100, 50)
)

# Target classification domain
B <- data.frame(
  code = c("B1", "B2")
)

res2 <- aggregateCorrespondenceTable(AB = AB, A = A, B = B)

knitr::kable(
  head(res2$result),
  caption = "Weighted correspondence (proportional allocation)",
  align = "c"
)
Weighted correspondence (proportional allocation)

code_B   value
  B1       70
  B2       80

Interpretation of the output

The values shown in the output represent the total weighted sums per target code.

For example:

  • Target code B1 receives 70% of the value associated with A1
  • Target code B2 receives:
    • 30% of A1
    • 100% of A2

All contributions are summed to produce the final totals.
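These totals can be cross-checked by hand in base R. The sketch below rebuilds the Example 2 inputs under new names (AB_w, A_w, chosen here to avoid overwriting the objects above) and applies the multiply-then-sum logic directly; it is an independent check, not the package's own code.

```r
# Same inputs as Example 2, rebuilt under illustrative names
AB_w <- data.frame(
  from_code = c("A1", "A1", "A2"),
  to_code   = c("B1", "B2", "B2"),
  weight    = c(0.7, 0.3, 1.0),
  stringsAsFactors = FALSE
)
A_w <- data.frame(
  code  = c("A1", "A2"),
  value = c(100, 50),
  stringsAsFactors = FALSE
)

# In this example the weights of each source code sum to 1
w_sums <- tapply(AB_w$weight, AB_w$from_code, sum)
stopifnot(isTRUE(all.equal(as.numeric(w_sums), c(1, 1))))

# Weighted contribution of each correspondence link
links_w <- merge(AB_w, A_w, by.x = "from_code", by.y = "code")
links_w$contribution <- links_w$value * links_w$weight

# Sum the weighted contributions per target code
manual <- aggregate(contribution ~ to_code, data = links_w, FUN = sum)
manual
#   to_code contribution
# 1      B1           70
# 2      B2           80
```

Because the weights of each source code sum to 1 here, the target totals (70 + 80) equal the source total (100 + 50).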

Tiny numeric illustration

If A1 has a value of 100:

  • 70 is allocated to B1
  • 30 is allocated to B2

If A2 has a value of 50 and maps fully to B2, the final value for B2 is:

\(30 + 50 = 80\)