hR: Toolkit for Data Analytics in Human Resources

Dale Kube

2019-04-28

Transform and analyze workforce data in meaningful ways for human resources (HR) analytics. Two functions, ‘hierarchyLong’ and ‘hierarchyWide’, convert standard employee and supervisor relationship data into long and wide formats that make it easy to aggregate by line of leadership. A ‘workforcePlan’ app is available for simple workforce planning.

Install the package from CRAN by running the install.packages("hR") command.

hierarchyLong

The hierarchyLong function transforms a standard set of unique employee and supervisor identifiers (employee IDs, email addresses, etc.) into a long format that can be used to aggregate employee data by a particular line of leadership (e.g. everyone who rolls up to Susan). The function returns a long data frame with one row per employee for every supervisor above them, up to the top of the tree (typically the CEO of your organization). The Level column gives the number of reporting steps between the employee and that supervisor, starting at “1” for the employee’s direct supervisor.

library(hR)

# Parallel vectors: supv[i] is the direct supervisor of ee[i]
ee <- c("Dale@hR.com","Bob@hR.com","Julie@hR.com","Andrea@hR.com")
supv <- c("Julie@hR.com","Julie@hR.com","Andrea@hR.com","Susan@hR.com")
df <- hierarchyLong(ee,supv)
print(df)
#>        Employee Level    Supervisor
#> 1 Andrea@hR.com     1  Susan@hR.com
#> 2    Bob@hR.com     1  Julie@hR.com
#> 3    Bob@hR.com     2 Andrea@hR.com
#> 4    Bob@hR.com     3  Susan@hR.com
#> 5   Dale@hR.com     1  Julie@hR.com
#> 6   Dale@hR.com     2 Andrea@hR.com
#> 7   Dale@hR.com     3  Susan@hR.com
#> 8  Julie@hR.com     1 Andrea@hR.com
#> 9  Julie@hR.com     2  Susan@hR.com

# How many employees report up through Susan?
nrow(df[df$Supervisor=="Susan@hR.com",])
#> [1] 4

# Who reports up through Susan?
df[df$Supervisor=="Susan@hR.com",]
#>        Employee Level   Supervisor
#> 1 Andrea@hR.com     1 Susan@hR.com
#> 4    Bob@hR.com     3 Susan@hR.com
#> 7   Dale@hR.com     3 Susan@hR.com
#> 9  Julie@hR.com     2 Susan@hR.com
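
To make the long format concrete, the following is a minimal base R sketch of how such a table could be built by walking up the supervisor chain, and how it might then be used to roll up an employee attribute by leader. The helper walk_up and the Salary values are hypothetical and exist only for illustration; this is not the package's implementation of hierarchyLong.

```r
# Same example identifiers as above
ee   <- c("Dale@hR.com", "Bob@hR.com", "Julie@hR.com", "Andrea@hR.com")
supv <- c("Julie@hR.com", "Julie@hR.com", "Andrea@hR.com", "Susan@hR.com")

# Hypothetical helper (not part of hR): for each employee, follow the
# supervisor chain upward and record one row per step.
walk_up <- function(ee, supv) {
  boss <- setNames(supv, ee)  # lookup: employee -> direct supervisor
  rows <- list()
  for (e in ee) {
    level <- 1
    s <- boss[[e]]
    # Stop at the top of the tree; the level cap guards against cycles
    while (!is.na(s) && level <= length(ee)) {
      rows[[length(rows) + 1]] <- data.frame(
        Employee = e, Level = level, Supervisor = s,
        stringsAsFactors = FALSE
      )
      # Move one step up, or stop if this supervisor has no supervisor listed
      s <- if (s %in% names(boss)) boss[[s]] else NA_character_
      level <- level + 1
    }
  }
  out <- do.call(rbind, rows)
  out <- out[order(out$Employee, out$Level), ]
  rownames(out) <- NULL
  out
}

long <- walk_up(ee, supv)

# Hypothetical attribute to aggregate (illustrative values only)
pay <- data.frame(Employee = ee,
                  Salary = c(50000, 60000, 80000, 100000),
                  stringsAsFactors = FALSE)
joined <- merge(long, pay, by = "Employee")

# Headcount and total salary for everyone who rolls up to each leader
headcount <- tapply(joined$Salary, joined$Supervisor, length)
payroll   <- tapply(joined$Salary, joined$Supervisor, sum)
print(headcount)
print(payroll)
```

Because every employee appears once per supervisor above them, a single group-by on the Supervisor column covers both direct and indirect reports. Filtering to Level 1 first would restrict the same aggregation to direct reports only.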

hierarchyWide

The hierarchyWide function transforms a standard set of unique employee and supervisor identifiers (employee IDs, email addresses, etc.) into a wide format that can be used to aggregate employee data by a particular line of leadership (e.g. everyone who rolls up to Susan). The function returns a wide data frame with one row per employee and one column for every level in the hierarchy, starting from the top of the tree (“Supv1” is likely the CEO of your organization). Employees closer to the top have fewer supervisors above them, so their remaining columns are NA.

df <- hierarchyWide(ee,supv)
print(df)
#>        Employee        Supv1         Supv2        Supv3
#> 1   Dale@hR.com Susan@hR.com Andrea@hR.com Julie@hR.com
#> 2    Bob@hR.com Susan@hR.com Andrea@hR.com Julie@hR.com
#> 3  Julie@hR.com Susan@hR.com Andrea@hR.com         <NA>
#> 4 Andrea@hR.com Susan@hR.com          <NA>         <NA>

# How many employees report up through Susan?
sum(df$Supv1=="Susan@hR.com",na.rm=TRUE)
#> [1] 4

# Who reports up through Susan?
df[which(df$Supv1=="Susan@hR.com"),]
#>        Employee        Supv1         Supv2        Supv3
#> 1   Dale@hR.com Susan@hR.com Andrea@hR.com Julie@hR.com
#> 2    Bob@hR.com Susan@hR.com Andrea@hR.com Julie@hR.com
#> 3  Julie@hR.com Susan@hR.com Andrea@hR.com         <NA>
#> 4 Andrea@hR.com Susan@hR.com          <NA>         <NA>
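
A leader below the top of the tree can sit in a different column for different organizations, so it can help to search every Supv column at once. The sketch below rebuilds the wide data frame by hand from the printed output above, so that it runs without the package. The helper reports_to is hypothetical and not part of hR.

```r
# Wide-format hierarchy, copied from the printed output above
wide <- data.frame(
  Employee = c("Dale@hR.com", "Bob@hR.com", "Julie@hR.com", "Andrea@hR.com"),
  Supv1 = rep("Susan@hR.com", 4),
  Supv2 = c("Andrea@hR.com", "Andrea@hR.com", "Andrea@hR.com", NA),
  Supv3 = c("Julie@hR.com", "Julie@hR.com", NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of hR): TRUE for each employee who has
# the given leader anywhere in their chain of supervisors.
reports_to <- function(wide, leader) {
  supv_cols <- grep("^Supv", names(wide), value = TRUE)
  rowSums(wide[supv_cols] == leader, na.rm = TRUE) > 0
}

# Who reports up through Andrea?
under_andrea <- wide$Employee[reports_to(wide, "Andrea@hR.com")]
print(under_andrea)

# Number of supervisors above each employee (count of non-NA Supv columns)
supv_cols <- grep("^Supv", names(wide), value = TRUE)
depth <- rowSums(!is.na(wide[supv_cols]))

# Direct supervisor: the last non-NA Supv column in each row
direct <- unname(apply(wide[supv_cols], 1, function(x) tail(x[!is.na(x)], 1)))
print(data.frame(Employee = wide$Employee, Depth = depth, Direct = direct))
```

The wide format keeps one row per employee, which makes it convenient to attach as extra columns to an existing employee table. The long format is usually the simpler choice when grouping by every leader at once.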