Do you have a Norwegian data set with Standard Industrial Classification codes and want to find out what they mean? Or data with Norwegian municipality numbers but no names? Or perhaps you want to convert English standard occupation names into Nynorsk for a figure? These are tasks that the R package klassR can help with.
Statistics Norway’s KLASS is a central database of classifications and code lists. Its API makes these standards available from different computing environments, and klassR provides a simple interface for fetching and applying them in R.
For Statistics Norway employees, the package is installed on most of our platforms. Everyone else can install it from CRAN with
install.packages("klassR")
CRAN is R’s central repository for thousands of useful packages. More information on the requirements for klassR can be found on CRAN.
To use the functions in klassR, the package must be loaded each time a new R session is started. This can be done using
library(klassR)
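In scripts that may run on machines where the package is not yet installed, one option is to install it only when it is missing. The following is a minimal sketch using base R's requireNamespace; the helper name ensure_package is hypothetical and not part of klassR.

```r
# Hypothetical helper (not part of klassR): install a package only if it is
# missing, then report whether it is available.
ensure_package <- function(pkg, install = TRUE) {
  available <- requireNamespace(pkg, quietly = TRUE)
  if (!available && install) {
    install.packages(pkg)
    available <- requireNamespace(pkg, quietly = TRUE)
  }
  invisible(available)
}

# Usage (commented out so the sketch has no side effects):
# ensure_package("klassR")
# library(klassR)
```

Setting install = FALSE only checks for the package, which can be useful on platforms where installation is managed centrally.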
To fetch a classification from KLASS you need the unique classification number. This can be found in the URL of the KLASS website or you can search for it in R using one of the following functions.
The function ListKlass will fetch a list of all classifications. It returns the classification name (klass_name), number (klass_nr) and the classification family it belongs to (klass_family). The classification type (klass_type) is also shown, which indicates whether it is a classification or a code list.
ListKlass()
klass_name | klass_nr | klass_family | klass_type |
---|---|---|---|
Standard for yrkesklassifisering | 7 | 1 | Klassifikasjon |
Standard for skadeforsikring bransje | 155 | 2 | Klassifikasjon |
Standard for kjønn | 2 | 3 | Klassifikasjon |
Standard for gruppering av familier | 17 | 3 | Klassifikasjon |
Standard for sivilstand | 19 | 3 | Klassifikasjon |
Standard for gruppering av husholdninger | 37 | 3 | Klassifikasjon |
Code lists are classifications that are used for national and internal (Statistics Norway) publications. These can be included in the list using the codelists parameter:
ListKlass(codelists = TRUE)
klass_name | klass_nr | klass_family | klass_type |
---|---|---|---|
Standard for yrkesklassifisering | 7 | 1 | Klassifikasjon |
Kodeliste for yrkeskatalogen, basert på STYRK 98 | 145 | 1 | Kodeliste |
Kodeliste for arbeidstid (hel-/deltid) | 149 | 1 | Kodeliste |
Kodeliste for arbeidsmarkedsstatus | 161 | 1 | Kodeliste |
Kodeliste for arbeidsgiveravgiftstype | 162 | 1 | Kodeliste |
Kodeliste for delpopulasjon for lønn og sysselsetting | 163 | 1 | Kodeliste |
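Because the listing is an ordinary data frame, it can be narrowed down with base R. The sketch below rebuilds three rows of the output above by hand so that it runs offline; it assumes the columns are character vectors, which may differ from what ListKlass actually returns.

```r
# Rows copied from the ListKlass(codelists = TRUE) output above.
# In practice: klass_list <- ListKlass(codelists = TRUE)
# Assumption: all columns are character vectors.
klass_list <- data.frame(
  klass_name = c("Standard for yrkesklassifisering",
                 "Kodeliste for arbeidstid (hel-/deltid)",
                 "Kodeliste for arbeidsmarkedsstatus"),
  klass_nr = c("7", "149", "161"),
  klass_family = c("1", "1", "1"),
  klass_type = c("Klassifikasjon", "Kodeliste", "Kodeliste"),
  stringsAsFactors = FALSE
)

# Keep only the code lists
code_lists <- klass_list[klass_list$klass_type == "Kodeliste", ]

# Case-insensitive match on the name, here everything mentioning "arbeid"
arbeid <- klass_list[grepl("arbeid", klass_list$klass_name, ignore.case = TRUE), ]
arbeid$klass_nr
```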
You can also search for a classification by keyword using the SearchKlass function. The first parameter is the query to search for.
SearchKlass(query = "ARENA")
klass_name | klass_nr |
---|---|
Classification of type of building /cadastre | 31 |
Classification of land use and land cover | 118 |
Again, to include code lists in the search, this must be specified:
SearchKlass(query = "ARENA", codelists = TRUE)
klass_name | klass_nr |
---|---|
Kodeliste for ARENA as_ytelse | 394 |
Kodeliste for ARENA Tiltak | 386 |
Kodeliste for ARENA as_f (arbeidssøkerstatus_fingruppe ) | 396 |
Kodeliste for ARENA as_gr (arbeidssøkerstatus grovgruppe) | 395 |
Kodeliste for ARENA as_stat (arbeidssøkerstatus, aktivitet og ytelse) | 393 |
Classification of type of building /cadastre | 31 |
Classification of land use and land cover | 118 |
Sometimes a classification or code list will appear several times. This is because it is stored in several languages in the database.
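If repeated entries get in the way, they can be dropped by keeping the first row for each classification number. This is a minimal sketch on a hand-made data frame; the duplicated row is hypothetical and only illustrates the situation.

```r
# Hypothetical search result in which one code list is listed twice
hits <- data.frame(
  klass_name = c("Kodeliste for ARENA Tiltak",
                 "Kodeliste for ARENA Tiltak",
                 "Classification of land use and land cover"),
  klass_nr = c("386", "386", "118"),
  stringsAsFactors = FALSE
)

# duplicated() flags every repeat after the first occurrence of a number
unique_hits <- hits[!duplicated(hits$klass_nr), ]
unique_hits
```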
To fetch a complete classification, use the GetKlass function together with the unique identifier. For example, to fetch the Standard Industrial Classification (KLASS number 6) we run:
GetKlass(6)
code | parentCode | level | name |
---|---|---|---|
01 | A | 2 | Jordbruk og tjenester tilknyttet jordbruk, jakt og viltstell |
01.1 | 01 | 3 | Dyrking av ettårige vekster |
01.11 | 01.1 | 4 | Dyrking av korn (unntatt ris), belgvekster og oljeholdige vekster |
01.110 | 01.11 | 5 | Dyrking av korn (unntatt ris), belgvekster og oljeholdige vekster |
01.12 | 01.1 | 4 | Dyrking av ris |
01.120 | 01.12 | 5 | Dyrking av ris |
Classifications are often organised hierarchically. In the example above, the Standard Industrial Classification codes have different values for level. To fetch a specific level, use the output_level parameter. For example, to fetch only the top-level Standard Industrial Classification codes we use:
GetKlass(6, output_level = 1)
code | parentCode | level | name |
---|---|---|---|
A | NA | 1 | Jordbruk, skogbruk og fiske |
B | NA | 1 | Bergverksdrift og utvinning |
C | NA | 1 | Industri |
D | NA | 1 | Elektrisitets-, gass-, damp- og varmtvannsforsyning |
E | NA | 1 | Vannforsyning, avløps- og renovasjonsvirksomhet |
F | NA | 1 | Bygge- og anleggsvirksomhet |
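The parentCode column links each code to the level above it, so the hierarchy can be walked with a simple lookup. The sketch below uses a few rows copied from the outputs above; the helper ancestors is hypothetical and not part of klassR, and it assumes that top-level codes have NA as their parentCode, as in the output shown.

```r
# A few rows copied from the GetKlass(6) outputs above
sic <- data.frame(
  code = c("A", "01", "01.1", "01.12", "01.120"),
  parentCode = c(NA, "A", "01", "01.1", "01.12"),
  name = c("Jordbruk, skogbruk og fiske",
           "Jordbruk og tjenester tilknyttet jordbruk, jakt og viltstell",
           "Dyrking av ettårige vekster",
           "Dyrking av ris",
           "Dyrking av ris"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: follow parentCode links from a code up to the top level
ancestors <- function(code, klass_table) {
  path <- character(0)
  current <- code
  while (!is.na(current)) {
    path <- c(path, current)
    # match() gives the row of the current code; its parentCode is the next step
    current <- klass_table$parentCode[match(current, klass_table$code)]
  }
  path
}

ancestors("01.120", sic)
```

The result starts at the code itself and ends at the top-level section. A code that is not in the table is returned on its own.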
In the above examples we have seen that the names are returned in Norwegian (Bokmål). However, many of the classifications in KLASS are available in multiple languages. The output language can be specified as Bokmål (“nb”), Nynorsk (“nn”) or English (“en”) using the language parameter. Note: not all classifications are available in all three languages.
GetKlass(6, output_level = 1, language = "en")
code | parentCode | level | name |
---|---|---|---|
A | NA | 1 | Agriculture, forestry and fishing |
B | NA | 1 | Mining and quarrying |
C | NA | 1 | Manufacturing |
D | NA | 1 | Electricity, gas, steam and air conditioning supply |
E | NA | 1 | Water supply; sewerage, waste management and remediation activities |
F | NA | 1 | Construction |
If you have a data set and want to apply a classification to a variable, you can do so with ApplyKlass. This can be used, for example, to get the names for a variable that is stored as codes.
klassR includes a built-in test data set called klassdata. It contains fictitious persons with sex, education level, municipality numbers, industry classification for workplace and occupation.
data(klassdata)
head(klassdata)
ID | sex | education | kommune | kommune2 | nace5 | occupation |
---|---|---|---|---|---|---|
1 | 2 | 2799 | 0706 | 706 | 47710 | 5132 |
2 | 2 | 5620 | 1567 | 1567 | 86902 | NA |
3 | 1 | 4010 | 1903 | 1903 | 41200 | 4177 |
4 | 1 | 1799 | 1003 | 1003 | 84120 | 3114 |
5 | 2 | NA | 0806 | 806 | 87102 | 2411 |
6 | 1 | 5621 | 0301 | 301 | 88911 | 8141 |
We can use ApplyKlass to create a variable for the municipality names (classification number 131) for the persons based on the codes. We specify the vector of codes as the first parameter, followed by the unique classification number.
klassdata$kommune_names <- ApplyKlass(klassdata$kommune,
                                     klass = 131)
head(klassdata)
ID | sex | education | kommune | kommune2 | nace5 | occupation | kommune_names |
---|---|---|---|---|---|---|---|
1 | 2 | 2799 | 0706 | 706 | 47710 | 5132 | Sandefjord |
2 | 2 | 5620 | 1567 | 1567 | 86902 | NA | Rindal |
3 | 1 | 4010 | 1903 | 1903 | 41200 | 4177 | Harstad |
4 | 1 | 1799 | 1003 | 1003 | 84120 | 3114 | Farsund |
5 | 2 | NA | 0806 | 806 | 87102 | 2411 | Skien |
6 | 1 | 5621 | 0301 | 301 | 88911 | 8141 | Oslo |
Again, parameters including language and output_level can be specified.
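Conceptually, applying a classification is a lookup of each code in a code/name table. The following sketch shows that idea with base R's match; it is an illustration only and not how ApplyKlass is implemented. It also shows one way to restore the leading zeros that appear to be missing in a column like kommune2, assuming that column is numeric.

```r
# Code/name pairs copied from the output above
kommune_klass <- data.frame(
  code = c("0706", "1567", "1903", "0301"),
  name = c("Sandefjord", "Rindal", "Harstad", "Oslo"),
  stringsAsFactors = FALSE
)

codes <- c("0301", "0706", "9999", NA)

# match() returns the row of each code in the table, or NA when there is none,
# so unknown and missing codes end up with NA as their name
looked_up <- kommune_klass$name[match(codes, kommune_klass$code)]
looked_up

# Numeric codes lose their leading zero; pad them back to four digits
kommune2 <- c(706, 1567, 301)
padded <- sprintf("%04d", as.integer(kommune2))
padded
```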
Classifications often change over time. The KLASS database keeps track of this, and older versions of a classification can be fetched using the date parameter.
To fetch or apply a classification as it was at a specific point in time, set date to a date on which the wanted version was valid. The date should be given in the form “yyyy-mm-dd”, for example “2022-05-27” for 27 May 2022.
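Date strings in this form can be produced and checked with base R before they are passed on. A minimal sketch follows; the helper is_klass_date is hypothetical and not part of klassR.

```r
# Build a "yyyy-mm-dd" string from a Date object
date_string <- format(as.Date("2022-05-27"), "%Y-%m-%d")

# Hypothetical helper: TRUE for strings that have the yyyy-mm-dd shape
# and also denote a real calendar date
is_klass_date <- function(x) {
  shape_ok <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)
  parsed <- as.Date(x, format = "%Y-%m-%d")
  shape_ok & !is.na(parsed)
}

is_klass_date(c("2022-05-27", "27-05-2022", "2022-02-30"))
```

The same format() call works with Sys.Date() when today's date is needed as a string.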
There have been many changes to the regions in Norway (classification number 106) over the past few years. We can see this by fetching the classification at different dates:
GetKlass(106, date = "2019-01-01")
code | parentCode | level | name |
---|---|---|---|
1 | NA | 1 | Oslo og Akershus |
2 | NA | 1 | Hedmark og Oppland |
3 | NA | 1 | Sør-Østlandet |
4 | NA | 1 | Agder og Rogaland |
5 | NA | 1 | Vestlandet |
6 | NA | 1 | Trøndelag |
7 | NA | 1 | Nord-Norge |
9 | NA | 1 | Uoppgitt |
GetKlass(106, date = "2020-01-01")
code | parentCode | level | name |
---|---|---|---|
1 | NA | 1 | Oslo og Viken |
2 | NA | 1 | Innlandet |
3 | NA | 1 | Agder og Sør-Østlandet |
4 | NA | 1 | Vestlandet |
5 | NA | 1 | Trøndelag |
6 | NA | 1 | Nord-Norge |
9 | NA | 1 | Uoppgitt |
Sometimes it may be useful to fetch all codes over a period of time. We can do this by specifying two dates as a vector in the date parameter.
The following code fetches the Norwegian regional codes valid between 1 January 2019 and 1 January 2020. The output below has 14 rows, showing both the old and the new names.
GetKlass(106, date = c("2019-01-01", "2020-01-01"))
code | parentCode | level | name |
---|---|---|---|
1 | NA | 1 | Oslo og Akershus |
1 | NA | 1 | Oslo og Viken |
2 | NA | 1 | Innlandet |
2 | NA | 1 | Hedmark og Oppland |
3 | NA | 1 | Agder og Sør-Østlandet |
3 | NA | 1 | Sør-Østlandet |
4 | NA | 1 | Vestlandet |
4 | NA | 1 | Agder og Rogaland |
5 | NA | 1 | Vestlandet |
5 | NA | 1 | Trøndelag |
6 | NA | 1 | Trøndelag |
6 | NA | 1 | Nord-Norge |
7 | NA | 1 | Nord-Norge |
9 | NA | 1 | Uoppgitt |
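The combined table does not say which version a row belongs to. When that matters, one option is to fetch the two dates separately and compare them with base R. The sketch below rebuilds the two single-date outputs shown earlier by hand so that it runs offline; the comparison is by name only and is not a substitute for a proper correspondence table.

```r
# Copied from the GetKlass(106, date = "2019-01-01") output
regions_2019 <- data.frame(
  code = c("1", "2", "3", "4", "5", "6", "7", "9"),
  name = c("Oslo og Akershus", "Hedmark og Oppland", "Sør-Østlandet",
           "Agder og Rogaland", "Vestlandet", "Trøndelag", "Nord-Norge",
           "Uoppgitt"),
  stringsAsFactors = FALSE
)

# Copied from the GetKlass(106, date = "2020-01-01") output
regions_2020 <- data.frame(
  code = c("1", "2", "3", "4", "5", "6", "9"),
  name = c("Oslo og Viken", "Innlandet", "Agder og Sør-Østlandet",
           "Vestlandet", "Trøndelag", "Nord-Norge", "Uoppgitt"),
  stringsAsFactors = FALSE
)

# Names that exist in only one of the two versions
names_dropped <- setdiff(regions_2019$name, regions_2020$name)
names_added <- setdiff(regions_2020$name, regions_2019$name)

# Names present in both versions but under a different code
both <- merge(regions_2019, regions_2020, by = "name",
              suffixes = c("_2019", "_2020"))
recoded <- both[both$code_2019 != both$code_2020, ]
recoded
```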
To fetch only the changes in a time period rather than all codes, we can specify correspond = TRUE along with the time interval we are interested in.
GetKlass(106,
date = c("2020-01-01", "2019-01-01"),
correspond = TRUE)
sourceCode | sourceName | targetCode | targetName |
---|---|---|---|
NA | NA | 1 | Oslo og Akershus |
NA | NA | 3 | Sør-Østlandet |
NA | NA | 4 | Agder og Rogaland |
NA | NA | 5 | Vestlandet |
1 | Oslo og Viken | NA | NA |
2 | Innlandet | 2 | Hedmark og Oppland |
3 | Agder og Sør-Østlandet | NA | NA |
4 | Vestlandet | NA | NA |
5 | Trøndelag | 6 | Trøndelag |
6 | Nord-Norge | 7 | Nord-Norge |
The table returned shows the correspondence between codes and/or names across the time interval specified. In the output above, where the 2020 date was given first, sourceCode and sourceName hold the 2020 codes and names, while targetCode and targetName hold the 2019 ones. Notice that there is not a simple 1:1 correspondence between all of the regions. Here the municipality number would be needed to map the changes more accurately.
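A table like this can be split into rows that pair one code with another and rows that have no counterpart. The sketch below does this with base R on six rows copied from the output above; it assumes that missing values are real NA values rather than the text "NA".

```r
# Six rows copied from the change table above
changes <- data.frame(
  sourceCode = c(NA, NA, "1", "2", "5", "6"),
  sourceName = c(NA, NA, "Oslo og Viken", "Innlandet", "Trøndelag",
                 "Nord-Norge"),
  targetCode = c("1", "3", NA, "2", "6", "7"),
  targetName = c("Oslo og Akershus", "Sør-Østlandet", NA,
                 "Hedmark og Oppland", "Trøndelag", "Nord-Norge"),
  stringsAsFactors = FALSE
)

# Rows where both a source and a target code are present
has_pair <- !is.na(changes$sourceCode) & !is.na(changes$targetCode)
paired <- changes[has_pair, ]

# Rows without a counterpart need more detailed data to be mapped
unpaired <- changes[!has_pair, ]

# Paired rows where only the code changed and the name stayed the same
recoded_only <- paired[paired$sourceName == paired$targetName, ]
recoded_only
```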
In addition to small changes over time, some classifications change completely, and a correspondence table is then defined within the KLASS database. These tables can be fetched or applied using the GetKlass and ApplyKlass functions together with the correspond parameter, which should be set to the unique number of the classification to convert into.
To fetch a correspondence table between municipality codes (131) and greater regional codes (106) we can run:
GetKlass(131, correspond = 106)
sourceCode | sourceName | targetCode | targetName |
---|---|---|---|
0301 | Oslo | 1 | Oslo og Viken |
3001 | Halden | 1 | Oslo og Viken |
3002 | Moss | 1 | Oslo og Viken |
3003 | Sarpsborg | 1 | Oslo og Viken |
3004 | Fredrikstad | 1 | Oslo og Viken |
3005 | Drammen | 1 | Oslo og Viken |
We can apply this correspondence between municipality and region in our example data set using ApplyKlass.
klassdata$region <- ApplyKlass(klassdata$kommune,
                              klass = 131,
                              correspond = 106,
                              date = "2016-01-01")
klassdata
ID | sex | education | kommune | kommune2 | nace5 | occupation | kommune_names | region |
---|---|---|---|---|---|---|---|---|
1 | 2 | 2799 | 0706 | 706 | 47710 | 5132 | Sandefjord | Sør-Østlandet |
2 | 2 | 5620 | 1567 | 1567 | 86902 | NA | Rindal | Vestlandet |
3 | 1 | 4010 | 1903 | 1903 | 41200 | 4177 | Harstad | Nord-Norge |
4 | 1 | 1799 | 1003 | 1003 | 84120 | 3114 | Farsund | Agder og Rogaland |
5 | 2 | NA | 0806 | 806 | 87102 | 2411 | Skien | Sør-Østlandet |
6 | 1 | 5621 | 0301 | 301 | 88911 | 8141 | Oslo | Oslo og Akershus |
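Once the region names are attached, they can be used like any other grouping variable. As a small illustration, the sketch below counts persons per region with base R, using the six region values shown above typed in by hand so that it runs offline.

```r
# Region values copied from the six rows shown above
# In practice: region <- klassdata$region
region <- c("Sør-Østlandet", "Vestlandet", "Nord-Norge",
            "Agder og Rogaland", "Sør-Østlandet", "Oslo og Akershus")

# Count persons per region and sort with the largest group first
counts <- as.data.frame(table(region = region), stringsAsFactors = FALSE)
counts <- counts[order(-counts$Freq), ]
counts
```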