decode.Rd
Translate coded values into meaningful plain text (or the reverse).
    code(y, keyvalue, verbose = TRUE)

    decode(x, ...)

    # S3 method for data.frame
    decode(x, ...)

    # S3 method for default
    decode(x, keyvalue, extra_functions = NULL, exact = FALSE, ...)
Argument | Description |
---|---|
y | value to be coded (to be matched against the |
keyvalue | either a name (as character string) of a package internal keyvalue object, or a user-defined keyvalue object (see as.keyvalue). |
verbose | (only for |
x | object to decode. Either a key vector to be matched against the |
... | ignored |
extra_functions | a list of functions (or names of functions as a character vector) to be applied to the decoded data after decoding (see section "extra_functions" below). |
exact | Should |
For default S3 method: A vector of the same length as x but with all cells decoded (or coded) to plain text (or code) as character.
For S3 method for class 'data.frame': Data.frame x is returned, possibly with some extra columns (names ending in '_Beskrivning'), decoded from columns with names corresponding to attribute standard_var_names for keyvalue objects listed by list_keyvalues().
See the vignette for a longer introduction to the package: vignette("decoder")
If x is a data.frame, all column names of x are matched to attribute standard_var_names for all keyvalue objects in the package (see list_keyvalues()). If the column name is a standard name used for a coding, the corresponding keyvalue object is used to decode the column and to add an extra column to x with its original name with suffix _Beskrivning. This is done for all identified columns.
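The column matching described above can be sketched in plain R. This is only an illustration under stated assumptions, not the package's implementation: the keyvalue objects are replaced by hypothetical named vectors carrying a standard_var_names attribute, decode_columns() is a made-up helper, and the new column names follow the pattern visible in the example output (column name, keyvalue name, suffix).

```r
# Hypothetical stand-ins for keyvalue objects: named character vectors
# (key -> value) with a standard_var_names attribute.
kon <- c("1" = "Man", "2" = "Kvinna")
attr(kon, "standard_var_names") <- c("kon", "sex")

status <- c("0" = "No", "1" = "Yes")        # made-up keyvalue
attr(status, "standard_var_names") <- "status"

keyvalues <- list(kon = kon, status = status)

# Hypothetical helper: add one decoded column per matching column name
decode_columns <- function(x, keyvalues, suffix = "_beskrivning") {
  for (kv_name in names(keyvalues)) {
    kv <- keyvalues[[kv_name]]
    hits <- intersect(names(x), attr(kv, "standard_var_names"))
    for (col in hits) {
      new_col <- paste0(col, "_", kv_name, suffix)
      x[[new_col]] <- unname(kv[as.character(x[[col]])])
    }
  }
  x
}

d <- data.frame(kon = c(1, 2, 2), sex = c(2, 2, 1), age = c(34, 51, 67))
res <- decode_columns(d, keyvalues)
res
```

Columns whose names are not listed in any standard_var_names attribute (here age) are left untouched, and no decoded column is added for them.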
The relationship between the key and the value in a keyvalue object is either 1:1 or m:1. The mapping is straightforward for 1:1, but with m:1, different applications might require slightly different groupings of the keys. One solution is to have several versions of the keyvalue object. Another (which we prefer) is to use the same keyvalue object (if possible) but to call one or several extra functions to further process the result. These functions are either built-in package functions, which should be called by quoted names, or user-defined functions, which can be called by either quoted or unquoted names (if available in the current environment). Note that the order of the functions can matter, since they are called in turn (the output from the first function is passed as input to the second function, and so on).
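Why the order can matter is easiest to see with a small sketch. The code below does not call the package; the decoded values, both helper functions and apply_in_turn() are hypothetical and only mimic the idea of passing the output of one function on to the next.

```r
# Toy decoded values (hypothetical, not package data)
decoded <- c("Storgoteborg", "Kungalv", "Fyrbodal", "Kungalv")

# Two user-defined post-processing functions (hypothetical helpers)
kungalv_to_fyrbodal <- function(x) ifelse(x == "Kungalv", "Fyrbodal", x)
flag_kungalv <- function(x) ifelse(x == "Kungalv", "Kungalv (own area)", x)

# Call the functions in turn: the output of one is the input of the next
apply_in_turn <- function(x, funs) {
  for (f in funs) x <- f(x)
  x
}

merged_first  <- apply_in_turn(decoded, list(kungalv_to_fyrbodal, flag_kungalv))
flagged_first <- apply_in_turn(decoded, list(flag_kungalv, kungalv_to_fyrbodal))

# The two orderings give different results
identical(merged_first, flagged_first)
```

When the merge runs first there is no "Kungalv" left for the second function to flag; when the flag runs first, the merge no longer finds an exact match.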
Standard functions and how to use them:
To use with sjukvardsomrade:
kungalv2Fyrbodal
The default classification used in sjukvardsomrade is to make Kungalv a region of its own. Use this function if Kungalv should be included in Fyrbodal. See example section below.
kungalv2Storgoteborg
As kungalv2Fyrbodal, but classifies Kungalv as a part of Storgoteborg.
real_names
Give the area names with correct Swedish spelling (including spaces). This is not the default, for compatibility reasons and because names with spaces must be back-ticked when referred to.
To use with region:
short_region_names
Exclude the prefix 'Region' from the region names, giving 'Syd' instead of 'Region Syd', etc.
    #>  [1] "Man"    "Man"    "Kvinna" "Man"    "Man"    "Kvinna" "Man"    "Man"
    #>  [9] "Man"    "Kvinna" "Kvinna" "Man"    "Man"    "Man"    "Man"    "Kvinna"
    #> [17] "Man"    "Man"    "Kvinna" "Kvinna"
    #>  [1] "1" "1" "2" "1" "1" "2" "1" "1" "1" "2" "2" "1" "1" "1" "1" "2" "1" "1" "2"
    #> [20] "2"

    # Get a sample of Snomed-codes (in the real world we obviously avoid this step) ...
    snomed2 <- sample(decoder::snomed$key, 30, replace = TRUE)

    # ... then decode them:
    (snomed3 <- decode(snomed2, "snomed"))
    #>  [1] "meningiom, blandat"
    #>  [2] "Trofoblastisk tumör, malignt"
    #>  [3] "Basaliomatös cancer"
    #>  [4] "Retinoblastom i spontan regress"
    #>  [5] "Kronisk myeloisk leukemi"
    #>  [6] "Jättecellstumör grad III, malign (malignt osteoklastom)"
    #>  [7] "Lentigo maligna-melanoma (infiltrerande)"
    #>  [8] "Gangliogliom, anaplastiskt"
    #>  [9] "Urotelialt carcinom, papillärt, invasivt"
    #> [10] "Osteosarkom"
    #> [11] "Tekom malignt"
    #> [12] "Annan lymfatisk leukemi, Akut lymfoblastleukemi av Burkitt's typ"
    #> [13] "Sezary's syndrom"
    #> [14] "Misstänkt perifert T-cellslymfom UNS"
    #> [15] "Misstänkt medullär cancer"
    #> [16] "Gamma heavy chain disease, Franklins sjukdom"
    #> [17] "Retinoblastom i spontan regress"
    #> [18] "desmoplastisk infantilt astrocytom"
    #> [19] "Urotelialt carcinom, papillärt, invasivt"
    #> [20] "Arrhenoblastom UNS"
    #> [21] "Huvudcellsadenom"
    #> [22] "Arrenoblastom malignt"
    #> [23] "Germinalcellsdysplasi"
    #> [24] "Carcinoid misstänkt"
    #> [25] "Medullomyoblastom"
    #> [26] "Misstänkt kystadenocarcinom"
    #> [27] "Serös papillär/cystisk Borderlinetumör"
    #> [28] "Misstänkt jättecellscancer"
    #> [29] "CB/CC follicular and diffuse"
    #> [30] "Urotelialt carcinom, papillärt, invasivt"

    # Health care regions can be defined in more than one way
    # By default Kungalv defines a region of its own:
    set.seed(123456789)
    healtcare_areas_west <- sample(unlist(decoder::sjukvardsomrade), 100, replace = TRUE)
    (areas <- decode(healtcare_areas_west, "sjukvardsomrade"))
    #> Warning: transformed to match the keyvalue: Punctuations are removed.
    #> Only the first 4 characters are used.
    #> Warning: Some codes could not be translated (46 cells)
    #>  [1] NA               NA               "Sodra_Alvsborg" "Sodra_Alvsborg"
    #>  [5] "Fyrbodal"       NA               "Fyrbodal"       "Fyrbodal"
    #>  [9] "Skaraborg"      "Fyrbodal"       "Sodra_Alvsborg" "Fyrbodal"
    #> [13] "Fyrbodal"       "Skaraborg"      "Sodra_Alvsborg" "Fyrbodal"
    #> [17] NA               "Kungalv"        "Sodra_Alvsborg" NA
    #> [21] NA               "Fyrbodal"       "Skaraborg"      "Fyrbodal"
    #> [25] "Skaraborg"      NA               NA               "Skaraborg"
    #> [29] NA               "Kungalv"        "Fyrbodal"       NA
    #> [33] NA               NA               "Skaraborg"      "Skaraborg"
    #> [37] NA               NA               "Norra_Halland"  NA
    #> [41] NA               NA               NA               "Fyrbodal"
    #> [45] "Kungalv"        NA               "Fyrbodal"       NA
    #> [49] "Skaraborg"      "Sodra_Alvsborg" "Sodra_Alvsborg" "Skaraborg"
    #> [53] "Skaraborg"      "Fyrbodal"       NA               NA
    #> [57] "Fyrbodal"       NA               NA               "Fyrbodal"
    #> [61] "Sodra_Alvsborg" "Sodra_Alvsborg" "Sodra_Alvsborg" "Storgoteborg"
    #> [65] "Skaraborg"      "Skaraborg"      NA               NA
    #> [69] "Fyrbodal"       NA               "Storgoteborg"   "Sodra_Alvsborg"
    #> [73] "Sodra_Alvsborg" NA               "Fyrbodal"       NA
    #> [77] NA               "Fyrbodal"       NA               NA
    #> [81] NA               NA               NA               NA
    #> [85] "Norra_Halland"  NA               NA               NA
    #> [89] NA               NA               NA               "Norra_Halland"
    #> [93] NA               NA               "Sodra_Alvsborg" "Skaraborg"
    #> [97] NA               NA               "Fyrbodal"       "Fyrbodal"

    table(areas)
    #> areas
    #>       Fyrbodal        Kungalv  Norra_Halland      Skaraborg Sodra_Alvsborg
    #>             20              3              3             13             13
    #>   Storgoteborg
    #>              2

    # But if we want Kungalv to be a part of Storgoteborg
    # (which is common practice for example with lung cancer data):
    (areas2 <- decode(healtcare_areas_west, "sjukvardsomrade", "kungalv2Storgoteborg"))
    #> Warning: transformed to match the keyvalue: Punctuations are removed.
    #> Only the first 4 characters are used.
    #> Warning: Some codes could not be translated (46 cells)
    #>  [1] NA               NA               "Sodra_Alvsborg" "Sodra_Alvsborg"
    #>  [5] "Fyrbodal"       NA               "Fyrbodal"       "Fyrbodal"
    #>  [9] "Skaraborg"      "Fyrbodal"       "Sodra_Alvsborg" "Fyrbodal"
    #> [13] "Fyrbodal"       "Skaraborg"      "Sodra_Alvsborg" "Fyrbodal"
    #> [17] NA               "Storgoteborg"   "Sodra_Alvsborg" NA
    #> [21] NA               "Fyrbodal"       "Skaraborg"      "Fyrbodal"
    #> [25] "Skaraborg"      NA               NA               "Skaraborg"
    #> [29] NA               "Storgoteborg"   "Fyrbodal"       NA
    #> [33] NA               NA               "Skaraborg"      "Skaraborg"
    #> [37] NA               NA               "Norra_Halland"  NA
    #> [41] NA               NA               NA               "Fyrbodal"
    #> [45] "Storgoteborg"   NA               "Fyrbodal"       NA
    #> [49] "Skaraborg"      "Sodra_Alvsborg" "Sodra_Alvsborg" "Skaraborg"
    #> [53] "Skaraborg"      "Fyrbodal"       NA               NA
    #> [57] "Fyrbodal"       NA               NA               "Fyrbodal"
    #> [61] "Sodra_Alvsborg" "Sodra_Alvsborg" "Sodra_Alvsborg" "Storgoteborg"
    #> [65] "Skaraborg"      "Skaraborg"      NA               NA
    #> [69] "Fyrbodal"       NA               "Storgoteborg"   "Sodra_Alvsborg"
    #> [73] "Sodra_Alvsborg" NA               "Fyrbodal"       NA
    #> [77] NA               "Fyrbodal"       NA               NA
    #> [81] NA               NA               NA               NA
    #> [85] "Norra_Halland"  NA               NA               NA
    #> [89] NA               NA               NA               "Norra_Halland"
    #> [93] NA               NA               "Sodra_Alvsborg" "Skaraborg"
    #> [97] NA               NA               "Fyrbodal"       "Fyrbodal"

    table(areas2)
    #> areas2
    #>       Fyrbodal  Norra_Halland      Skaraborg Sodra_Alvsborg   Storgoteborg
    #>             20              3             13             13              5

    # We can also combine several extra_functions if we for example
    # also want the area names with correct Swedish spelling.
    (areas3 <- decode(healtcare_areas_west, "sjukvardsomrade",
                      c("kungalv2Storgoteborg", "real_names")))
    #> Warning: transformed to match the keyvalue: Punctuations are removed.
    #> Only the first 4 characters are used.
    #> Warning: Some codes could not be translated (46 cells)
    #>  [1] NA               NA               "Södra Älvsborg" "Södra Älvsborg"
    #>  [5] "Fyrbodal"       NA               "Fyrbodal"       "Fyrbodal"
    #>  [9] "Skaraborg"      "Fyrbodal"       "Södra Älvsborg" "Fyrbodal"
    #> [13] "Fyrbodal"       "Skaraborg"      "Södra Älvsborg" "Fyrbodal"
    #> [17] NA               "Storgöteborg"   "Södra Älvsborg" NA
    #> [21] NA               "Fyrbodal"       "Skaraborg"      "Fyrbodal"
    #> [25] "Skaraborg"      NA               NA               "Skaraborg"
    #> [29] NA               "Storgöteborg"   "Fyrbodal"       NA
    #> [33] NA               NA               "Skaraborg"      "Skaraborg"
    #> [37] NA               NA               "Norra Halland"  NA
    #> [41] NA               NA               NA               "Fyrbodal"
    #> [45] "Storgöteborg"   NA               "Fyrbodal"       NA
    #> [49] "Skaraborg"      "Södra Älvsborg" "Södra Älvsborg" "Skaraborg"
    #> [53] "Skaraborg"      "Fyrbodal"       NA               NA
    #> [57] "Fyrbodal"       NA               NA               "Fyrbodal"
    #> [61] "Södra Älvsborg" "Södra Älvsborg" "Södra Älvsborg" "Storgöteborg"
    #> [65] "Skaraborg"      "Skaraborg"      NA               NA
    #> [69] "Fyrbodal"       NA               "Storgöteborg"   "Södra Älvsborg"
    #> [73] "Södra Älvsborg" NA               "Fyrbodal"       NA
    #> [77] NA               "Fyrbodal"       NA               NA
    #> [81] NA               NA               NA               NA
    #> [85] "Norra Halland"  NA               NA               NA
    #> [89] NA               NA               NA               "Norra Halland"
    #> [93] NA               NA               "Södra Älvsborg" "Skaraborg"
    #> [97] NA               NA               "Fyrbodal"       "Fyrbodal"

    # The region names can be both with and without prefix:
    regs <- sample(6, 10, replace = TRUE)
    decode(regs, "region") # With prefix
    #>  [1] "Region Syd"            "Region Sthlm/Gotland"  "Region Uppsala/Örebro"
    #>  [4] "Region Sthlm/Gotland"  "Region Norr"           "Region Sthlm/Gotland"
    #>  [7] "Region Norr"           "Region Norr"           "Region Uppsala/Örebro"
    #> [10] "Region Norr"

    decode(regs, "region", "short_region_names") # without prefix
    #>  [1] "Syd"            "Sthlm/Gotland"  "Uppsala/Örebro" "Sthlm/Gotland"
    #>  [5] "Norr"           "Sthlm/Gotland"  "Norr"           "Norr"
    #>  [9] "Uppsala/Örebro" "Norr"

    # Note that only the first four digits of the LKF-code were used above.
    # What if we use the full LKF-code?
    lkfs <- sample(decoder::forsamling$key, 100, replace = TRUE)
    decode(lkfs, "sjukvardsomrade")
    #> Warning: transformed to match the keyvalue: Leading 0:s are ignored.
    #> Only the first 4 characters are used.
    #> Warning: Some codes could not be translated (80 cells)
    #>  [1] NA               NA               NA               "Skaraborg"
    #>  [5] NA               NA               NA               "Skaraborg"
    #>  [9] NA               NA               NA               NA
    #> [13] "Sodra_Alvsborg" "Skaraborg"      NA               NA
    #> [17] NA               NA               NA               NA
    #> [21] NA               NA               NA               NA
    #> [25] NA               NA               "Fyrbodal"       NA
    #> [29] NA               NA               "Skaraborg"      NA
    #> [33] NA               "Sodra_Alvsborg" NA               NA
    #> [37] "Storgoteborg"   "Fyrbodal"       NA               NA
    #> [41] NA               NA               "Sodra_Alvsborg" NA
    #> [45] "Kungalv"        "Skaraborg"      "Fyrbodal"       NA
    #> [49] NA               NA               NA               NA
    #> [53] NA               NA               NA               NA
    #> [57] NA               NA               NA               NA
    #> [61] NA               NA               NA               NA
    #> [65] NA               "Kungalv"        NA               NA
    #> [69] NA               NA               NA               NA
    #> [73] "Norra_Halland"  "Fyrbodal"       NA               "Storgoteborg"
    #> [77] NA               NA               "Sodra_Alvsborg" "Norra_Halland"
    #> [81] NA               NA               NA               NA
    #> [85] NA               NA               NA               NA
    #> [89] NA               NA               NA               NA
    #> [93] NA               "Skaraborg"      NA               NA
    #> [97] NA               NA               NA               NA

    # That works just as well when argument exact = FALSE (which it is by default).

    # decode can also be used for data.frames with recognised column names
    d <- data.frame(
      kon = sample(1:2, 10, replace = TRUE),
      sex = sample(1:2, 10, replace = TRUE),
      lkf = sample(decoder::hemort$key, 10, replace = TRUE)
    )
    decode(d)
    #> Warning: lkf -> lkf_kommun_beskrivning: transformed to match the keyvalue: Only the first 4 characters are used.
    #> Warning: lkf -> lkf_forsamling_beskrivning: Some codes could not be translated (6 cells)
    #> Warning: lkf -> lkf_hemort2_beskrivning: Some codes could not be translated (6 cells)
    #> Warning: lkf -> lkf_lan_beskrivning: transformed to match the keyvalue: Only the first 2 characters are used.
    #>    kon sex    lkf kon_kon_beskrivning lkf_kommun_beskrivning
    #> 1    2   2 084016              Kvinna             Mörbylånga
    #> 2    2   1 060204              Kvinna                   <NA>
    #> 3    2   1 151802              Kvinna                   <NA>
    #> 4    2   1 088016              Kvinna                 Kalmar
    #> 5    1   2 058321                 Man                 Motala
    #> 6    2   2 248210              Kvinna             Skellefteå
    #> 7    1   2 128802                 Man                   <NA>
    #> 8    2   1 098034              Kvinna                Gotland
    #> 9    1   2 149207                 Man                   Åmål
    #> 10   1   2 156622                 Man                   <NA>
    #>           lkf_forsamling_beskrivning           lkf_hemort2_beskrivning
    #> 1  Norra Möckleby, Sandby och Gårdby Norra Möckleby, Sandby och Gårdby
    #> 2                               <NA>                              <NA>
    #> 3                               <NA>                              <NA>
    #> 4                      Heliga Korset                     Heliga Korset
    #> 5                               <NA>                              <NA>
    #> 6                           Lövånger                          Lövånger
    #> 7                               <NA>                              <NA>
    #> 8                            Akebäck                           Akebäck
    #> 9                               <NA>                              <NA>
    #> 10                              <NA>                              <NA>
    #>     lkf_lan_beskrivning lkf_hemort_beskrivning sex_kon_beskrivning
    #> 1            Kalmar län         Norra möckleby              Kvinna
    #> 2        Jönköpings län                 Höreda                 Man
    #> 3                  <NA>            Sankt peder                 Man
    #> 4            Kalmar län          Heliga korset                 Man
    #> 5     Östergötlands län                    Hov              Kvinna
    #> 6     Västerbottens län               Lövånger              Kvinna
    #> 7             Skåne län              Falsterbo              Kvinna
    #> 8          Gotlands län                Akebäck                 Man
    #> 9  Västra Götalands län              Edsleskog              Kvinna
    #> 10                 <NA>                    Öra              Kvinna

    ### --- code --- ###

    # Sometimes we have keyvalue objects with some key-value pairs without a 1:1 relation.
    # This is true for snomed
    # Show all non 1:1 pairs:
    summary(decoder::snomed)$nonunique
    #>        key value
    #> 393  90703 Embryonalt carcinom
    #> 402  90723 Embryonalt carcinom
    #> 592  96153 Hårcellsleukemi
    #> 787  99403 Hårcellsleukemi
    #> 354  89643 Klarcellssarkom
    #> 381  90443 Klarcellssarkom
    #> 587  96073 Mantelcellslymfom
    #> 646  96733 Mantelcellslymfom
    #> 710  97423 Mastcellsleukemi
    #> 769  99003 Mastcellsleukemi
    #> 209  85103 Medullär cancer
    #> 215  85113 Medullär cancer
    #> 392  90701 Misstänkt embryonalt carcinom
    #> 396  90721 Misstänkt embryonalt carcinom
    #> 353  89641 Misstänkt klarcellssarkom
    #> 378  90441 Misstänkt klarcellssarkom
    #> 707  97333 Plasmacellsleukemi
    #> 741  98303 Plasmacellsleukemi
    #> 162  84001 Svettkörtelcancer in situ
    #> 163  84002 Svettkörtelcancer in situ
    #> 168  84101 Talgkörtelcancer in situ
    #> 171  84102 Talgkörtelcancer in situ
    #> 73  813021 Urotelialt carcinom, papillärt, icke invasivt
    #> 74  813022 Urotelialt carcinom, papillärt, icke invasivt
    #> 75  813031 Urotelialt carcinom, papillärt, invasivt
    #> 76  813032 Urotelialt carcinom, papillärt, invasivt
    #> 77  813033 Urotelialt carcinom, papillärt, invasivt
    #> 79  812031 Urotelialt carcinom, UNS, inklusive cytologi
    #> 80  812032 Urotelialt carcinom, UNS, inklusive cytologi
    #> 81  812033 Urotelialt carcinom, UNS, inklusive cytologi

    # Save them for later:
    non_unique_snomeds <- summary(decoder::snomed)$nonunique$key

    # Use these snomed codes for decoding and coding
    # Decoding works fine (all keys are unique) ...
    (a <- decode(non_unique_snomeds, "snomed"))
    #>  [1] "Embryonalt carcinom"
    #>  [2] "Embryonalt carcinom"
    #>  [3] "Hårcellsleukemi"
    #>  [4] "Hårcellsleukemi"
    #>  [5] "Klarcellssarkom"
    #>  [6] "Klarcellssarkom"
    #>  [7] "Mantelcellslymfom"
    #>  [8] "Mantelcellslymfom"
    #>  [9] "Mastcellsleukemi"
    #> [10] "Mastcellsleukemi"
    #> [11] "Medullär cancer"
    #> [12] "Medullär cancer"
    #> [13] "Misstänkt embryonalt carcinom"
    #> [14] "Misstänkt embryonalt carcinom"
    #> [15] "Misstänkt klarcellssarkom"
    #> [16] "Misstänkt klarcellssarkom"
    #> [17] "Plasmacellsleukemi"
    #> [18] "Plasmacellsleukemi"
    #> [19] "Svettkörtelcancer in situ"
    #> [20] "Svettkörtelcancer in situ"
    #> [21] "Talgkörtelcancer in situ"
    #> [22] "Talgkörtelcancer in situ"
    #> [23] "Urotelialt carcinom, papillärt, icke invasivt"
    #> [24] "Urotelialt carcinom, papillärt, icke invasivt"
    #> [25] "Urotelialt carcinom, papillärt, invasivt"
    #> [26] "Urotelialt carcinom, papillärt, invasivt"
    #> [27] "Urotelialt carcinom, papillärt, invasivt"
    #> [28] "Urotelialt carcinom, UNS, inklusive cytologi"
    #> [29] "Urotelialt carcinom, UNS, inklusive cytologi"
    #> [30] "Urotelialt carcinom, UNS, inklusive cytologi"

    # ... but coding these values back to their key does not
    if (FALSE) {
      code(a, "snomed")
    }
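The reason coding is not reversible for such values can be shown with a small self-contained sketch. It does not use the package: kv is a hypothetical lookup table (the first two rows are taken from the non-unique pairs listed above, the third row is made up), and decode_keys() and candidate_keys() are illustrative helpers, not package functions.

```r
# Hypothetical m:1 lookup table; the third row is invented for contrast
kv <- data.frame(
  key   = c("90703", "90723", "00000"),
  value = c("Embryonalt carcinom", "Embryonalt carcinom",
            "Hypothetical unique value"),
  stringsAsFactors = FALSE
)

# key -> value is well defined because every key occurs once
decode_keys <- function(keys) kv$value[match(keys, kv$key)]

# value -> key may give several candidates for one value
candidate_keys <- function(values) {
  stats::setNames(lapply(values, function(v) kv$key[kv$value == v]), values)
}

decoded <- decode_keys(c("90703", "90723"))
candidates <- candidate_keys(unique(decoded))

# Number of candidate keys per value; more than one means ambiguity
lengths(candidates)
```

Both keys decode to the same text, so going back from the text leaves two equally valid keys to choose from.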