Decode codes to plain text (and vice versa)

Translate coded values into meaningful plain text (or the reverse).

code(y, keyvalue, verbose = TRUE)

decode(x, ...)

# S3 method for data.frame
decode(x, ...)

# S3 method for default
decode(x, keyvalue, extra_functions = NULL, exact = FALSE, ...)

Arguments

y	value to be coded, matched against the `value` element of a keyvalue object.
keyvalue	either a name (as character string) of a package internal keyvalue object, or a user defined keyvalue object (see as.keyvalue).
verbose	(only for `code`) set to `FALSE` to suppress the message printed to the console if an error occurs (default `TRUE`).
x	object to decode. Either a key vector to be matched against the `key` column in `keyvalue`, or a data.frame (see section `decode.data.frame`).
...	ignored
extra_functions	a list of functions (or a character vector of function names) to be applied to the decoded data after decoding (see section "extra_functions" below).
exact	Should `x` have an exact match from the key? Default is `FALSE`. When `FALSE`, `x` might be transformed to fit the key (punctuation might be removed, upper case changed to lower case or vice versa, and strings that are too long might be truncated). (`code` only accepts exact matches.)
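
The kind of transformation described for `exact = FALSE` can be pictured with a small base R sketch. The helper `normalise_key` and its argument `key_length` are hypothetical names used only for illustration; this is not the package's implementation, and the rules actually applied depend on the keyvalue object.

```r
# Hypothetical helper (not part of decoder): mimic two of the documented
# transformations, removing punctuation and keeping only the leading characters.
normalise_key <- function(x, key_length) {
  x <- gsub("[[:punct:]]", "", as.character(x))  # drop punctuation
  substr(x, 1, key_length)                        # truncate overly long strings
}

# Both inputs collapse to the same four-character key
normalise_key(c("14-82", "148201"), key_length = 4)
```

With `exact = TRUE` no such normalisation would be attempted, so only inputs that already equal a key would be decoded.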

Value

For the default S3 method: A character vector of the same length as x, with all elements decoded to plain text (or coded to their keys).
For the S3 method for class 'data.frame': Data.frame x is returned, possibly with some extra columns (names ending in '_beskrivning'), decoded from columns whose names correspond to the attribute standard_var_names of the keyvalue objects listed by list_keyvalues().

Vignette

See the vignette for a longer introduction to the package: vignette("decoder")

decode.data.frame

If x is a data.frame, all column names of x are matched against the attribute standard_var_names of every keyvalue object in the package (see list_keyvalues()). If a column name is a standard name used for a coding, the corresponding keyvalue object is used to decode the column, and an extra column is added to x, named after the original column and ending in _beskrivning (for example kon_kon_beskrivning in the Examples section). This is done for all identified columns.
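
The column matching described above can be sketched in base R. Everything here is an assumption made for illustration: `keyvalues` is a hypothetical stand-in for the package's keyvalue objects, `decode_df_sketch` is not a package function, and the new column names simply follow the pattern seen in the example output (`<column>_<keyvalue>_beskrivning`).

```r
# Hypothetical stand-in for a keyvalue object and its standard_var_names
keyvalues <- list(
  kon = list(
    map = c("1" = "Man", "2" = "Kvinna"),
    standard_var_names = c("kon", "sex")
  )
)

decode_df_sketch <- function(d, keyvalues) {
  for (kv_name in names(keyvalues)) {
    kv <- keyvalues[[kv_name]]
    # Only columns whose names are standard names for this coding are decoded
    for (col in intersect(names(d), kv$standard_var_names)) {
      new_col <- paste(col, kv_name, "beskrivning", sep = "_")
      d[[new_col]] <- unname(kv$map[as.character(d[[col]])])
    }
  }
  d
}

d <- data.frame(kon = c(1, 2), sex = c(2, 1), age = c(40, 50))
out <- decode_df_sketch(d, keyvalues)
names(out)  # "age" is not a standard name, so it gets no decoded column
```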

extra_functions

The relationship between the key and the value in a keyvalue object is either 1:1 or m:1. The mapping is straightforward for 1:1, but with m:1, different applications might require slightly different groupings of the keys. One solution is to have several versions of the keyvalue object. Another (which we prefer) is to use the same keyvalue object (if possible) but to call one or several extra functions to further process the result.

These functions are either built-in package functions, which should be called by quoted names, or user-defined functions, which can be called by either quoted or unquoted names (if available in the current environment). Note that the order of the functions can matter, since they are called in turn (the output from the first function is passed as input to the second function, and so on).
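
A minimal base R sketch of this chaining follows. It is not the package's code: `apply_extra` and `merge_kungalv` are hypothetical names, and `toupper` is used only as a convenient second step to show that the order matters.

```r
# Hypothetical helper: apply each function in turn, feeding the output
# of one step into the next. match.fun() accepts functions or their names.
apply_extra <- function(x, extra_functions) {
  for (f in extra_functions) {
    x <- match.fun(f)(x)
  }
  x
}

# Hypothetical user-defined regrouping of an m:1 mapping
merge_kungalv <- function(x) ifelse(x == "Kungalv", "Storgoteborg", x)

decoded <- c("Kungalv", "Fyrbodal")

# Regroup first, then change case
first  <- apply_extra(decoded, list(merge_kungalv, toupper))
# Change case first: "KUNGALV" no longer matches, so nothing is regrouped
second <- apply_extra(decoded, list(toupper, merge_kungalv))
first
second
```

The two calls give different results, which is why the order of `extra_functions` should be chosen deliberately.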

Standard functions and how to use them:

To use with sjukvardsomrade:

kungalv2Fyrbodal

The default classification used in sjukvardsomrade is to make Kungalv a region of its own. Use this function if Kungalv should be included in Fyrbodal. See example section below.

kungalv2Storgoteborg

As kungalv2Fyrbodal but classifies Kungalv as a part of Storgoteborg.

real_names

Give the area names with correct Swedish spelling (including spaces). This is not the default, for compatibility reasons and because names with spaces must be back-ticked when referred to.
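
The back-tick requirement is a general property of R names rather than something specific to this package. A small base R illustration, with a hypothetical data.frame `cases` holding made-up counts:

```r
# check.names = FALSE keeps the space in the second column name
cases <- data.frame(Norra_Halland = 3, `Norra Halland` = 3, check.names = FALSE)

cases$Norra_Halland    # syntactic name: no quoting needed
cases$`Norra Halland`  # name with a space: must be back-ticked
```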

To use with region:

short_region_names

Exclude the prefix 'Region' from the region names, giving 'Syd' instead of 'Region Syd', etc.
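
The effect can be approximated in base R with a single regular expression. This is only a sketch of the described behaviour; `short_names_sketch` is a hypothetical name, not the package function.

```r
# Hypothetical equivalent: strip a leading "Region " from each name
short_names_sketch <- function(x) sub("^Region ", "", x)

short_names_sketch(c("Region Syd", "Region Norr"))
```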

Examples


KON_VALUE <- sample(1:2, 20, replace = TRUE)
(kon <- decode(KON_VALUE, decoder::kon))
#>  [1] "Man"    "Man"    "Kvinna" "Man"    "Man"    "Kvinna" "Man"    "Man"   
#>  [9] "Man"    "Kvinna" "Kvinna" "Man"    "Man"    "Man"    "Man"    "Kvinna"
#> [17] "Man"    "Man"    "Kvinna" "Kvinna"
code(kon, decoder::kon)
#>  [1] "1" "1" "2" "1" "1" "2" "1" "1" "1" "2" "2" "1" "1" "1" "1" "2" "1" "1" "2"
#> [20] "2"

# Get a sample of Snomed-codes (in the real world we obviously avoid this step) ...
snomed2 <- sample(decoder::snomed$key, 30, replace = TRUE)
# ... then decode them:
(snomed3 <- decode(snomed2, "snomed"))
#>  [1] "meningiom, blandat"                                              
#>  [2] "Trofoblastisk tumör, malignt"                                    
#>  [3] "Basaliomatös cancer"                                             
#>  [4] "Retinoblastom i spontan regress"                                 
#>  [5] "Kronisk myeloisk leukemi"                                        
#>  [6] "Jättecellstumör grad III, malign (malignt osteoklastom)"         
#>  [7] "Lentigo maligna-melanoma (infiltrerande)"                        
#>  [8] "Gangliogliom, anaplastiskt"                                      
#>  [9] "Urotelialt carcinom, papillärt, invasivt"                        
#> [10] "Osteosarkom"                                                     
#> [11] "Tekom malignt"                                                   
#> [12] "Annan lymfatisk leukemi, Akut lymfoblastleukemi av Burkitt's typ"
#> [13] "Sezary's syndrom"                                                
#> [14] "Misstänkt perifert T-cellslymfom UNS"                            
#> [15] "Misstänkt medullär cancer"                                       
#> [16] "Gamma heavy chain disease, Franklins sjukdom"                    
#> [17] "Retinoblastom i spontan regress"                                 
#> [18] "desmoplastisk infantilt astrocytom"                              
#> [19] "Urotelialt carcinom, papillärt, invasivt"                        
#> [20] "Arrhenoblastom UNS"                                              
#> [21] "Huvudcellsadenom"                                                
#> [22] "Arrenoblastom malignt"                                           
#> [23] "Germinalcellsdysplasi"                                           
#> [24] "Carcinoid misstänkt"                                             
#> [25] "Medullomyoblastom"                                               
#> [26] "Misstänkt kystadenocarcinom"                                     
#> [27] "Serös papillär/cystisk Borderlinetumör"                          
#> [28] "Misstänkt jättecellscancer"                                      
#> [29] "CB/CC follicular and diffuse"                                    
#> [30] "Urotelialt carcinom, papillärt, invasivt"                        


# Health care regions can be defined in more than one way
# By default Kungalv defines a region of its own:
set.seed(123456789)
healtcare_areas_west <- sample(unlist(decoder::sjukvardsomrade), 100, replace = TRUE)
(areas <- decode(healtcare_areas_west, "sjukvardsomrade"))
#> Warning: transformed to match the keyvalue:  Punctuations are removed. Only the first 4 characters are used.
#> Warning: Some codes could not be translated (46 cells)
#>   [1] NA               NA               "Sodra_Alvsborg" "Sodra_Alvsborg"
#>   [5] "Fyrbodal"       NA               "Fyrbodal"       "Fyrbodal"      
#>   [9] "Skaraborg"      "Fyrbodal"       "Sodra_Alvsborg" "Fyrbodal"      
#>  [13] "Fyrbodal"       "Skaraborg"      "Sodra_Alvsborg" "Fyrbodal"      
#>  [17] NA               "Kungalv"        "Sodra_Alvsborg" NA              
#>  [21] NA               "Fyrbodal"       "Skaraborg"      "Fyrbodal"      
#>  [25] "Skaraborg"      NA               NA               "Skaraborg"     
#>  [29] NA               "Kungalv"        "Fyrbodal"       NA              
#>  [33] NA               NA               "Skaraborg"      "Skaraborg"     
#>  [37] NA               NA               "Norra_Halland"  NA              
#>  [41] NA               NA               NA               "Fyrbodal"      
#>  [45] "Kungalv"        NA               "Fyrbodal"       NA              
#>  [49] "Skaraborg"      "Sodra_Alvsborg" "Sodra_Alvsborg" "Skaraborg"     
#>  [53] "Skaraborg"      "Fyrbodal"       NA               NA              
#>  [57] "Fyrbodal"       NA               NA               "Fyrbodal"      
#>  [61] "Sodra_Alvsborg" "Sodra_Alvsborg" "Sodra_Alvsborg" "Storgoteborg"  
#>  [65] "Skaraborg"      "Skaraborg"      NA               NA              
#>  [69] "Fyrbodal"       NA               "Storgoteborg"   "Sodra_Alvsborg"
#>  [73] "Sodra_Alvsborg" NA               "Fyrbodal"       NA              
#>  [77] NA               "Fyrbodal"       NA               NA              
#>  [81] NA               NA               NA               NA              
#>  [85] "Norra_Halland"  NA               NA               NA              
#>  [89] NA               NA               NA               "Norra_Halland" 
#>  [93] NA               NA               "Sodra_Alvsborg" "Skaraborg"     
#>  [97] NA               NA               "Fyrbodal"       "Fyrbodal"      
table(areas)
#> areas
#>       Fyrbodal        Kungalv  Norra_Halland      Skaraborg Sodra_Alvsborg 
#>             20              3              3             13             13 
#>   Storgoteborg 
#>              2 

# But if we want Kungalv to be a part of Storgoteborg
# (which is common practice for example with lung cancer data):
(areas2 <- decode(healtcare_areas_west, "sjukvardsomrade", "kungalv2Storgoteborg"))
#> Warning: transformed to match the keyvalue:  Punctuations are removed. Only the first 4 characters are used.
#> Warning: Some codes could not be translated (46 cells)
#>   [1] NA               NA               "Sodra_Alvsborg" "Sodra_Alvsborg"
#>   [5] "Fyrbodal"       NA               "Fyrbodal"       "Fyrbodal"      
#>   [9] "Skaraborg"      "Fyrbodal"       "Sodra_Alvsborg" "Fyrbodal"      
#>  [13] "Fyrbodal"       "Skaraborg"      "Sodra_Alvsborg" "Fyrbodal"      
#>  [17] NA               "Storgoteborg"   "Sodra_Alvsborg" NA              
#>  [21] NA               "Fyrbodal"       "Skaraborg"      "Fyrbodal"      
#>  [25] "Skaraborg"      NA               NA               "Skaraborg"     
#>  [29] NA               "Storgoteborg"   "Fyrbodal"       NA              
#>  [33] NA               NA               "Skaraborg"      "Skaraborg"     
#>  [37] NA               NA               "Norra_Halland"  NA              
#>  [41] NA               NA               NA               "Fyrbodal"      
#>  [45] "Storgoteborg"   NA               "Fyrbodal"       NA              
#>  [49] "Skaraborg"      "Sodra_Alvsborg" "Sodra_Alvsborg" "Skaraborg"     
#>  [53] "Skaraborg"      "Fyrbodal"       NA               NA              
#>  [57] "Fyrbodal"       NA               NA               "Fyrbodal"      
#>  [61] "Sodra_Alvsborg" "Sodra_Alvsborg" "Sodra_Alvsborg" "Storgoteborg"  
#>  [65] "Skaraborg"      "Skaraborg"      NA               NA              
#>  [69] "Fyrbodal"       NA               "Storgoteborg"   "Sodra_Alvsborg"
#>  [73] "Sodra_Alvsborg" NA               "Fyrbodal"       NA              
#>  [77] NA               "Fyrbodal"       NA               NA              
#>  [81] NA               NA               NA               NA              
#>  [85] "Norra_Halland"  NA               NA               NA              
#>  [89] NA               NA               NA               "Norra_Halland" 
#>  [93] NA               NA               "Sodra_Alvsborg" "Skaraborg"     
#>  [97] NA               NA               "Fyrbodal"       "Fyrbodal"      
table(areas2)
#> areas2
#>       Fyrbodal  Norra_Halland      Skaraborg Sodra_Alvsborg   Storgoteborg 
#>             20              3             13             13              5 

# We can also combine several extra_functions, for example if we
# also want the area names with correct Swedish spelling.
(areas3 <- decode(healtcare_areas_west, "sjukvardsomrade", c("kungalv2Storgoteborg", "real_names")))
#> Warning: transformed to match the keyvalue:  Punctuations are removed. Only the first 4 characters are used.
#> Warning: Some codes could not be translated (46 cells)
#>   [1] NA               NA               "Södra Älvsborg" "Södra Älvsborg"
#>   [5] "Fyrbodal"       NA               "Fyrbodal"       "Fyrbodal"      
#>   [9] "Skaraborg"      "Fyrbodal"       "Södra Älvsborg" "Fyrbodal"      
#>  [13] "Fyrbodal"       "Skaraborg"      "Södra Älvsborg" "Fyrbodal"      
#>  [17] NA               "Storgöteborg"   "Södra Älvsborg" NA              
#>  [21] NA               "Fyrbodal"       "Skaraborg"      "Fyrbodal"      
#>  [25] "Skaraborg"      NA               NA               "Skaraborg"     
#>  [29] NA               "Storgöteborg"   "Fyrbodal"       NA              
#>  [33] NA               NA               "Skaraborg"      "Skaraborg"     
#>  [37] NA               NA               "Norra Halland"  NA              
#>  [41] NA               NA               NA               "Fyrbodal"      
#>  [45] "Storgöteborg"   NA               "Fyrbodal"       NA              
#>  [49] "Skaraborg"      "Södra Älvsborg" "Södra Älvsborg" "Skaraborg"     
#>  [53] "Skaraborg"      "Fyrbodal"       NA               NA              
#>  [57] "Fyrbodal"       NA               NA               "Fyrbodal"      
#>  [61] "Södra Älvsborg" "Södra Älvsborg" "Södra Älvsborg" "Storgöteborg"  
#>  [65] "Skaraborg"      "Skaraborg"      NA               NA              
#>  [69] "Fyrbodal"       NA               "Storgöteborg"   "Södra Älvsborg"
#>  [73] "Södra Älvsborg" NA               "Fyrbodal"       NA              
#>  [77] NA               "Fyrbodal"       NA               NA              
#>  [81] NA               NA               NA               NA              
#>  [85] "Norra Halland"  NA               NA               NA              
#>  [89] NA               NA               NA               "Norra Halland" 
#>  [93] NA               NA               "Södra Älvsborg" "Skaraborg"     
#>  [97] NA               NA               "Fyrbodal"       "Fyrbodal"      


# Region names can be returned with or without the prefix:
regs <- sample(6, 10, replace = TRUE)
decode(regs, "region") # With prefix
#>  [1] "Region Syd"            "Region Sthlm/Gotland"  "Region Uppsala/Örebro"
#>  [4] "Region Sthlm/Gotland"  "Region Norr"           "Region Sthlm/Gotland" 
#>  [7] "Region Norr"           "Region Norr"           "Region Uppsala/Örebro"
#> [10] "Region Norr"          
decode(regs, "region", "short_region_names") # without prefix
#>  [1] "Syd"            "Sthlm/Gotland"  "Uppsala/Örebro" "Sthlm/Gotland" 
#>  [5] "Norr"           "Sthlm/Gotland"  "Norr"           "Norr"          
#>  [9] "Uppsala/Örebro" "Norr"          

# Note that only the first four digits of the LKF-code were used above.
# What if we use the full LKF-code?
lkfs <- sample(decoder::forsamling$key, 100, replace = TRUE)
decode(lkfs, "sjukvardsomrade")
#> Warning: transformed to match the keyvalue:  Leading 0:s are ignored. Only the first 4 characters are used.
#> Warning: Some codes could not be translated (80 cells)
#>   [1] NA               NA               NA               "Skaraborg"     
#>   [5] NA               NA               NA               "Skaraborg"     
#>   [9] NA               NA               NA               NA              
#>  [13] "Sodra_Alvsborg" "Skaraborg"      NA               NA              
#>  [17] NA               NA               NA               NA              
#>  [21] NA               NA               NA               NA              
#>  [25] NA               NA               "Fyrbodal"       NA              
#>  [29] NA               NA               "Skaraborg"      NA              
#>  [33] NA               "Sodra_Alvsborg" NA               NA              
#>  [37] "Storgoteborg"   "Fyrbodal"       NA               NA              
#>  [41] NA               NA               "Sodra_Alvsborg" NA              
#>  [45] "Kungalv"        "Skaraborg"      "Fyrbodal"       NA              
#>  [49] NA               NA               NA               NA              
#>  [53] NA               NA               NA               NA              
#>  [57] NA               NA               NA               NA              
#>  [61] NA               NA               NA               NA              
#>  [65] NA               "Kungalv"        NA               NA              
#>  [69] NA               NA               NA               NA              
#>  [73] "Norra_Halland"  "Fyrbodal"       NA               "Storgoteborg"  
#>  [77] NA               NA               "Sodra_Alvsborg" "Norra_Halland" 
#>  [81] NA               NA               NA               NA              
#>  [85] NA               NA               NA               NA              
#>  [89] NA               NA               NA               NA              
#>  [93] NA               "Skaraborg"      NA               NA              
#>  [97] NA               NA               NA               NA              
# That works just as well when argument exact = FALSE (which it is by default).

# decode can also be used for data.frames with recognised column names
d <- data.frame(
  kon = sample(1:2, 10, replace = TRUE),
  sex = sample(1:2, 10, replace = TRUE),
  lkf = sample(decoder::hemort$key, 10, replace = TRUE)
)
decode(d)
#> Warning: lkf -> lkf_kommun_beskrivning: transformed to match the keyvalue:  Only the first 4 characters are used.
#> Warning: lkf -> lkf_forsamling_beskrivning: Some codes could not be translated (6 cells)
#> Warning: lkf -> lkf_hemort2_beskrivning: Some codes could not be translated (6 cells)
#> Warning: lkf -> lkf_lan_beskrivning: transformed to match the keyvalue:  Only the first 2 characters are used.
#> New decoded columns added: 
#> * kon_kon_beskrivning
#> * lkf_kommun_beskrivning
#> * lkf_forsamling_beskrivning
#> * lkf_hemort2_beskrivning
#> * lkf_lan_beskrivning
#> * lkf_hemort_beskrivning
#> * sex_kon_beskrivning
#>    kon sex    lkf kon_kon_beskrivning lkf_kommun_beskrivning
#> 1    2   2 084016              Kvinna             Mörbylånga
#> 2    2   1 060204              Kvinna                   <NA>
#> 3    2   1 151802              Kvinna                   <NA>
#> 4    2   1 088016              Kvinna                 Kalmar
#> 5    1   2 058321                 Man                 Motala
#> 6    2   2 248210              Kvinna             Skellefteå
#> 7    1   2 128802                 Man                   <NA>
#> 8    2   1 098034              Kvinna                Gotland
#> 9    1   2 149207                 Man                   Åmål
#> 10   1   2 156622                 Man                   <NA>
#>           lkf_forsamling_beskrivning           lkf_hemort2_beskrivning
#> 1  Norra Möckleby, Sandby och Gårdby Norra Möckleby, Sandby och Gårdby
#> 2                               <NA>                              <NA>
#> 3                               <NA>                              <NA>
#> 4                      Heliga Korset                     Heliga Korset
#> 5                               <NA>                              <NA>
#> 6                           Lövånger                          Lövånger
#> 7                               <NA>                              <NA>
#> 8                            Akebäck                           Akebäck
#> 9                               <NA>                              <NA>
#> 10                              <NA>                              <NA>
#>     lkf_lan_beskrivning lkf_hemort_beskrivning sex_kon_beskrivning
#> 1            Kalmar län         Norra möckleby              Kvinna
#> 2        Jönköpings län                 Höreda                 Man
#> 3                  <NA>            Sankt peder                 Man
#> 4            Kalmar län          Heliga korset                 Man
#> 5     Östergötlands län                    Hov              Kvinna
#> 6     Västerbottens län               Lövånger              Kvinna
#> 7             Skåne län              Falsterbo              Kvinna
#> 8          Gotlands län                Akebäck                 Man
#> 9  Västra Götalands län              Edsleskog              Kvinna
#> 10                 <NA>                    Öra              Kvinna

### --- code --- ###
# Some keyvalue objects contain key-value pairs that are not 1:1.
# This is true for snomed.
# Show all non 1:1 pairs:
summary(decoder::snomed)$nonunique
#>        key                                         value
#> 393  90703                           Embryonalt carcinom
#> 402  90723                           Embryonalt carcinom
#> 592  96153                               Hårcellsleukemi
#> 787  99403                               Hårcellsleukemi
#> 354  89643                               Klarcellssarkom
#> 381  90443                               Klarcellssarkom
#> 587  96073                             Mantelcellslymfom
#> 646  96733                             Mantelcellslymfom
#> 710  97423                              Mastcellsleukemi
#> 769  99003                              Mastcellsleukemi
#> 209  85103                               Medullär cancer
#> 215  85113                               Medullär cancer
#> 392  90701                 Misstänkt embryonalt carcinom
#> 396  90721                 Misstänkt embryonalt carcinom
#> 353  89641                     Misstänkt klarcellssarkom
#> 378  90441                     Misstänkt klarcellssarkom
#> 707  97333                            Plasmacellsleukemi
#> 741  98303                            Plasmacellsleukemi
#> 162  84001                     Svettkörtelcancer in situ
#> 163  84002                     Svettkörtelcancer in situ
#> 168  84101                      Talgkörtelcancer in situ
#> 171  84102                      Talgkörtelcancer in situ
#> 73  813021 Urotelialt carcinom, papillärt, icke invasivt
#> 74  813022 Urotelialt carcinom, papillärt, icke invasivt
#> 75  813031      Urotelialt carcinom, papillärt, invasivt
#> 76  813032      Urotelialt carcinom, papillärt, invasivt
#> 77  813033      Urotelialt carcinom, papillärt, invasivt
#> 79  812031  Urotelialt carcinom, UNS, inklusive cytologi
#> 80  812032  Urotelialt carcinom, UNS, inklusive cytologi
#> 81  812033  Urotelialt carcinom, UNS, inklusive cytologi
# Save them for later:
non_unique_snomeds <- summary(decoder::snomed)$nonunique$key

# Use these snomed codes for decoding and coding
# Decoding works fine (all keys are unique) ...
(a <- decode(non_unique_snomeds, "snomed"))
#>  [1] "Embryonalt carcinom"                          
#>  [2] "Embryonalt carcinom"                          
#>  [3] "Hårcellsleukemi"                              
#>  [4] "Hårcellsleukemi"                              
#>  [5] "Klarcellssarkom"                              
#>  [6] "Klarcellssarkom"                              
#>  [7] "Mantelcellslymfom"                            
#>  [8] "Mantelcellslymfom"                            
#>  [9] "Mastcellsleukemi"                             
#> [10] "Mastcellsleukemi"                             
#> [11] "Medullär cancer"                              
#> [12] "Medullär cancer"                              
#> [13] "Misstänkt embryonalt carcinom"                
#> [14] "Misstänkt embryonalt carcinom"                
#> [15] "Misstänkt klarcellssarkom"                    
#> [16] "Misstänkt klarcellssarkom"                    
#> [17] "Plasmacellsleukemi"                           
#> [18] "Plasmacellsleukemi"                           
#> [19] "Svettkörtelcancer in situ"                    
#> [20] "Svettkörtelcancer in situ"                    
#> [21] "Talgkörtelcancer in situ"                     
#> [22] "Talgkörtelcancer in situ"                     
#> [23] "Urotelialt carcinom, papillärt, icke invasivt"
#> [24] "Urotelialt carcinom, papillärt, icke invasivt"
#> [25] "Urotelialt carcinom, papillärt, invasivt"     
#> [26] "Urotelialt carcinom, papillärt, invasivt"     
#> [27] "Urotelialt carcinom, papillärt, invasivt"     
#> [28] "Urotelialt carcinom, UNS, inklusive cytologi" 
#> [29] "Urotelialt carcinom, UNS, inklusive cytologi" 
#> [30] "Urotelialt carcinom, UNS, inklusive cytologi" 
# ... but coding these values back to their keys does not
# (several keys share the same value, so the key is ambiguous)
if (FALSE) {
code(a, "snomed")
}
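
Why coding is ambiguous here can be seen in a minimal base R sketch. `kv` is a hypothetical four-row extract of the non-unique pairs listed above, not the real snomed object, and the lookup shown is an illustration rather than what `code` does internally.

```r
# Hypothetical extract: two values, each shared by two keys (m:1)
kv <- data.frame(
  key = c("90703", "90723", "89643", "90443"),
  value = c("Embryonalt carcinom", "Embryonalt carcinom",
            "Klarcellssarkom", "Klarcellssarkom"),
  stringsAsFactors = FALSE
)

# Group the keys by value: going from value to key gives several candidates
keys_for <- split(kv$key, kv$value)
keys_for[["Embryonalt carcinom"]]

# Values that cannot be coded back to a single key
ambiguous <- names(keys_for)[lengths(keys_for) > 1]
ambiguous
```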