Package 'occupationMeasurement' reference manual

Title:	Interactively Measure Occupations in Interviews and Beyond
Description:	Perform interactive occupation coding during interviews as described in Peycheva, D., Sakshaug, J., Calderwood, L. (2021) <doi:10.2478/jos-2021-0042> and Schierholz, M., Gensicke, M., Tschersich, N., Kreuter, F. (2018) <doi:10.1111/rssa.12297>. Generate suggestions for occupational categories based on free text input, with pre-trained machine learning models in German and a ready-to-use shiny application provided for quick and easy data collection.
Authors:	Jan Simson [aut, cre] , Olga Kononykhina [aut], Malte Schierholz [aut, cph]
Maintainer:	Jan Simson <[email protected]>
License:	MIT + file LICENSE
Version:	0.3.3
Built:	2025-03-14 04:30:39 UTC
Source:	https://github.com/occupationmeasurement/occupationmeasurement

Make suggestions using similarity based reasoning.

Description

The Algorithm used here corresponds to Algorithm #10 in (Schierholz, 2019). Note: This function should not be used directly, but rather as a step / algorithm in get_job_suggestions.

Usage

algo_similarity_based_reasoning(
  text_processed,
  sim_name = "wordwise",
  probabilities = occupationMeasurement::pretrained_models$similarity_based_reasoning,
  ...
)
algo_similarity_based_reasoning(
  text_processed,
  sim_name = "wordwise",
  probabilities = occupationMeasurement::pretrained_models$similarity_based_reasoning,
  ...
)

Arguments

`text_processed`	The processed user input. Will be provided by get_job_suggestions.
`sim_name`	Which similarity measure to use. Possible values are "wordwise" or "substring".
`probabilities`	Trained probabilities to be used, defaults to the one bundled with the package. See pretrained_models. This pretrained model always predicts a 5-digit code from the 2010 German Classification of Occupations, with some exceptions: -0004 stands for 'Not precise enough/uncodable', -0006 stands for 'Multiple Jobs', -0012 stands for 'Blue-collar workers', -0019 stands for 'Volunteer/Social Service', and -0030 stands for 'Student assistant'.
`...`	Additional arguments may be passed from `get_job_suggestions()`, but will be ignored in this function.

Value

A data.table with suggestions or NULL if no suggestions were found.

References

Schierholz, M. (2019). New Methods for Job and Occupation Classification (Ph.D. Thesis). University of Mannheim.

Examples



## Not run: 
# Use with default settings
if (interactive()) {
  get_job_suggestions(
    "Arzt",
    steps = list(
      simbased_default = list(
        algorithm = algo_similarity_based_reasoning
      )
    )
  )
}

# Use with substring similarity
if (interactive()) {
 get_job_suggestions(
   "Arzt",
   steps = list(
     simbased_substring = list(
       algorithm = algo_similarity_based_reasoning,
       parameters = list(
         sim_name = "substring"
       )
     )
   )
 )
}

# Comparison of algo_similarity_based_reasoning() with get_job_suggestions()

# Example of using algo_similarity_based_reasoning() directly. Not recommended.
if (interactive()) {
  algo_similarity_based_reasoning(
    preprocess_string("Arzt"),
    sim_name = "wordwise"
  )[order(score, decreasing = TRUE)]
}

# Same output as before, but the function is more adaptable.
if (interactive()) {
  get_job_suggestions(
    "Arzt",
    suggestion_type = "kldb-2010",
    num_suggestions = 1500,
    steps = list(
      simbased_default = list(
        algorithm = algo_similarity_based_reasoning,
        parameters = list(
          sim_name = "wordwise"
        )
      )
    )
  )[, list(kldb_id, score, sim_name, kldb_id_title = title)]
}

## End(Not run)
## Not run: 
# Use with default settings
if (interactive()) {
  get_job_suggestions(
    "Arzt",
    steps = list(
      simbased_default = list(
        algorithm = algo_similarity_based_reasoning
      )
    )
  )
}

# Use with substring similarity
if (interactive()) {
 get_job_suggestions(
   "Arzt",
   steps = list(
     simbased_substring = list(
       algorithm = algo_similarity_based_reasoning,
       parameters = list(
         sim_name = "substring"
       )
     )
   )
 )
}

# Comparison of algo_similarity_based_reasoning() with get_job_suggestions()

# Example of using algo_similarity_based_reasoning() directly. Not recommended.
if (interactive()) {
  algo_similarity_based_reasoning(
    preprocess_string("Arzt"),
    sim_name = "wordwise"
  )[order(score, decreasing = TRUE)]
}

# Same output as before, but the function is more adaptable.
if (interactive()) {
  get_job_suggestions(
    "Arzt",
    suggestion_type = "kldb-2010",
    num_suggestions = 1500,
    steps = list(
      simbased_default = list(
        algorithm = algo_similarity_based_reasoning,
        parameters = list(
          sim_name = "wordwise"
        )
      )
    )
  )[, list(kldb_id, score, sim_name, kldb_id_title = title)]
}

## End(Not run)

Start the occupation coding API.

Description

Start the occupation coding API.

Usage

api(
  start = TRUE,
  log_to_file = FALSE,
  file = system.file("plumber", "api", "plumber.R", package = "occupationMeasurement"),
  log_to_console = TRUE,
  log_filepath = file.path("output", "log_api.csv"),
  require_identifier = FALSE,
  allow_origin = NULL
)
api(
  start = TRUE,
  log_to_file = FALSE,
  file = system.file("plumber", "api", "plumber.R", package = "occupationMeasurement"),
  log_to_console = TRUE,
  log_filepath = file.path("output", "log_api.csv"),
  require_identifier = FALSE,
  allow_origin = NULL
)

Arguments

`start`	Whether to immediately start the api. (Defaults to TRUE)
`log_to_file`	Whether to requests should be logged in a file. Defaults to FALSE. Note: The file format used here is a CSV file for easier analysis.
`file`	Path to the `plumber.R` file describing the API. Defaults to `plumber/api/plumber.R` within the installed package. Refer to this file to understand how the API is implemented.
`log_to_console`	Whether to requests should be logged in the console. Defaults to TRUE.
`log_filepath`	The path to a CSV file in which to save the structured logs.
`require_identifier`	Whether an identifier has to be added to api requests in order to match / identify requests afterwards. Defaults to FALSE.
`allow_origin`	Domain from which to allow cross origin requests (CORS). If the API is running on a different domain / server than the application using it, the website's root has to be provided here e.g. "https://occupationMeasurement.github.io". For more information see the plumber security page, and MDN. Defaults to NULL to not set any header at all.

Value

A Plumber router

Examples



if (interactive()) {
  # Get the plumber router
  router <- api(
    start = FALSE,
    # If this is TRUE, the log directory will immediately be created
    log_to_file = FALSE
  )
  # Start the router
  plumber::pr_run(router)
}

if (interactive()) {
  # Immediately start the API
  api(start = TRUE)
}
if (interactive()) {
  # Get the plumber router
  router <- api(
    start = FALSE,
    # If this is TRUE, the log directory will immediately be created
    log_to_file = FALSE
  )
  # Start the router
  plumber::pr_run(router)
}

if (interactive()) {
  # Immediately start the API
  api(start = TRUE)
}

Get an instance of the interactive shiny occupation coding app.

Description

Printing the returned instance or returning it without saving it in a variable will start the app.

Usage

app(
  questionnaire = questionnaire_web_survey(),
  app_settings = create_app_settings(save_to_file = TRUE),
  css_file = NULL,
  resource_dir = system.file("www", package = "occupationMeasurement"),
  ...
)
app(
  questionnaire = questionnaire_web_survey(),
  app_settings = create_app_settings(save_to_file = TRUE),
  css_file = NULL,
  resource_dir = system.file("www", package = "occupationMeasurement"),
  ...
)

Arguments

`questionnaire`	The questionnaire to load. (Defaults to the questionnaire returned by questionnaire_web_survey().)
`app_settings`	The app_settings to use. Check the documentation for create_app_settings to learn about the options.
`css_file`	Path to a CSS file to be included in the app.
`resource_dir`	From which directory to static files e.g. styles. If you want to load additional resources from outside the package, you should rather do so with shiny::addResourcePath rather than with this parameter.
`...`	Any additional parameters will be forwarded to shiny::shinyApp().

Value

A shiny app instance.

Examples



## Not run: 
app_instance <- app(
  app_settings = create_app_settings(
    # Important to save results from the app
    save_to_file = TRUE
  )
)

# Start the app
if (interactive()) {
  app_instance
}

## End(Not run)
## Not run: 
app_instance <- app(
  app_settings = create_app_settings(
    # Important to save results from the app
    save_to_file = TRUE
  )
)

# Start the app
if (interactive()) {
  app_instance
}

## End(Not run)

German Auxiliary Classification of Occupations (AuxCO) v1.2.3

Description

Berufs-Hilfsklassifikation mit Tätigkeitsbeschreibungen.

Usage

auxco
auxco

Format

A list with data.tables:

categories: data.table. Main list of AuxCO categories including their descriptions etc.
distinctions: data.table. List of highly similar AuxCO categories that one may want to present to disambiguate between them.
followup_questions: data.table. Follow-up questions to specify final codings based on AuxCO categories. Includes the questions' answer options as well as information on how to encode more complex occupations which depend on multiple answers.
mapping_from_isco: data.table. Mapping from ISCO-08 categories to AuxCO categories.
mapping_from_kldb: data.table. Mapping from KldB 2010 categories to AuxCO categories.

References

Schierholz, Malte; Brenner, Lorraine; Cohausz, Lea; Damminger, Lisa; Fast, Lisa; Hörig, Ann-Kathrin; Huber, Anna-Lena; Ludwig, Theresa; Petry, Annabell; Tschischka, Laura (2018): Vorstellung einer Hilfsklassifikation mit Tätigkeitsbeschreibungen für Zwecke der Berufskodierung. (IAB-Discussion Paper, 2018), Nürnberg, 45 S. https://www.iab.de/183/section.aspx/Publikation/k180509301

Go to the next page

Description

Buttons to navigate between pages.

Usage

button_next(label = "Weiter")

button_previous(label = "Zurück")
button_next(label = "Weiter")

button_previous(label = "Zurück")

Arguments

label

What label the button should have.

Value

shiny Action Button

Functions

button_previous(): Go to the previous page

Examples



## Not run: 
very_simple_page <- new_page(
  page_id = "example",
  render = function(session, run_before_output, input, output, ...) {
    list(
      shiny::tags$h1("My test page"),
      button_previous(),
      button_next()
    )
  }
)

## End(Not run)
## Not run: 
very_simple_page <- new_page(
  page_id = "example",
  render = function(session, run_before_output, input, output, ...) {
    list(
      shiny::tags$h1("My test page"),
      button_previous(),
      button_next()
    )
  }
)

## End(Not run)

Create app_settings.

Description

This is the primary and most convenient way of configuring the app.

Usage

create_app_settings(
  save_to_file,
  suggestion_type = "auxco-1.2.x",
  default_num_suggestions = 5,
  require_respondent_id = FALSE,
  warn_before_leaving = FALSE,
  skip_followup_types = c(),
  response_output_dir = file.path("output", "responses"),
  handle_data = NULL,
  get_job_suggestion_params = NULL,
  display_page_ids = TRUE,
  default_tense = "present",
  default_extra_instructions = "on",
  verbose = TRUE,
  .validate = TRUE
)
create_app_settings(
  save_to_file,
  suggestion_type = "auxco-1.2.x",
  default_num_suggestions = 5,
  require_respondent_id = FALSE,
  warn_before_leaving = FALSE,
  skip_followup_types = c(),
  response_output_dir = file.path("output", "responses"),
  handle_data = NULL,
  get_job_suggestion_params = NULL,
  display_page_ids = TRUE,
  default_tense = "present",
  default_extra_instructions = "on",
  verbose = TRUE,
  .validate = TRUE
)

Arguments

`save_to_file`	Should responses be saved as files in response_output_dir? Defaults to use the SAVE_TO_FILE environment variable. We recommend setting this to TRUE.
`suggestion_type`	Which type of suggestion to use / provide. Possible options are "auxco-1.2.x" and "kldb-2010".
`default_num_suggestions`	The number of suggestions to generate and display to users. Accepts all positive integers. Defaults to 5.
`require_respondent_id`	Are respondent_ids required? Defaults to FALSE
`warn_before_leaving`	Should users be warned that their progress will be lost upon leaving the site? Defaults to FALSE.
`skip_followup_types`	A vector of strings corresponding to the question_type of followup_question that should be skipped. Allowed ' values: c("anforderungsniveau", "aufsicht", "spezialisierung", "sonstige")
`response_output_dir`	Path to the directory in which to store data from the app. Defaults to `⁠./output/responses/⁠`.
`handle_data`	Callback function to handle data from the app. This setting takes a function that get's passed 3 parameters: table_name (A reference name indicating which data to save), data (A dataframe of data to save), session (the user's current session).
`get_job_suggestion_params`	List of parameters to pass to get_job_suggestion. Refer to `get_job_suggestions()` for a list of supported parameters. Note that the parameter aggregate_score_threshold needs to be set on `page_first_freetext()` or `page_second_freetext()`.
`display_page_ids`	Whether `page_ids` should be displayed within the questionnaires.
`default_tense`	We may not always want to ask for the current occupation, but maybe also for the previous occupation in case of pensioners etc. with a value of "past". Possible values are "present" (default), "past". This setting can be overwritten on a session-by-session basis with the URL-Query parameter "tense".
`default_extra_instructions`	Display additional instructions for e.g. an interviewer conducting an interview. Possible values are "on" (default), "off". This setting can be overwritten on a session-by-session basis with the URL-Query parameter "extra_instructions".
`verbose`	Should additional output be printed when running? Defaults to TRUE.
`.validate`	Whether the created app_settings should be validated. Defaults to TRUE.

Value

A list of app_settings.

Examples



app_settings <- create_app_settings(
  # Important to save results from the app
  save_to_file = TRUE,
  require_respondent_id = TRUE
)
app_settings <- create_app_settings(
  # Important to save results from the app
  save_to_file = TRUE,
  require_respondent_id = TRUE
)

Get the final occupation codes

Description

The final occupation code will depend on the suggestion_id and, possibly, on followup_answers, depending on the suggestion_id provided. See occupationMeasurement::auxco$followup_questions for a list of suggestion_ids (=auxco_id) and their respective recommended follow-up questions.

Usage

get_final_codes(
  suggestion_id,
  followup_answers = list(),
  standardized_answer_levels = NULL,
  approximate_standardized_answer_levels = TRUE,
  code_type = c("isco_08", "kldb_10"),
  verbose = TRUE,
  suggestion_type = "auxco-1.2.x",
  suggestion_type_options = list()
)
get_final_codes(
  suggestion_id,
  followup_answers = list(),
  standardized_answer_levels = NULL,
  approximate_standardized_answer_levels = TRUE,
  code_type = c("isco_08", "kldb_10"),
  verbose = TRUE,
  suggestion_type = "auxco-1.2.x",
  suggestion_type_options = list()
)

Arguments

`suggestion_id`	Id of the suggestion
`followup_answers`	A named list of the question_ids with their respective answers to the followup_questions. Question ids correspond to list names, answers correspond to list values.
`standardized_answer_levels`	A named list of standardized isco answer levels. Names in the list correspond to the type of isco standard, values correspond to the level itself. Possible standardized answer types are: "isco_skill_level" and "isco_supervisor_manager". These can be used instead of some followup questions if the information is available already from a different source. Please note that standardized answer levels are not available for all question types. For a list of options please take a look at the followup questions included in the auxco for example via `occupationMeasurement::auxco$followup_questions`.
`approximate_standardized_answer_levels`	(default TRUE) Follow up questions were designed to provide answer options that are not in conflict with suggestion_id. standardized_answer_levels can be in conflict with suggestion_id, and then no exact matches exist. With approximation, the answer option that is closest to the standardized_answer_levels provided, will be used.
`code_type`	Which type of codes should be returned. Multiple codes can be returned at the same time. Supported types of codes are "isco_08" and "kldb_10". Defaults to "isco_08" and "kldb_10".
`verbose`	(default TRUE) whether to return a `message` or not, detailing potential issues with the input provided.
`suggestion_type`	Which suggestion type is being used. Only auxco-based suggestion_types are supported.
`suggestion_type_options`	A list with options for generating suggestions. Supported options: - `datasets`: Pass specific datasets to be used whenn adding information to predictions e.g. use a specific version of the kldb or auxco. Supported datasets are: "auxco-1.2.x", "kldb-2010". By default the datasets bundled with this package are used.

Details

The interview situation may not allow to ask these follow-up questions. Some default, but suboptimal occupation code is returned if followup_answers is missing.

If followup_answers is missing or incomplete, one may wish to insert/infer the missing information by using standardized_answer_levels.

Value

A named list corresponding to the code_type(s) specified. Includes a message if verbose = TRUE

Examples



## Not run: 
get_final_codes(
  # Führungsaufgaben mit Personalverantwortung  bei der Lebensmittelherstellung
  "9076",
  followup_answers = list(
    # The first answer option in the first followup question
    "Q9076_1" = 2
  )
)

# The same, but using standardized answer levels
get_final_codes(
  # Führungsaufgaben mit Personalverantwortung  bei der Lebensmittelherstellung
  "9076",
  standardized_answer_levels = list(
    # A response corresponding to the standard ISCO Level "supervisor"
    "isco_supervisor_manager" = "isco_supervisor"
  )
)

# Same example with approximate matching, due to conflicting information:
# External data suggest the person is not a supervisor, but the person still
# says she does supervisory tasks (Führungsaufgaben, as encoded in "9076").
# If approximate_standardized_answer_levels = TRUE (the default), the
# selected answer "9076" trumps the external data and we will code this
# person as a supervisor.
get_final_codes(
  # Führungsaufgaben mit Personalverantwortung  bei der Lebensmittelherstellung
  "9076",
  standardized_answer_levels = list(
    # A response corresponding to the standard ISCO Level "not manager nor supervisor"
    "isco_supervisor_manager" = "isco_not_supervising"
  )
)

## End(Not run)
## Not run: 
get_final_codes(
  # Führungsaufgaben mit Personalverantwortung  bei der Lebensmittelherstellung
  "9076",
  followup_answers = list(
    # The first answer option in the first followup question
    "Q9076_1" = 2
  )
)

# The same, but using standardized answer levels
get_final_codes(
  # Führungsaufgaben mit Personalverantwortung  bei der Lebensmittelherstellung
  "9076",
  standardized_answer_levels = list(
    # A response corresponding to the standard ISCO Level "supervisor"
    "isco_supervisor_manager" = "isco_supervisor"
  )
)

# Same example with approximate matching, due to conflicting information:
# External data suggest the person is not a supervisor, but the person still
# says she does supervisory tasks (Führungsaufgaben, as encoded in "9076").
# If approximate_standardized_answer_levels = TRUE (the default), the
# selected answer "9076" trumps the external data and we will code this
# person as a supervisor.
get_final_codes(
  # Führungsaufgaben mit Personalverantwortung  bei der Lebensmittelherstellung
  "9076",
  standardized_answer_levels = list(
    # A response corresponding to the standard ISCO Level "not manager nor supervisor"
    "isco_supervisor_manager" = "isco_not_supervising"
  )
)

## End(Not run)

Get potential follow-up questions for a suggestion.

Description

Get potential follow-up questions for a suggestion.

Usage

get_followup_questions(
  suggestion_id,
  tense = "present",
  suggestion_type = "auxco-1.2.x",
  suggestion_type_options = list(),
  include_answer_codes = FALSE
)
get_followup_questions(
  suggestion_id,
  tense = "present",
  suggestion_type = "auxco-1.2.x",
  suggestion_type_options = list(),
  include_answer_codes = FALSE
)

Arguments

`suggestion_id`	Id of the suggestion
`tense`	Which tense i.e. time to use for questions & answers, this can be "present" or "past". Defaults to "present".
`suggestion_type`	Which suggestion type is being used. Only auxco-based suggestion_types are supported.
`suggestion_type_options`	A list with options for generating suggestions. Supported options: - `datasets`: Pass specific datasets to be used whenn adding information to predictions e.g. use a specific version of the kldb or auxco. Supported datasets are: "auxco-1.2.x", "kldb-2010". By default the datasets bundled with this package are used.
`include_answer_codes`	Whether answer options should contain information on the associated codes. Defaults to FALSE. (Only for internal use, use `get_final_codes()` to get codes)

Value

List of followup questions and their answer options.

Examples



## Not run: 
# Get followup questions for "Post- und Zustelldienste"
get_followup_questions("1004")

## End(Not run)
## Not run: 
# Get followup questions for "Post- und Zustelldienste"
get_followup_questions("1004")

## End(Not run)

Retrieve data for an item.

Description

Each page in the questionnaire can have multiple items on it.

Usage

get_item_data(
  session,
  page_id,
  item_id = NULL,
  key = c("all", "question_text", "response_text", "response_id"),
  default = NULL
)
get_item_data(
  session,
  page_id,
  item_id = NULL,
  key = c("all", "question_text", "response_text", "response_id"),
  default = NULL
)

Arguments

`session`	The shiny session
`page_id`	The page for which to retrieve data.
`item_id`	The item for which to retrieve data. This has to be different for different items on the same page. Since most pages contain only a single question/item, `item_id` is set to "default" if missing.
`key`	The key for which to retrieve a value. (Optional) If no key is provided, the items's whole data will be returned.
`default`	A default value to return if the key or page is not present in the questionnaire data.

Value

The items's data.

Examples



## Not run: 
# Set up a "fake" shiny session to store data
session <- shiny::MockShinySession$new()
session$userData <- list(
  current_page_id = "other_page",
  questionnaire_data = list(
    example_page = list()
  )
)

# This code is expected to be run in e.g. run_before or run_after
# It doesn't really make sense to run this code outside
set_item_data(
  session = session,
  page_id = "example_page",
  question_text = "How are you?"
)

# This code is expected to be run in e.g. run_before
get_item_data(
  session = session,
  page_id = "example_page"
)

## End(Not run)
## Not run: 
# Set up a "fake" shiny session to store data
session <- shiny::MockShinySession$new()
session$userData <- list(
  current_page_id = "other_page",
  questionnaire_data = list(
    example_page = list()
  )
)

# This code is expected to be run in e.g. run_before or run_after
# It doesn't really make sense to run this code outside
set_item_data(
  session = session,
  page_id = "example_page",
  question_text = "How are you?"
)

# This code is expected to be run in e.g. run_before
get_item_data(
  session = session,
  page_id = "example_page"
)

## End(Not run)

Make coding suggestions based on a user's open-ended text input.

Description

Given a text input, find up to num_suggestions possible occupation categories.

Usage

get_job_suggestions(
  text,
  suggestion_type = "auxco-1.2.x",
  num_suggestions = 5,
  suggestion_type_options = list(),
  aggregate_score_threshold = 0.02,
  item_score_threshold = 0,
  distinctions = TRUE,
  steps = list(simbased_wordwise = list(algorithm = algo_similarity_based_reasoning,
    parameters = list(sim_name = "wordwise")), simbased_substring = list(algorithm =
    algo_similarity_based_reasoning, parameters = list(sim_name = "substring"))),
  include_general_id = FALSE
)
get_job_suggestions(
  text,
  suggestion_type = "auxco-1.2.x",
  num_suggestions = 5,
  suggestion_type_options = list(),
  aggregate_score_threshold = 0.02,
  item_score_threshold = 0,
  distinctions = TRUE,
  steps = list(simbased_wordwise = list(algorithm = algo_similarity_based_reasoning,
    parameters = list(sim_name = "wordwise")), simbased_substring = list(algorithm =
    algo_similarity_based_reasoning, parameters = list(sim_name = "substring"))),
  include_general_id = FALSE
)

Arguments

`text`	The raw text input from the user.
`suggestion_type`	Which type of suggestion to use / provide. Possible options are "auxco-1.2.x" and "kldb-2010".
`num_suggestions`	The maximum number of suggestions to show. This is an upper bound and less suggestions may be returned. Defaults to 5.
`suggestion_type_options`	A list with options for generating suggestions. Supported options: - `datasets`: Pass specific datasets to be used whenn adding information to predictions e.g. use a specific version of the kldb or auxco. Supported datasets are: "auxco-1.2.x", "kldb-2010". By default the datasets bundled with this package are used.
`aggregate_score_threshold`	A single value or named list of thresholds between 0 and 1. If it is a list, each entry should correspond to one of the `steps`. If it is a single value, it will apply to all steps. Results from that step will only be returned if the sum of their scores is equal to or greater than the specified threshold. With a aggregate_score_threshold of 0 results will always be returned (if there are any).
`item_score_threshold`	A threshold between 0 and 1 (usually very small, default 0). Results from any step will only be returned if they are greater than the specified threshold. Allows the removal of highly implausible suggestions.
`distinctions`	Whether or not to add additional distinctions to similar occupational categories to the source code. Defaults to TRUE.
`steps`	A list with the algorithms to use and their parameters. Each entry of the list should contain a nested list with two entries: algorithm (the algorithm's function itself) and parameters (the parameters to pass onto the algorithm). Each algorithm will also always have access to a default set of three parameters: text_processed: The input text after preprocessing suggestion_type: Which type of suggestion to output num_suggestions: How many suggestions shall be returned These parameters must not be specified manually and will be provided automatically instead. Defaults to: list( # try similarity "one word at most 1 letter different" first list( algorithm = algo_similarity_based_reasoning, parameters = list( sim_name = "wordwise", min_aggregate_prob = 0.535 ) ), # since everything else failed, try "substring" similarity list( algorithm = algo_similarity_based_reasoning, parameters = list( sim_name = "substring", min_aggregate_prob = 0.02 ) ) )
`include_general_id`	Whether a general column, called "id" should always be returned. This will automatically contain the appropriate id for different suggestion_types i.e. for "auxco-1-2.x" it will contain the same data as the column "auxco_id".

Details

The procedure implemented here is, roughly speaking, as follows:

Predict categories from KldB 2010, including their scores. The first algorithm mentioned in steps is used (default: algo_similarity_based_reasoning()).
Convert the predicted KldB 2010 categories to suggestion_type (default: ⁠auxco-1.2.x⁠, an n:m mapping, scores are mapped accordingly.). See internal function convert_suggestions() for details.
Remove predicted categories if their score is below item_score_threshold and only keep the num_suggestions top-ranked suggestions.
Start anew, trying the next algorithm in steps, if the the top-ranked suggestions have a low chance to be correct. (Technically, this happens if the summed score of the num_suggestions top-ranked suggestions is below aggregate_score_threshold.)
If suggestion_type == "auxco-1.2.x" and distinctions == TRUE, insert additional and (highly) similar categories or replace existing ones. See internal function add_distinctions_auxco(). Reorder and keep only the num_suggestions top-ranked suggestions. Auxco categories which were added during this step can be identified by their scores: It equals 0.05 for categories with high similarity and 0.005 for categories with medium similarity.

Value

A data.table with suggestions or NULL if no suggestions were found.

Examples



## Not run: 
if (interactive()) {
  get_job_suggestions("Koch")
}

if (interactive()) {
  get_job_suggestions("Schlosser")
}

## End(Not run)
## Not run: 
if (interactive()) {
  get_job_suggestions("Koch")
}

if (interactive()) {
  get_job_suggestions("Schlosser")
}

## End(Not run)

Convenience function to aggregate all saved results_overview files.

Description

Expects data to be saved as files.

Usage

get_responses(app_settings = create_app_settings(save_to_file = TRUE))
get_responses(app_settings = create_app_settings(save_to_file = TRUE))

Arguments

app_settings

The app_settings configuration, should be the same as used in app().

Value

A combined data.table of user data (based on results_overview) or NULL if there are no files.

Examples



## Not run: 
app_settings <- create_app_settings(save_to_file = TRUE)
if (interactive()) {
  get_responses(app_settings = app_settings)
}

## End(Not run)
## Not run: 
app_settings <- create_app_settings(save_to_file = TRUE)
if (interactive()) {
  get_responses(app_settings = app_settings)
}

## End(Not run)

Categories of the The International Standard Classification of Occupations - ISCO-08

Description

Categories from the International Standard Classification of Occupations - ISCO-08. ISCO-08 is a hierarchical classification, consisting of 10 (1-digit) major groups, 43 (2-digit) sub-major groups, 130 (3-digit) minor groups, and 436 (4-digit) unit groups, all of them included in this data set.

Usage

isco_08_en
isco_08_en

Format

A data frame with 619 rows and 3 variables:

code: character. Unique ISCO-08 identifier / code.
label: character. Short label / title for the category.
description: character. Detailed description of the category.

Details

Source: https://esco.ec.europa.eu This service uses the ESCO classification of the European Commission. The descriptions used here are taken from the ESCO classification (v1.1, Occupations pillar) of the European Commission, which is based on ISCO-08.

More information on the ISCO-08: https://isco-ilo.netlify.app/en/isco-08/, https://www.ilo.org/public/english/bureau/stat/isco/isco08/

Load AuxCO from a directory of CSV files

Description

This function loads the Auxiliary Classification of Occupations (AuxCO) by reading CSVs from the specified directory, while loading e.g. ids in the correct format. Data is loaded into a named list matching the format expected by other functions in this package.

Usage

load_auxco(dir, add_explanations = TRUE)
load_auxco(dir, add_explanations = TRUE)

Arguments

`dir`	The path to the directory which holds the CSVs.
`add_explanations`	Whether explanations should be added to some of the harder to understand task descriptions. Defaults to TRUE.

Details

This package also includes an already loaded version of the auxco, which can be used straight away without calling this function.

Value

A list with multiple data.tables.

Examples



## Not run: 
# This function expects the CSV files from
# https://github.com/occupationMeasurement/auxiliary-classification/releases/
# to be there.
path_to_auxco <- "auxco"
if (dir.exists(path_to_auxco)) {
  load_auxco(path_to_auxco)
}

## End(Not run)
## Not run: 
# This function expects the CSV files from
# https://github.com/occupationMeasurement/auxiliary-classification/releases/
# to be there.
path_to_auxco <- "auxco"
if (dir.exists(path_to_auxco)) {
  load_auxco(path_to_auxco)
}

## End(Not run)

Clean & Load KldB 2010 dataset.

Description

Use load_kldb_raw() to load the whole dataset.

Usage

load_kldb_raw(
  cache_dir = getOption("occupationMeasurement.cache_dir", tempdir())
)

load_kldb(cache_dir = getOption("occupationMeasurement.cache_dir", tempdir()))
load_kldb_raw(
  cache_dir = getOption("occupationMeasurement.cache_dir", tempdir())
)

load_kldb(cache_dir = getOption("occupationMeasurement.cache_dir", tempdir()))

Arguments

cache_dir

The path to the directory where the downloaded data should be stored. We recommend setting this to "cache" to store data in the working directory. This will prevent reloading the data time and time again. This can be set globally via options(occupationMeasurement.cache_dir = "cache").

Details

Source: https://www.klassifikationsserver.de/klassService/index.jsp?variant=kldb2010

More information on the KldB 2010: https://statistik.arbeitsagentur.de/DE/Navigation/Grundlagen/Klassifikationen/Klassifikation-der-Berufe/KldB2010-Fassung2020/KldB2010-Fassung2020-Nav.html The KldB 2010 has been revised in 2020. These changes have not been implemented here yet.

Value

A cleaned / slimmed version of the KldB 2010.

Functions

load_kldb_raw(): Load raw KldB 2010 dataset.

Examples



## Not run: 
# We recommend using a non-temporary directory for caching, so data is
# downloaded only once and not time and time again
cache_dir <- tempdir()
# Note: The dataset will be downloaded from the internet
# Load the cleaned dataset
load_kldb(cache_dir = cache_dir)
# Load the raw dataset
load_kldb_raw(cache_dir = cache_dir)

## End(Not run)
## Not run: 
# We recommend using a non-temporary directory for caching, so data is
# downloaded only once and not time and time again
cache_dir <- tempdir()
# Note: The dataset will be downloaded from the internet
# Load the cleaned dataset
load_kldb(cache_dir = cache_dir)
# Load the raw dataset
load_kldb_raw(cache_dir = cache_dir)

## End(Not run)

Create a new questionnaire page.

Description

Each page corresponds to a page within the app/questionnaire.

Usage

new_page(
  page_id,
  render,
  condition = NULL,
  run_before = NULL,
  render_before = NULL,
  render_after = NULL,
  run_after = NULL
)
new_page(
  page_id,
  render,
  condition = NULL,
  run_before = NULL,
  render_before = NULL,
  render_after = NULL,
  run_after = NULL
)

Arguments

`page_id`	A unique string identifiying this page. (Required) This will be used to store data.
`render`	Function to render the page. (Required) It is expected, that the function returns a list of shiny tags. Its output will be combined with `render_before` and `render_after.` This function has access to the shiny `session` and the `run_before_output.`
`condition`	Function to check whether the page should be shown. When this function returns `TRUE`, the page will be shown upon navigating there, if it returns `FALSE` it will be skipped. Defaults to show the page. This function has access to the shiny `session`.
`run_before`	Function that prepares data to render the page. Called immediately after condition (if `condition` returned `TRUE`). Whatever run_before returns is available in render, render_before and render_after as `run_before_output`. This function has access to the shiny `session`.
`render_before`	Called exactly like `render`. Output will be added just before the output from render. Mainly used to add additional outputs to existing pages.
`render_after`	Called exactly like `render`. Output will be added just after the output from render. Mainly used to add additional outputs to existing pages.
`run_after`	Function that handles the user input when they leave the page. This function has access to the shiny session and shiny input.

Details

Pages are rendered by calling the different life-cycle functions one after another. The order in which they are called is as follows:

condition (session) Only if this evaluated to TRUE, continue.
run_before (session)
render_before (session, run_before_output, input, output)
render (session, run_before_output, input, output)
render_after (session, run_before_output, input, output) The outputs from render_before, render & render_after are stitched together to produce the final HTML output of the page.
run_after (session, input, output) Run when the user leaves the page (=clicks the next button). Any user input has to be handled here. For each question asked, one will typically call set_item_data() to save the collected data internally.

Each of the life-cycle functions above is annotated with the paramaters it has access to. session, input and output are passed directly from shiny and correspond to the objects made available by shiny::shinyServer(), run_before_output is available for convenience and corresponds to whatever is returned by run_before.

Some side-effects occur:

occupationMeasurement:::init_page_data is called before 1. run_before. It sets up an internal representation of the page data to be saved.
occupationMeasurement:::finalize_data is called before 6. run_before.
occupationMeasurement:::save_page_data is called after 6. run_before. It saves the responses on a hard drive, i.e. it appends the responses from this page to table_name == "answers". See the vignette and create_app_settings() for details.

Use of render_before, render_after is discouraged if not necessary, as these two life-cycle functions have only been added to allow for easier modification and extension of existing pages.

Value

A new page object.

Examples



## Not run: 
very_simple_page <- new_page(
  page_id = "example",
  render = function(session, run_before_output, input, output, ...) {
    list(
      shiny::tags$h1("My test page"),
      button_previous(),
      button_next()
    )
  }
)

# Example where we also save some data
page_that_saves_two_items <- new_page(
  page_id = "questions_1_and_2",
  render = function(session, run_before_output, page, input, output, ...) {
    list(
      shiny::tags$h1("Questions"),
      shiny::textAreaInput(
        inputId = "day_freetext",
        label = "How was your day? Please give a detailed answer.",
        value = get_item_data(
          session = session, page_id = page$page_id,
          item_id = "day_freetext",
          key = "response_text"
        )
      ),
      shiny::tags$p("How would you rate your day? On a scale of 1 - 4"),
      radioButtons(
        inputId = "day_radio",
        label = NULL,
        width = "100%",
        choices = list(One = 1, Two = 2, Three = 3, Four = 4),
        selected = get_item_data(
          session = session,
          page_id = page$page_id,
          item_id = "day_radio",
          key = "response_id",
          default = character(0)
        )
      ),
      button_previous(),
      button_next()
    )
  },
  run_after = function(session, page, input, ...) {
    set_item_data(
      session = session,
      page_id = page$page_id,
      item_id = "day_freetext",
      response_text = input$day_freetext
    )
    set_item_data(
      session = session,
      page_id = page$page_id,
      item_id = "day_radio",
      response_id = input$day_radio
    )
  }
)

questionnaire_that_saves_two_items <- list(
  page_that_saves_two_items,
  # So we have a next page to go to
  very_simple_page
)

if (interactive()) {
  app(questionnaire = questionnaire_that_saves_two_items)
}

## End(Not run)
## Not run: 
very_simple_page <- new_page(
  page_id = "example",
  render = function(session, run_before_output, input, output, ...) {
    list(
      shiny::tags$h1("My test page"),
      button_previous(),
      button_next()
    )
  }
)

# Example where we also save some data
page_that_saves_two_items <- new_page(
  page_id = "questions_1_and_2",
  render = function(session, run_before_output, page, input, output, ...) {
    list(
      shiny::tags$h1("Questions"),
      shiny::textAreaInput(
        inputId = "day_freetext",
        label = "How was your day? Please give a detailed answer.",
        value = get_item_data(
          session = session, page_id = page$page_id,
          item_id = "day_freetext",
          key = "response_text"
        )
      ),
      shiny::tags$p("How would you rate your day? On a scale of 1 - 4"),
      radioButtons(
        inputId = "day_radio",
        label = NULL,
        width = "100%",
        choices = list(One = 1, Two = 2, Three = 3, Four = 4),
        selected = get_item_data(
          session = session,
          page_id = page$page_id,
          item_id = "day_radio",
          key = "response_id",
          default = character(0)
        )
      ),
      button_previous(),
      button_next()
    )
  },
  run_after = function(session, page, input, ...) {
    set_item_data(
      session = session,
      page_id = page$page_id,
      item_id = "day_freetext",
      response_text = input$day_freetext
    )
    set_item_data(
      session = session,
      page_id = page$page_id,
      item_id = "day_radio",
      response_id = input$day_radio
    )
  }
)

questionnaire_that_saves_two_items <- list(
  page_that_saves_two_items,
  # So we have a next page to go to
  very_simple_page
)

if (interactive()) {
  app(questionnaire = questionnaire_that_saves_two_items)
}

## End(Not run)

Show a page with multiple radio button options where once can be picked.

Description

Show a page with multiple radio button options where once can be picked.

Usage

page_choose_one_option(
  page_id,
  question_text = "Please pick one of the following options",
  list_of_choices = list(One = 1, Two = 2, Three = 3),
  choice_labels = NULL,
  next_button = TRUE,
  previous_button = TRUE,
  run_before = NULL,
  run_after = NULL,
  ...
)
page_choose_one_option(
  page_id,
  question_text = "Please pick one of the following options",
  list_of_choices = list(One = 1, Two = 2, Three = 3),
  choice_labels = NULL,
  next_button = TRUE,
  previous_button = TRUE,
  run_before = NULL,
  run_after = NULL,
  ...
)

Arguments

`page_id`	A unique string identifiying this page. Used to store data.
`question_text`	The question / text to display. This can be either a string, which will simply be displayed or a function to dynamically determine the question_text.
`list_of_choices`	A list of answering options. This can either be just a simple list of values or a named list with the names corresponding to what the user sees and the values corresponding to the actually saved values. e.g. with `list(One = 1, Two = 2, Three = 3)` people will see One, Two, ... and numbers 1, 2, ... will be saved under `response_id`. If you want to use more complex choice names than jsut strings (i.e. HTML), you can also use the choice_labels option for that.
`choice_labels`	List or vector of only the choice names to be shown. This has to be matched by an equal-length vector in list_of_choices.
`next_button`	Whether to show the button to navigate to the next page? Defaults to TRUE.
`previous_button`	Whether to show the button to navigate to the preivous page? Defaults to TRUE.
`run_before`	Similar to `run_before` in `new_page()`, passed explicitly here as this page adds some of its own code to `run_before`.
`run_after`	Similar to `run_after` in `new_page()`, passed explicitly here as this page adds some of its own code to `run_after`.
`...`	Other parametrs are passed on to `new_page()`

Value

A page object.

Examples



## Not run: 
one_page_questionnaire <- list(
  page_choose_one_option(
    "test_page_radio",
    question_text = "Hello there! Please pick your favorite number from the options below:",
    list_of_choices = list(One = 1, Two = 2, Three = 3)
  ),
  page_final()
)
if (interactive()) {
  app(questionnaire = one_page_questionnaire)
}

## End(Not run)
## Not run: 
one_page_questionnaire <- list(
  page_choose_one_option(
    "test_page_radio",
    question_text = "Hello there! Please pick your favorite number from the options below:",
    list_of_choices = list(One = 1, Two = 2, Three = 3)
  ),
  page_final()
)
if (interactive()) {
  app(questionnaire = one_page_questionnaire)
}

## End(Not run)

Page to receive feedback on how well the chosen suggestion fits

Description

Page to receive feedback on how well the chosen suggestion fits

Usage

page_feedback(is_interview = FALSE, ...)
page_feedback(is_interview = FALSE, ...)

Arguments

`is_interview`	Should the page show slightly different / additional instructions and answer options for an interview that is conducted by another person? Defaults to FALSE.
`...`	All additional parameters are passed first passed on to `page_choose_one_option()` and then `new_page()`.

Value

A page object.

Examples



## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_select_suggestion(),
  page_feedback()
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)
## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_select_suggestion(),
  page_feedback()
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)

A final page, showing instructions to close the window.

Description

This page saves data in results_overview and marks the questionnaire as complete.

Usage

page_final(...)
page_final(...)

Arguments

...

All additional parameters are passed to new_page()

Value

A page object.

Examples



## Not run: 
my_questionnaire <- list(
  page_final()
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)
## Not run: 
my_questionnaire <- list(
  page_final()
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)

The first freetext question to show.

Description

Here, the description of the job can be entered in an open freetext field and suggestions will be generated based on the input.

Usage

page_first_freetext(
  is_interview = FALSE,
  aggregate_score_threshold = 0.535,
  ...
)
page_first_freetext(
  is_interview = FALSE,
  aggregate_score_threshold = 0.535,
  ...
)

Arguments

`is_interview`	Should the page show slightly different / additional instructions and answer options for an interview that is conducted by another person? Defaults to FALSE.
`aggregate_score_threshold`	The total sum of the scores of the suggestions has to be higher than this threshold for suggestions to be shown. The parameter is passed on to `get_job_suggestions()`.
`...`	All additional parameters are passed to `new_page()`

Value

A page object.

Examples



## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2)
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)
## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2)
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)

Show potential followup questions to the user.

Description

To disambiguate between similar occupations. Depending on the suggestion, multiple followup questions can be shown.

Usage

page_followup(index, is_interview = FALSE, ...)
page_followup(index, is_interview = FALSE, ...)

Arguments

`index`	The index of the followup question (1-based). To show the first followup question (if there are any) use page_followup(index = 1), to show a potential second followup question use page_followup(index = 2). For example `questionnaire_web_survey()` uses `⁠..., page_followup(index = 1), page_followup(index = 2), ...⁠`
`is_interview`	Should the page show slightly different / additional instructions and answer options for an interview that is conducted by another person? Defaults to FALSE.
`...`	All additional parameters are passed to `new_page()`

Value

A page object.

Examples



## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2)
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)
## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2)
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)

Show a page with a text field where free text can be entered.

Description

Show a page with a text field where free text can be entered.

Usage

page_freetext(
  page_id,
  question_text = "Please enter your answer in the box below",
  is_interview = FALSE,
  no_answer_checkbox = TRUE,
  next_button = TRUE,
  previous_button = TRUE,
  trigger_next_on_enter = TRUE,
  render_question_text = TRUE,
  run_before = NULL,
  run_after = NULL,
  ...
)
page_freetext(
  page_id,
  question_text = "Please enter your answer in the box below",
  is_interview = FALSE,
  no_answer_checkbox = TRUE,
  next_button = TRUE,
  previous_button = TRUE,
  trigger_next_on_enter = TRUE,
  render_question_text = TRUE,
  run_before = NULL,
  run_after = NULL,
  ...
)

Arguments

`page_id`	A unique string identifiying this page. Used to store data.
`question_text`	The question / text to display. This can be either a string, which will simply be displayed or a function to dynamically determine the question_text.
`is_interview`	Should the page show slightly different / additional instructions and answer options for an interview that is conducted by another person? Defaults to FALSE.
`no_answer_checkbox`	Whether to provide a checkbox to denote that no answer has been provided.
`next_button`	Whether to show the button to navigate to the next page? Defaults to TRUE.
`previous_button`	Whether to show the button to navigate to the preivous page? Defaults to TRUE.
`trigger_next_on_enter`	Whether the next button is triggered when one presses enter. Defaults to TRUE. There are known issues with IE11.
`render_question_text`	Whether the question text should be displayed? Only set this to FALSE, if you wish to change the rendering of the question_text by e.g. using `render_before`. Defaults to TRUE.
`run_before`	Similar to `run_before` in `new_page()`, passed explicitly here as this page adds some of its own code to `run_before`.
`run_after`	Similar to `run_after` in `new_page()`, passed explicitly here as this page adds some of its own code to `run_after`.
`...`	Other parametrs are passed on to `new_page()`

Value

A page object.

Examples



## Not run: 
page_freetext(
  "test_page_freetext",
  question_text = "Hello there! Please fill in your name below:",
  no_answer_checkbox = TRUE
)

## End(Not run)
## Not run: 
page_freetext(
  "test_page_freetext",
  question_text = "Hello there! Please fill in your name below:",
  no_answer_checkbox = TRUE
)

## End(Not run)

An additional freetext page to show when no suggestion has been selected.

Description

An additional freetext page to show when no suggestion has been selected.

Usage

page_none_selected_freetext(is_interview = FALSE, ...)
page_none_selected_freetext(is_interview = FALSE, ...)

Arguments

`is_interview`	Should the page show slightly different / additional instructions and answer options for an interview that is conducted by another person? Defaults to FALSE.
`...`	All additional parameters are passed to `new_page()`

Value

A page object.

Examples



## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2)
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)
## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2)
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)

Page showing the user's results

Description

This page is only meant for demonstration purposes. Users can see what they entered and which code was being saved. The page is only included in the questionnaire_demo(), but not in the other questionnaire templates.

Usage

page_results(...)
page_results(...)

Arguments

...

All additional parameters are passed to new_page()

Value

A page object.

Examples



## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2),
  page_results()
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)
## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2),
  page_results()
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)

An optional, second free text question if the first didn't yield suggestions.

Description

If the first freetext question didn't provide satisfactory results, ask for more details and try again.

Usage

page_second_freetext(
  combine_input_with_first = TRUE,
  is_interview = FALSE,
  aggregate_score_threshold = 0.02,
  ...
)
page_second_freetext(
  combine_input_with_first = TRUE,
  is_interview = FALSE,
  aggregate_score_threshold = 0.02,
  ...
)

Arguments

`combine_input_with_first`	Should input be combined with the previous question?
`is_interview`	Should the page show slightly different / additional instructions and answer options for an interview that is conducted by another person? Defaults to FALSE.
`aggregate_score_threshold`	The total sum of the scores of the suggestions has to be higher than this threshold for suggestions to be shown. The parameter is passed on to `get_job_suggestions()`.
`...`	All additional parameters are passed to `new_page()`

Value

A page object.

Examples



## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2)
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)
## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2)
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)

Display the generated suggestions for the user to pick one.

Description

Display the generated suggestions for the user to pick one.

Usage

page_select_suggestion(is_interview = FALSE, ...)
page_select_suggestion(is_interview = FALSE, ...)

Arguments

`is_interview`	Should the page show slightly different / additional instructions and answer options for an interview that is conducted by another person? Defaults to FALSE.
`...`	All additional parameters are passed to `new_page()`

Value

A page object.

Examples



## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2)
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)
## Not run: 
my_questionnaire <- list(
  page_first_freetext(),
  page_second_freetext(),
  page_select_suggestion(),
  page_none_selected_freetext(),
  page_followup(1),
  page_followup(2)
)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)

Welcome Page (optional)

Description

Providing an introduction and greeting participants.

Usage

page_welcome(
  title = "Herzlich Willkommen zum Modul zur automatischen Berufskodierung!",
  ...
)
page_welcome(
  title = "Herzlich Willkommen zum Modul zur automatischen Berufskodierung!",
  ...
)

Arguments

`title`	The heading with which to greet participants.
`...`	All additional parameters are passed to `new_page()`

Value

A page object.

Examples



## Not run: 
my_questionnaire <- list(page_welcome)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)
## Not run: 
my_questionnaire <- list(page_welcome)
if (interactive()) {
  app(questionnaire = my_questionnaire)
}

## End(Not run)

Preprocess a string, removing special characters and handling abbreviations.

Description

Replace some common characters / character sequences (e.g., Ä, Ü, "DIPL.-ING.") with their uppercase equivalents and removes punctuation, empty spaces and the word "Diplom".

Usage

preprocess_string(verbatim, lang = "de")
preprocess_string(verbatim, lang = "de")

Arguments

`verbatim`	The character vector to process.
`lang`	The language the text is in. Currently only German is supported. Defaults to "de" (German).

Details

charToRaw() helps to find UTF-8 characters.

Value

The same character vector after processing

Examples



## Not run: 
preprocess_string(c(
  "Verkauf von B\u00fcchern, Schreibwaren",
  "Fach\u00e4rztin f\u00fcr Kinder- und Jugendmedizin im \u00f6ffentlichen Gesundheitswesen",
  "Industriemechaniker",
  "Dipl.-Ing. - Agrarwirtschaft (Landwirtschaft)"
))

## End(Not run)
## Not run: 
preprocess_string(c(
  "Verkauf von B\u00fcchern, Schreibwaren",
  "Fach\u00e4rztin f\u00fcr Kinder- und Jugendmedizin im \u00f6ffentlichen Gesundheitswesen",
  "Industriemechaniker",
  "Dipl.-Ing. - Agrarwirtschaft (Landwirtschaft)"
))

## End(Not run)

Pretrained ML models to be used with the package.

Description

Pretrained ML models to be used with the package.

Usage

pretrained_models
pretrained_models

Format

A nested list with pretrained machine learning models:

similarity_based_reasoning: list. Contains pretrained models to be used with algo_similarity_based_reasoning().
similarity_based_reasoning$wordwise: list. Contains the pretrained model to be used for providing suggestions using full wordwise matching.
similarity_based_reasoning$substring: list. Contains the pretrained model to be used for providing suggestions using substring matching.

This training data always predicts a 5-digit code from the 2010 German Classification of Occupations, with some exceptions: -0004 stands for 'Not precise enough/uncodable', -0006 stands for 'Multiple Jobs', -0012 stands for 'Blue-collar workers', -0019 stands for 'Volunteer/Social Service', and -0030 stands for 'Student assistant'.

Source

Data from the following surveys were pooled:

Antoni, M., Drasch, K., Kleinert, C., Matthes, B., Ruland, M. and Trahms, A. (2010): Arbeiten und Lernen im Wandel * Teil 1: Überblick über die Studie, FDZ-Methodenreport 05/2010, Forschungsdatenzentrum der Bundesagentur für Arbeit im Institut für Arbeitsmarkt- und Berufsforschung, Nuremberg.

Rohrbach-Schmidt, D., Hall, A. (2013): BIBB/BAuA Employment Survey 2012, BIBB-FDZ Data and Methodological Reports Nr. 1/2013. Version 4.1, Federal Institute for Vocational Education and Training (Research Data Centre), Bonn.

Lange, C., Finger, J., Allen, J., Born, S., Hoebel, J., Kuhnert, R., Müters, S., Thelen, J., Schmich, P., Varga, M., von der Lippe, E., Wetzstein, M., Ziese, T. (2017): Implementation of the European Health Interview Survey (EHIS) into the German Health Update (GEDA), Archives of Public Health, 75, 1–14.

Hoffmann, R., Lange, M., Butschalowsky, H., Houben, R., Schmich, P., Allen, J., Kuhnert, R., Schaffrath Rosario, A., Gößwald, A. (2018): KiGGS Wave 2 Cross-Sectional Study—Participant Acquisition, Response Rates and Representativeness, Journal of Health Monitoring, 3, 78–91. (only wave 2)

Trappmann, M., Beste, J., Bethmann, A., Müller, G. (2013): The PASS Panel Survey after Six Waves, Journal for Labour Market Research, 46, 275–281. (only wave 10)

Job titles were taken from the following publication:

Bundesagentur für Arbeit (2019). Gesamtberufsliste der Bundesagentur für Arbeit. Stand: 03.01.2019. https://download-portal.arbeitsagentur.de/files/.

Basically, leaving some data anonymization steps aside, we count for each job title from the Gesamtberufsliste (and some additional titles/texts) how many responses from all surveys are similar to this job title, separately for each coded category. Similarity is calculated in two ways, implying that we obtain two different counts: SubstringSimilarity refers to situations where the job title from the Gesamtberufsliste is a substring of the verbal answer; WordwiseSimilarity refers to situations where a word from the verbal answer is identical to a job title from the Gesamtberufsliste, except that one character from this word is allowed to change (Levenshtein distance). These counts are available as two separate files in the data-raw/training-data/ directory of this package. The algorithm to create these counts is available inside an R-package at https://github.com/malsch/occupationCoding, along with further documentation.

train_similarity_based_reasoning() is then used to train the ML models. See data-raw/pretrained_models.R for the raw counts and further details.

A demo questionnaire with additional explanations

Description

View the function's code itself to see the used pages.

Usage

questionnaire_demo(show_feedback_page = TRUE)
questionnaire_demo(show_feedback_page = TRUE)

Arguments

show_feedback_page

Show the page_feedback() to evaluate the fit of the chosen suggestion.

Details

Note, that this function has more complex code to create the additional pages.

Value

A questionnaire for app() i.e. a list of pages.

Examples



## Not run: 
# Inspect the code to create the questionnaire_demo
print(questionnaire_demo)

if (interactive()) {
  # Run the app with the questionnaire_demo
  app(questionnaire = questionnaire_demo())
}

## End(Not run)
## Not run: 
# Inspect the code to create the questionnaire_demo
print(questionnaire_demo)

if (interactive()) {
  # Run the app with the questionnaire_demo
  app(questionnaire = questionnaire_demo())
}

## End(Not run)

A questionnaire for interviewer-administered surveys

Description

A questionnaire for Computer-assisted Interviewing (CAI), i.e. telephone interviewing or personal interviewing. In both modes, interviewer asks questions to an interviewee.

Usage

questionnaire_interviewer_administered(show_feedback_page = TRUE)
questionnaire_interviewer_administered(show_feedback_page = TRUE)

Arguments

show_feedback_page

Show the page_feedback() to evaluate the fit of the chosen suggestion.

Details

View the function's code to see the used pages. This function is meant as a template that can be changed to meet your requirements.

Value

A questionnaire for app() i.e. a list of pages.

Examples



## Not run: 
# Inspect the code to create the questionnaire_interviewer_administered
print(questionnaire_interviewer_administered)

if (interactive()) {
  # Run the app with the questionnaire_interviewer_administered
  app(questionnaire = questionnaire_interviewer_administered())
}

## End(Not run)
## Not run: 
# Inspect the code to create the questionnaire_interviewer_administered
print(questionnaire_interviewer_administered)

if (interactive()) {
  # Run the app with the questionnaire_interviewer_administered
  app(questionnaire = questionnaire_interviewer_administered())
}

## End(Not run)

A web survey which participants can navigate themselves.

Description

The basic default questionnaire. View the function's code to see the used pages. This function is meant as a template that can be changed to meet your requirements.

Usage

questionnaire_web_survey(show_feedback_page = TRUE)
questionnaire_web_survey(show_feedback_page = TRUE)

Arguments

show_feedback_page

Show the page_feedback() to evaluate the fit of the chosen suggestion.

Value

A questionnaire for app(), i.e. a list of pages.

Examples



## Not run: 
# Inspect the code to create the questionnaire_web_survey
print(questionnaire_web_survey)

if (interactive()) {
  # Run the app with the questionnaire_web_survey
  app(questionnaire = questionnaire_web_survey())
}

if (interactive()) {
  # This is used by default within app
  app()
}

## End(Not run)
## Not run: 
# Inspect the code to create the questionnaire_web_survey
print(questionnaire_web_survey)

if (interactive()) {
  # Run the app with the questionnaire_web_survey
  app(questionnaire = questionnaire_web_survey())
}

if (interactive()) {
  # This is used by default within app
  app()
}

## End(Not run)

Set / save data for an item.

Description

There can be multiple items on any given page. Items can be different questions, or multiple variables that need to be saved from a single question. The question_text is typically saved in run_before and the reply (response_text and/or response_id) is typically saved in run_after.

Usage

set_item_data(
  session,
  page_id,
  item_id = NULL,
  question_text = NULL,
  response_text = NULL,
  response_id = NULL
)
set_item_data(
  session,
  page_id,
  item_id = NULL,
  question_text = NULL,
  response_text = NULL,
  response_id = NULL
)

Arguments

`session`	The shiny session
`page_id`	The page for which to retrieve data.
`item_id`	The item for which to set/update data. This has to be different for different items on the same page. Since most pages contain only a single question/item, `item_id` is set to "default" if missing.
`question_text`	The question's text. (optional)
`response_text`	The user's response in text form. (optional)
`response_id`	The user's response as an id from a set of choices. (optional)

Value

nothing

Examples



## Not run: 
# Set up a "fake" shiny session to store data
session <- shiny::MockShinySession$new()
session$userData <- list(
  current_page_id = "other_page",
  questionnaire_data = list(
    example_page = list()
  )
)

# This code is expected to be run in e.g. run_before or run_after
# It doesn't really make sense to run this code outside
set_item_data(
  session = session,
  page_id = "example_page",
  question_text = "How are you?"
)

set_item_data(
  session = session,
  page_id = "example_page",
  response_id = 3,
  response_text = "I'm doing great! (response_id = 3)"
)

## End(Not run)
## Not run: 
# Set up a "fake" shiny session to store data
session <- shiny::MockShinySession$new()
session$userData <- list(
  current_page_id = "other_page",
  questionnaire_data = list(
    example_page = list()
  )
)

# This code is expected to be run in e.g. run_before or run_after
# It doesn't really make sense to run this code outside
set_item_data(
  session = session,
  page_id = "example_page",
  question_text = "How are you?"
)

set_item_data(
  session = session,
  page_id = "example_page",
  response_id = 3,
  response_text = "I'm doing great! (response_id = 3)"
)

## End(Not run)

Train Similarity Based Probability Model with anonymized training data

Description

This function requires the mvtnorm package.

Usage

train_similarity_based_reasoning(
  anonymized_data,
  num_allowed_codes = 1291,
  coding_index_w_codes,
  coding_index_without_codes = NULL,
  preprocessing = list(stopwords = NULL, stemming = NULL, strPreprocessing = TRUE,
    removePunct = FALSE),
  dist_type = c("wordwise", "substring", "fulltext"),
  dist_control = list(method = "osa", weight = c(d = 1, i = 1, s = 1, t = 1)),
  threshold = c(max = 3, use = 1),
  simulation_control = list(n.draws = 250, check_normality = FALSE)
)
train_similarity_based_reasoning(
  anonymized_data,
  num_allowed_codes = 1291,
  coding_index_w_codes,
  coding_index_without_codes = NULL,
  preprocessing = list(stopwords = NULL, stemming = NULL, strPreprocessing = TRUE,
    removePunct = FALSE),
  dist_type = c("wordwise", "substring", "fulltext"),
  dist_control = list(method = "osa", weight = c(d = 1, i = 1, s = 1, t = 1)),
  threshold = c(max = 3, use = 1),
  simulation_control = list(n.draws = 250, check_normality = FALSE)
)

Arguments

`anonymized_data`	`surveyCountsSubstringSimilarity` or `surveyCountsWordwiseSimilarity`
`num_allowed_codes`	the number of allowed codes in the target classification. There are 1286 categories in the KldB 2010 plus 5 special codes in both anonymized training data sets, so the default value is 1291.
`coding_index_w_codes`	a data.table with columns bezMale a character vector, contains masculine job titles from the coding index. bezFemale a character vector, contains feminine job titles from the coding index. Code a character vector with associated classification codes.
`coding_index_without_codes`	(not used, but automatically determined) Any words from `anonymized_data$dictString` that are not found within `coding_index_w_codes` belong into this character vector.
`preprocessing`	a list with elements stopwords a character vector, use `tm::stopwords("de")` for German stopwords. Only used if `dist_type = "wordwise"`. stemming `NULL` for no stemming and `"de"` for stemming using the German porter stemmer. Do not use unless the job titles in `coding_index_w_codes` were stemmed. strPreprocessing `TRUE` if `preprocess_string` shall be used. removePunct `TRUE` if `removePunctuation` shall be used.
`dist_type`	How to calculate similarity between entries from both coding_indices and verbal answers from the survey? Three options are currently supported. Since we use the `stringdist`-function excessively, one could easily extend the functionality of this procedure to other distance metrics. dist_type = "fulltext" Uses the `stringdist`-function directly after preprocessing to calculate distances. (the simplest approach but least useful.) dist_type = "substring" An entry from the coding index and a verbal answer are similar if the entry from the coding index is a substring of the verbal answer. dist_type = "wordwise" After preprocessing, split the verbal answer into words. Then calculate for each word separately the the similarity with entries from the coding index, using `stringdist`. Not the complete verbal answer but only the words (0 or more) that have highest similarity are then used to determine similarity with entries from the coding index.
`dist_control`	If `dist_type = "fulltext"` or `dist_type = "wordwise"` the entries from this list will be passed to `stringdist`. Currently only two possible entries are supported (method = "osa", weight = c(d = 1, i = 1, s = 1, t = 1) is recommended), but one could easily extend the functionality.
`threshold`	A numeric vector with two elements. If `dist_type = "fulltext"` or `dist_type = "wordwise"`, the threshold determines up to which distance a verbal answer and an entry from the coding index are similar. The second number actually gets used. The first number is only used to speed up similarity calculations. It should be identical or larger than the second number.
`simulation_control`	a list with two components, n.draws Number of draws from the posterior distribution to determine posterior predictive probabilities. The larger, the more precise the results will be. check_normality We would like that the hyperprior distribution is normal. Set check_normality to TRUE to do some diagnostics about this.

Value

a list with components

prediction.datasets$modelProb: Contains all entries from the coding index. dist = "official" if the entry stems from coding_index_w_codes and dist = selfcreated if the entry stems from coding_index_without_codes. string.prob is used for weighting purposes (model averaging) if a new verbal answer is similar to multiple strings. unobserved.mean.theta gives a probability (usually very low) for any category that was not observed in the training data together with this string.
prediction.datasets$categoryProb: mean.theta is the probability for code given that an incoming verbal answer is similar to string. Only available if this code was at least a single time observed with this string (Use unobserved.mean.theta otherwise).
num_allowed_codes: Number of categories in the classification.
preprocessing: The input parameter stored to replicate preprocessing with incoming data.
dist_type: The input parameter stored to replicate distance calculations with incoming data.
dist_control: The input parameter stored to replicate distance calculations with incoming data.
threshold: The input parameter stored to replicate distance calculations with incoming data.
simulation_control: The input parameters controlling the Monte Carlo simulation.

References

Schierholz, Malte (2019): New methods for job and occupation classification. Dissertation, Mannheim. https://madoc.bib.uni-mannheim.de/50617/, pp. 206-208 and p. 268, pp. 308-320

https://github.com/malsch/occupationCoding (function trainSimilarityBasedReasoning2 is implemented here)

Package 'occupationMeasurement'

Help Index

Make suggestions using similarity based reasoning.

Description

Usage

Arguments

Value

References

See Also

Examples

Start the occupation coding API.

Description

Usage

Arguments

Value

See Also

Examples

Get an instance of the interactive shiny occupation coding app.

Description

Usage

Arguments

Value

See Also

Examples

German Auxiliary Classification of Occupations (AuxCO) v1.2.3

Description

Usage

Format

References

See Also

Go to the next page

Description

Usage

Arguments

Value

Functions

See Also

Examples

Create app_settings.

Description

Usage

Arguments

Value

Examples

Get the final occupation codes

Description

Usage

Arguments

Details

Value

Examples

Get potential follow-up questions for a suggestion.

Description

Usage

Arguments

Value

Examples

Retrieve data for an item.

Description

Usage

Arguments

Value

See Also

Examples

Make coding suggestions based on a user's open-ended text input.

Description

Usage

Arguments

Details

Value

Examples

Convenience function to aggregate all saved results_overview files.

Description

Usage

Arguments

Value

Examples

Categories of the The International Standard Classification of Occupations - ISCO-08

Description

Usage