Title: | Tools for working with ODK XLSForms |
---|---|
Description: | Set of utility functions for use by EcoHealth Alliance researchers in working with Open Data Kit XLSForms. These functions are aimed at aiding users in the data cleaning and data validation process using information found in the ODK XLSForms. |
Authors: | Ernest Guevarra [aut, cre] |
Maintainer: | Ernest Guevarra <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.0.9000 |
Built: | 2024-11-23 03:32:40 UTC |
Source: | https://github.com/ecohealthalliance/formstools |
An example ODK form schema retrieved using ruODK
form_codebook
A tibble with 7 columns and 10 rows:
| **Variable** | **Description** |
| :--- | :--- |
| *path* | XML path to question in ODK |
| *name* | Variable name used in XForm |
| *type* | Variable type |
| *binary* | Are the values for the question binary? |
| *ruodk_name* | Variable name used by ruODK |
| *label* | Question used in XForm |
| *choices* | Named list of choices for select_one and select_multiple questions |
form_codebook
Get choices used for select_one and select_multiple questions in an XLSForm
get_choices(xlsform, choice_name = NULL)
| **Argument** | **Description** |
| :--- | :--- |
| *xlsform* | A character value for the path to the XLSForm |
| *choice_name* | A character value or vector naming the set(s) of choices in the specified XLSForm. Defaults to NULL, in which case all sets of choices are returned. |
A tibble with the 'list_name', 'name', and 'label' of the choices for the 'select_one' and 'select_multiple' questions in the specified XLSForm.
    ## Get all choices
    get_choices(
      xlsform = system.file(
        "extdata", "ghana_community_form.xlsx", package = "formstools"
      )
    )

    ## Get choices for gender
    get_choices(
      xlsform = system.file(
        "extdata", "ghana_community_form.xlsx", package = "formstools"
      ),
      choice_name = "gender"
    )
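The filtering behaviour described for 'choice_name' can be illustrated with a minimal base R sketch. It is not the package's implementation: `choices_sheet` is hypothetical toy data shaped like the 'choices' sheet of an XLSForm (columns 'list_name', 'name', 'label'), and `filter_choices()` is a hypothetical helper introduced only for illustration.

```r
# Hypothetical stand-in for the 'choices' sheet of an XLSForm
choices_sheet <- data.frame(
  list_name = c("yes_no", "yes_no", "gender", "gender"),
  name      = c("1", "0", "m", "f"),
  label     = c("Yes", "No", "Male", "Female"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return all choices when choice_name is NULL,
# otherwise keep only the rows belonging to the named choice set(s)
filter_choices <- function(choices, choice_name = NULL) {
  if (is.null(choice_name)) {
    return(choices)
  }
  choices[choices$list_name %in% choice_name, , drop = FALSE]
}

filter_choices(choices_sheet)                          # all choice sets
filter_choices(choices_sheet, choice_name = "gender")  # gender only
```

Because the filter uses `%in%`, a vector of several choice-set names can be passed in the same way as a single name.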
Get choices for a specified variable in an ODK form
get_choices_ruodk(form_schema, var_name = NULL, choice_name = NULL)
| **Argument** | **Description** |
| :--- | :--- |
| *form_schema* | A form schema created by 'ruODK::form_schema_ext()' |
| *var_name* | Name of the variable in the ODK form whose choices are to be retrieved. Defaults to NULL, in which case choices for all select_one and select_multiple questions are returned. |
| *choice_name* | Name of the variable in 'form_schema' that holds the information on choices. Forms with multiple languages will have a 'form_schema' with a separate choices column for each language. Defaults to NULL, which assumes that only one language is used in the form. |
A tibble with 'var_name', 'question', 'values', and 'labels'
    ## Get all choices for select_one and select_multiple type questions
    get_choices_ruodk(form_schema = form_codebook)

    ## Get choices for pizza2 variable
    get_choices_ruodk(form_schema = form_codebook, var_name = "pizza2")
Get select_one and select_multiple questions in an XLSForm
get_questions(xlsform, choice_name = NULL)
| **Argument** | **Description** |
| :--- | :--- |
| *xlsform* | A character value for the path to the XLSForm |
| *choice_name* | A character value or vector naming the set(s) of choices in the specified XLSForm. Defaults to NULL, in which case all 'select_one' and 'select_multiple' questions are returned. |
A tibble with the 'type', 'list_name', 'name', and 'label' of the 'select_one' and 'select_multiple' questions in the specified XLSForm.
    ## Get all select_one and select_multiple question types
    get_questions(
      xlsform = system.file(
        "extdata", "ghana_community_form.xlsx", package = "formstools"
      )
    )

    ## Get all select_one and select_multiple question types for gender list_name
    get_questions(
      xlsform = system.file(
        "extdata", "ghana_community_form.xlsx", package = "formstools"
      ),
      choice_name = "gender"
    )
Match other responses to specified choices
match_other_to_choices( form_schema, form_data, var_name, choice_name = NULL, other_var_name, id = "id" )
| **Argument** | **Description** |
| :--- | :--- |
| *form_schema* | A form schema created by 'ruODK::form_schema_ext()' |
| *form_data* | A tibble of the data retrieved from ODK Central using ruODK |
| *var_name* | Name of the variable in the ODK form whose choices are to be matched against |
| *choice_name* | Name of the variable in 'form_schema' that holds the information on choices. Forms with multiple languages will have a 'form_schema' with a separate choices column for each language. Defaults to NULL, which assumes that only one language is used in the form. |
| *other_var_name* | Name of the variable in 'form_data' that holds the "other" response for 'var_name' |
| *id* | A character vector of variable names in 'form_data' to use as identifying data for the output. Defaults to "id". |
A tibble with one column for each possible choice for 'var_name' in 'form_data', plus one column for each variable in 'id' used to identify each row of the output. The number of rows equals the number of rows in 'form_data'. The names of the choice columns are the concatenation of 'recode_' and the choice values for 'var_name'.
    ## Match other toppings responses (pizza3) to choices for
    ## pizza toppings (pizza2)
    match_other_to_choices(
      form_schema = form_codebook,
      form_data = pizza_data,
      var_name = "pizza2",
      other_var_name = "pizza3"
    )
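The shape of the returned tibble can be illustrated with a minimal base R sketch. The package's actual matching logic is not described here, so this sketch assumes a simple case-insensitive substring match; the `choices`, `other`, and `ids` values are hypothetical toy data, not taken from 'pizza_data'.

```r
# Hypothetical choice values and free-text "other" responses
choices <- c("cheese", "olives", "pineapple")
other   <- c("Extra Cheese", "anchovies", NA, "pineapple and olives")
ids     <- 1:4  # hypothetical row identifiers

# Normalise the free text before matching
clean <- tolower(trimws(other))

# Assumed matching rule: a choice is matched when its value appears
# anywhere in the cleaned text; grepl() returns FALSE for NA input
matched <- sapply(choices, function(choice) {
  as.integer(grepl(choice, clean, fixed = TRUE))
})

# One identifier column plus one 'recode_' column per choice
recoded <- data.frame(id = ids, matched)
names(recoded)[-1] <- paste0("recode_", choices)
recoded
```

Responses that match none of the choices (here "anchovies") end up with 0 in every 'recode_' column, which makes them easy to pick out for manual review.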
Example data collected using ODK and aggregated using ODK Central
pizza_data
A tibble with 21 columns and 17 rows:
| **Variable** | **Description** |
| :--- | :--- |
| *id* | Identifier |
| *start* | Start date and time |
| *end* | End date and time |
| *today* | Date of data entry |
| *pizza1* | Do you like pizza? |
| *pizza2* | Pizza toppings you like (select multiple) |
| *pizza3* | Any other toppings? |
| *closing* | Closing statement |
| *meta_audit* | Meta audit |
| *meta_instance_id* | Meta instance ID |
| *system_submission_date* | System submission date |
| *system_updated_at* | System updated at |
| *system_submitter_id* | System submitter ID |
| *system_submitter_name* | System submitter name |
| *system_attachments_present* | System attachments present |
| *system_attachments_expected* | System attachments expected |
| *system_status* | System status |
| *system_review_state* | System review state |
| *system_device_id* | System device ID |
| *system_edits* | System edits |
| *odata_context* | OData context |
pizza_data
Split a vector of values from an ODK select multiple type of response
split_multiple_response(x, sep = " ", fill, na_rm = FALSE, prefix)
split_multiple_responses(x, sep = " ", fill, na_rm = FALSE, prefix)
| **Argument** | **Description** |
| :--- | :--- |
| *x* | A select_multiple response or a vector of select_multiple responses |
| *sep* | Separator used to separate multiple responses. Defaults to " ". Regular expressions can be used to detect more than one possible separator. |
| *fill* | A vector of all the possible responses to the select multiple question |
| *na_rm* | Logical. Should an NA response be reported in its own column? Defaults to FALSE. |
| *prefix* | A character value to prepend to the names of the resulting data.frame |
A tibble with one column for each possible choice for the select multiple question (the same as the length of 'fill') and number of rows equal to the length of 'x'. The column names are the concatenation of 'prefix' and the values of 'fill'.
    ## Split the multiple responses of pizza toppings
    split_multiple_responses(
      x = pizza_data$pizza2,
      sep = ", ",
      fill = c("cheese", "tomatoes", "pepperoni", "mushrooms",
               "artichoke", "olives", "pineapple", "other"),
      prefix = "toppings"
    )
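The idea behind splitting a select_multiple response into one column per choice can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: the 0/1 coding, the "_" joiner in the column names, and the toy values of `x` are all assumptions.

```r
# Hypothetical select_multiple responses (not taken from pizza_data)
x    <- c("cheese, olives", "pepperoni", NA, "cheese, other")
fill <- c("cheese", "pepperoni", "olives", "other")

# Split each response on the separator (treated as a regular expression)
parts <- strsplit(x, split = ", ")

# One indicator column per possible choice; NA responses stay NA
indicators <- sapply(fill, function(choice) {
  vapply(parts, function(p) {
    if (all(is.na(p))) NA_integer_ else as.integer(choice %in% p)
  }, integer(1))
})

out <- as.data.frame(indicators)
names(out) <- paste("toppings", fill, sep = "_")
out
```

Supplying the full set of possible choices through `fill` guarantees a column for every choice, even one that no respondent selected.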
Transform a vector of categorical responses into 'n' new columns/variables, where 'n' is the number of unique categorical values
spread_vector_to_columns(x, fill = NULL, na_rm = FALSE, prefix)
| **Argument** | **Description** |
| :--- | :--- |
| *x* | Vector of categorical values |
| *fill* | A vector of all the possible responses to the select multiple question |
| *na_rm* | Logical. Should an NA response be reported in its own column? Defaults to FALSE. |
| *prefix* | A character string to prepend to the names of the new columns to be created |
A tibble with one column for each unique value of 'x' and number of rows equal to the length of 'x'. The column names are the concatenation of 'prefix' and the values of 'x'.
    spread_vector_to_columns(
      x = c("cat", "cat", "dog", "dog", "dog", NA_character_),
      fill = c("cat", "dog", "monkey"),
      na_rm = TRUE,
      prefix = "pets"
    )
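The spreading step can be approximated in base R with `outer()`, which compares every value of 'x' with every possible category. This is a sketch only: the 0/1 coding, the "_" joiner in the column names, and the handling of NA (left as NA in every column) are assumptions and may differ from what the package returns.

```r
x    <- c("cat", "cat", "dog", "dog", "dog", NA_character_)
fill <- c("cat", "dog", "monkey")

# Compare each element of x with each category; multiplying by 1L
# converts the logical matrix to integers (NA comparisons stay NA)
indicators <- outer(x, fill, FUN = "==") * 1L
colnames(indicators) <- paste("pets", fill, sep = "_")

out <- as.data.frame(indicators)
out
```

The "monkey" category never occurs in `x`, yet it still gets its own column because it is listed in `fill`.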