Package: cleaningtools 0.0.0.9003

Yann Say

cleaningtools: cleaningtools package focuses on data cleaning

The cleaningtools package focuses on data cleaning and has three components. **Check** includes a set of functions that flag values, such as check_outliers and check_logical. **Create** includes a set of functions that create items used in cleaning, such as the cleaning log from the checks, the clean data, and enumerator performance. **Review** includes a set of functions to review the cleaning.
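
The check, create, review sequence described above can be illustrated with a minimal base R sketch. It deliberately does not call any cleaningtools function, so it makes no claim about the package's actual signatures or outlier rule. The toy data, the column names, the 3-MAD threshold, and the corrected value are all assumptions made for illustration.

```r
# Toy survey data; the column names and values are invented for illustration.
raw_data <- data.frame(
  uuid = c("a1", "a2", "a3", "a4", "a5"),
  household_size = c(4, 5, 6, 5, 60),
  stringsAsFactors = FALSE
)

# Check: flag values far from the median.
# Assumption: "far" means more than 3 median absolute deviations;
# the package's own outlier rule may differ.
centre <- median(raw_data$household_size)
spread <- mad(raw_data$household_size)
flagged <- abs(raw_data$household_size - centre) > 3 * spread

# Create: build a cleaning log from the flagged values.
cleaning_log <- data.frame(
  uuid = raw_data$uuid[flagged],
  question = "household_size",
  old_value = raw_data$household_size[flagged],
  new_value = 6, # hypothetical correction decided by a reviewer
  stringsAsFactors = FALSE
)

# Create: apply the cleaning log to a copy of the raw data.
clean_data <- raw_data
for (i in seq_len(nrow(cleaning_log))) {
  row <- clean_data$uuid == cleaning_log$uuid[i]
  clean_data[row, cleaning_log$question[i]] <- cleaning_log$new_value[i]
}

# Review: every change between raw and clean data should appear in the log.
changed <- raw_data$uuid[raw_data$household_size != clean_data$household_size]
unlogged <- setdiff(changed, cleaning_log$uuid)
```

An empty `unlogged` vector means every change to the data is accounted for in the cleaning log, which is the kind of consistency the Review component is described as checking.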

Authors: Mehedi Khan [aut], Yann Say [cre, aut]

cleaningtools_0.0.0.9003.tar.gz
cleaningtools_0.0.0.9003.zip (r-4.5), cleaningtools_0.0.0.9003.zip (r-4.4), cleaningtools_0.0.0.9003.zip (r-4.3)
cleaningtools_0.0.0.9003.tgz (r-4.4-any), cleaningtools_0.0.0.9003.tgz (r-4.3-any)
cleaningtools_0.0.0.9003.tar.gz (r-4.5-noble), cleaningtools_0.0.0.9003.tar.gz (r-4.4-noble)
cleaningtools_0.0.0.9003.tgz (r-4.4-emscripten), cleaningtools_0.0.0.9003.tgz (r-4.3-emscripten)
cleaningtools.pdf | cleaningtools.html
cleaningtools/json (API)

# Install 'cleaningtools' in R:
install.packages('cleaningtools', repos = c('https://humanitarian-user-group.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/impact-initiatives/cleaningtools/issues

4.14 score, 3 stars, 29 scripts, 40 exports, 41 dependencies

Last updated 6 hours ago from: f917e60711. Checks: OK: 3, NOTE: 4. Indexed: no.

| Target | Result | Date |
|---|---|---|
| Doc / Vignettes | OK | Nov 25 2024 |
| R-4.5-win | OK | Nov 25 2024 |
| R-4.5-linux | OK | Nov 25 2024 |
| R-4.4-win | NOTE | Nov 25 2024 |
| R-4.4-mac | NOTE | Nov 25 2024 |
| R-4.3-win | NOTE | Nov 25 2024 |
| R-4.3-mac | NOTE | Nov 25 2024 |

Exports: %>%, add_duration, add_duration_from_audit, add_info_to_cleaning_log, add_percentage_missing, auto_detect_sm_parents, auto_sm_parent_children, check_duplicate, check_duration, check_fcs, check_logical, check_logical_with_list, check_others, check_others_checks, check_outliers, check_percentage_missing, check_pii, check_soft_duplicates, check_value, coerce_to_character, create_audit_list, create_clean_data, create_cleaning_log, create_col_range, create_combined_log, create_duration_from_audit_sum_all, create_duration_from_audit_with_start_end, create_formated_wb, create_formatted_choices, create_logic_for_other, create_validation_list, create_xlsx_cleaning_log, detect_variable, recreate_parent_column, review_cleaning, review_cleaning_log, review_others, review_sample_frame_with_dataset, verify_valid_choices, verify_valid_survey

Dependencies: assertthat, cli, cluster, colorspace, cpp11, curl, data.table, dplyr, fansi, farver, fs, generics, glue, jsonlite, labeling, lifecycle, magrittr, munsell, openxlsx, pillar, pkgconfig, purrr, R6, randomcoloR, RColorBrewer, Rcpp, rlang, Rtsne, scales, snakecase, stringi, stringr, tibble, tidyr, tidyselect, utf8, V8, vctrs, viridisLite, withr, zip

Readme and manuals

Help Manual

| Help page | Topics |
|---|---|
| Creates a duration variable using the start and end time of the survey | add_duration |
| Adds duration from the audit file | add_duration_from_audit |
| Add information to the cleaning log | add_info_to_cleaning_log |
| Adds the percentage of missing values per row | add_percentage_missing |
| Detect select multiple parent columns | auto_detect_sm_parents |
| detect and group together select multiple parent and children columns | auto_sm_parent_child, auto_sm_parent_children |
| Checks for duplicated values in columns | check_duplicate |
| Check if duration is outside of a range | check_duration |
| FCS component checks | check_fcs |
| Check a logical test | check_logical |
| Check several logical test | check_logical_with_list |
| Generate a log for other follow up questions | check_others |
| Check if the input passed to the check_others function is correct | check_others_checks |
| check outliers over the dataset | check_outliers |
| Check the percentages of missing value | check_percentage_missing |
| Checks for potential PII | check_pii |
| Checks for survey similarities - Soft Duplicates | check_soft_duplicates |
| Check for value(s) in the dataset | check_value |
| Analysis by population group | cleaningtools_analysis_by_group |
| Choices tab of kobo tool | cleaningtools_choices |
| Clean data | cleaningtools_clean_data |
| Cleaning log | cleaningtools_cleaning_log |
| Dataset with food consumption, household hunger Score component | cleaningtools_food_consumption_df |
| Nation/all population level analysis. | cleaningtools_overall_analysis |
| Raw data | cleaningtools_raw_data |
| Sample frame | cleaningtools_sample_frame |
| Survey tab of kobo tool | cleaningtools_survey |
| Coerce numeric values to character, without scientific noting and NA are kept as NA. | coerce_to_character |
| Read all audit files from a zip | create_audit_list |
| implement cleaning log on raw data set. | create_clean_data |
| Compares 2 dataset and logs differences | create_cleaning_log |
| Create a project folder with a cleaning template | create_cleaning_template |
| Generate excel range to be used for the data validation formula in excel | create_col_range |
| Merging the cleaning logs | create_combined_log |
| Calculate duration from audit summing all time | create_duration_from_audit_sum_all |
| Calculate duration from audit between 2 questions | create_duration_from_audit_with_start_end |
| Creates formatted workbook with openxlsx | create_formated_wb |
| Format and filter Choices for 'select_one' Questions | create_formatted_choices |
| Create logical checks for "other" values. | create_logic_for_other |
| Create a Validation List for Data Entry | create_validation_list |
| Creates formatted excel for cleaning log | create_xlsx_cleaning_log |
| detects variables names in code | detect_variable |
| This function recreates the columns for select multiple questions | recreate_parent_column |
| Review cleaning | review_cleaning |
| Review cleaning log | review_cleaning_log |
| Review discrepancy between kobo relevancies and the dataset. | review_others |
| Compares the sample frame with the clean data | review_sample_frame_with_dataset |
| Verify if the Kobo choices dataframe is valid | verify_valid_choices |
| Verify if the Kobo survey dataframe is valid | verify_valid_survey |