Title: | A grammar of hypothesis test driven analysis |
---|---|
Description: | Quantitative analysis according to the IMPACT minimum standards. Accepts weights and input from kobo questionnaires. |
Authors: | Eliora Henzler [aut, cre] |
Maintainer: | Eliora Henzler <[email protected]> |
License: | GPL-3 |
Version: | 0.2.2 |
Built: | 2024-11-24 04:26:27 UTC |
Source: | https://github.com/mabafaba/hypegrammaR |
each repetition gets its own analysisplan row
analysisplan_expand_repeat(analysisplan, data)
Combine weight functions from two sampling frames
combine_weighting_functions(weight_function_1, weight_function_2)
weight_function_1 |
first weighting function |
weight_function_2 |
second weighting function |
returns a new function that takes a data frame as input and returns a vector of weights corresponding to each row in the data frame.
Applies basic sanitation to data before summary statistics or hypothesis test can be applied
datasanitation_design(design, dependent.var, independent.var, sanitation_function)
design |
the design object |
dependent.var |
a string containing the dependent variable in the analysis case |
independent.var |
a string containing the independent variable in the analysis case |
sanitation_function |
the function containing all the checks for the analysis function in question |
returns the cleaned data with a sanitation success or failure message
Takes all usual hypegrammaR input files plus an analysis plan and maps directly to an output document
from_analysisplan_map_to_output(data, analysisplan, weighting = NULL, cluster_variable_name = NULL, questionnaire = NULL, labeled = FALSE, verbose = TRUE, confidence_level = 0.95)
data |
the data set as a data frame (load_data()) |
analysisplan |
the analysisplan (load_analysisplan()) |
weighting |
optional: the weighting function (use load_samplingframe() and then map_to_weighting()) |
cluster_variable_name |
optional: the name of the variable with the cluster IDs |
questionnaire |
optional: the questionnaire (load_questionnaire()) |
labeled |
should the results display labels rather than xml names? defaults to FALSE, requires the questionnaire |
verbose |
should progress be printed to the console? (default TRUE, slightly faster if FALSE) |
confidence_level |
the confidence level to be used for confidence intervals (default: 0.95) |
returns a list of hypegrammaR "result" objects (see map_to_result())
Grouped barchart for percentages
grouped_barchart_percent(result)
A grammar of hypothesis driven analysis, following the idea that there is only one test
Supports integration of weighted data (using the survey package) and data collected with kobotoolbox, ODK or similar. Executes the three main steps of data analysis
summary statistics
hypothesis tests
preparation for visualisation
The user begins by loading the data, and if needed the questionnaire, analysisplan and sampling frame (as .csv files). To verify the correct format of these inputs, name
can be used.
All other functions then refer to these objects.
The two possible workflows are: using the individual functions (the blocks) one by one, or mapping directly to the results, which runs all the blocks automatically.
map_to_result
is the overall function that executes the three steps of data analysis
map_to_file
maps from the result to a csv or jpg file
from_analysisplan_map_to_output
applies map_to_result and map_to_file for all rows in the data analysis plan
Preparing your data
Summary statistics (examples)
mean_with_confints_select_one_groups
Hypothesis tests (examples)
Visualise (examples)
barchart_percent, gg_heatmap_generic
Perform a chi squared test on a select multiple question against a select one question.
hypothesis_test_chisquared_select_multiple(dependent.var, dependent.var.sm.cols, independent.var, design, questionnaire = NULL)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be a 'select multiple'. |
design |
the svy design object created using map_to_design or directly with svydesign |
independent.var |
string with the column name in 'data' of the independent variable. Should be a 'select one' with few (<15) categories. |
A list with the results of the test (Chi Squared statistics, p value) or the error message.
## Not run: hypothesis_test_chisquared_select_one("population_group", "resp_gender", design)
hypothesis_test_chisquared_select_one Perform a chi squared test on a select one question against another.
hypothesis_test_chisquared_select_one(dependent.var, independent.var, design)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be a 'select one'. |
design |
the svy design object created using map_to_design or directly with svydesign |
independent.var |
string with the column name in 'data' of the independent variable. Should be a 'select one' with few (<15) categories. |
A list with the results of the test (Chi Squared statistics, p value) or the error message.
## Not run: hypothesis_test_chisquared_select_one("population_group", "resp_gender", design)
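For orientation, the following is a minimal base R sketch of a chi-squared test of one categorical variable against another. It is only an unweighted analogue: the package function works on a survey design object, whereas this sketch ignores weights and clustering. The toy data and the shape of the `result` list are assumptions made for illustration.

```r
# Toy data (made up for illustration): 50 records per population group.
toy <- data.frame(
  population_group = rep(c("idp", "host"), each = 50),
  resp_gender = c(rep(c("female", "male"), times = c(30, 20)),
                  rep(c("female", "male"), times = c(20, 30)))
)

# Cross-tabulate the dependent against the independent variable.
crosstab <- table(toy$population_group, toy$resp_gender)

# Pearson chi-squared test of independence, without continuity correction.
test <- chisq.test(crosstab, correct = FALSE)

# Collect the statistic and p value in a list (hypothetical layout).
result <- list(statistic = unname(test$statistic), p.value = test$p.value)
result
```

With real survey data the weights and the sampling design change both the statistic and the p value, so this sketch should not be used in place of the design-based function.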
Perform a one sample t test of one numerical variable against hypothesised value (limit)
hypothesis_test_t_one_sample(dependent.var, independent.var = NULL, limit, design)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be numerical. |
independent.var |
should be NULL. For other functions: string with the column name in 'data' of the independent variable |
limit |
the value to test the dependent.var against |
design |
the svy design object created using map_to_design or directly with svydesign |
A list with the results of the test (T-value, p value, etc.) or the error message.
## Not run: hypothesis_test_t_one_sample("males_13_15", limit = 4, design = design)
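As a rough illustration of what testing a numerical variable against a hypothesised value means, here is an unweighted base R sketch using `t.test()` with the `mu` argument in the role of `limit`. The numbers are made up, and the package function itself operates on a survey design object, which this sketch does not reproduce.

```r
# Made-up values standing in for a numerical column such as "males_13_15".
males_13_15 <- c(3, 5, 4, 6, 2, 5, 7, 4, 5, 6)

# The hypothesised value to test against (the role played by 'limit').
limit <- 4

# One-sample t test: is the mean different from the limit?
test <- t.test(males_13_15, mu = limit)

# Collect the main outputs in a list (hypothetical layout).
result <- list(
  mean    = unname(test$estimate),
  t       = unname(test$statistic),
  p.value = test$p.value
)
result
```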
Perform a two sample t test of one numerical variable across multiple groups
hypothesis_test_t_two_sample(dependent.var, independent.var, design)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be numerical. |
design |
the svy design object created using map_to_design or directly with svydesign |
independent.var |
string with the column name in 'data' of the independent variable. Should be a 'select one' with few (<15) categories. |
A list with the results of the test (T-value, p value, etc.) or the error message.
## Not run: hypothesis_test_t_two_sample("males_13_15", "resp_gender", design)
presentable p-value format
label_pvalue(x, digits = 3)
Add labels to results
labels_summary_statistic(summary.statistic, questionnaire, label.dependent.var.value = T, label.independent.var.value = T, label.dependent.var = T, label.independent.var = T, independent.linebreak = T, dependent.linebreak = F)
questionnaire |
koboquest 'questionnaire' object; output from load_questionnaire() |
result |
hypegrammaR 'result' object; output from map_to_result(). |
if the variable wasn't found in the questionnaire, or the choice wasn't found in the corresponding list of choices, the affected values will remain unchanged.
same as input, but with all variable values labeled
Load an analysis plan from a csv file
load_analysisplan(file = NULL, df = NULL)
file |
path to a csv file with the analysis plan |
df |
alternative to 'file', you can provide the analysis plan as a data frame |
The analysis plan csv file must contain the following column headers: "repeat.for.variable","research.question", "sub.research.question", "hypothesis", "independent.variable", "dependent.variable", "hypothesis.type", "independent.variable.type", "dependent.variable.type". You can generate an empty template with
load assessment data
load_data(file)
file |
path to a csv file with the assessment data |
the data _must_ be in standard kobo format with xml style headers.
the data from the csv files as data frame. Column header symbols are changed to lowercase alphanumeric and underscore; everything else is converted to a "."
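The header rule described above can be sketched in a few lines of base R. The helper name `sketch_clean_headers` is hypothetical and not part of the package; it only illustrates the stated rule (lowercase, keep alphanumerics and underscores, turn everything else into a "."), and the package's own implementation may differ in detail.

```r
# Hypothetical helper, not part of hypegrammaR: illustrates the header rule.
sketch_clean_headers <- function(headers) {
  # Convert everything to lowercase first.
  headers <- tolower(headers)
  # Replace every character that is not a-z, 0-9 or "_" with a ".".
  gsub("[^a-z0-9_]", ".", headers)
}

raw_headers <- c("Resp Gender", "group/age_1", "Label::English")
clean_headers <- sketch_clean_headers(raw_headers)
clean_headers
```

Under this rule a kobo group separator such as "/" becomes ".", so variable names in the analysis plan would have to be written in the cleaned form.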
load_questionnaire
load_questionnaire(data, questions, choices, choices.label.column.to.use = NULL)
data |
data frame containing the data matching the questionnaire to be loaded. |
questions |
data frame or file name of a csv file containing the kobo form's question sheet |
choices |
data frame or file name of a csv file containing the kobo form's choices sheet |
choices.label.column.to.use |
The choices csv file has (sometimes multiple) columns with labels. They are often called "Label::English" or similar. Here you need to provide the _name of the column_ that you want to use for labels (see example!) |
A list containing the original questionnaire questions and choices, the choices matched 1:1 with the data columns, and all functions created by this function relating to the specific questionnaire (they are written to the global space too, but you can use these when using multiple questionnaires in parallel.)
## Not run:
load_questionnaire(mydata, questions = "koboquestions.csv", choices = "kobochoices.csv", choices.label.column.to.use = "Label::English")
## End(Not run)
Load a sampling frame from csv
load_samplingframe(file)
file |
the path and name of the sampling frame csv file to load. |
loads the sampling frame, which can then be used to make weights with map_to_weighting()
## Not run: sf <- load_samplingframe("./somefolder/samplingframe.csv")
creates a string that other functions can use to know what analysis case they are dealing with
map_to_case(hypothesis.type, dependent.var.type = NULL, independent.var.type = NULL)
hypothesis.type |
The hypothesis type. Must be one of "group_difference" or "direct_reporting". |
dependent.var.type |
The type of the dependent variable as a string. must be either "numerical" or "categorical" |
independent.var.type |
The type of the independent variable as a string. must be either "numerical" or "categorical" |
a string that other functions can use to know what analysis case they are dealing with. It has a class "analysis_case" assigned
## Not run: map_to_case("group_difference","categorical","categorical")
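To make the idea of a case string concrete, here is a small base R sketch. The function `sketch_case` is hypothetical, and the exact format of the string is an assumption based on the example value "group_difference_categorical_categorical" that appears elsewhere in this manual; the real map_to_case() may add prefixes or handle missing types differently.

```r
# Hypothetical stand-in for map_to_case(); the string format is an assumption.
sketch_case <- function(hypothesis.type,
                        dependent.var.type = NULL,
                        independent.var.type = NULL) {
  # Only the two documented hypothesis types are accepted.
  stopifnot(hypothesis.type %in% c("group_difference", "direct_reporting"))
  # NULL arguments are dropped by c(), so they leave no trace in the string.
  parts <- c(hypothesis.type, dependent.var.type, independent.var.type)
  case <- paste(parts, collapse = "_")
  # Tag the string so downstream functions can recognise it.
  class(case) <- "analysis_case"
  case
}

my_case <- sketch_case("group_difference", "categorical", "categorical")
as.character(my_case)
```

Carrying the whole analysis case in one classed string lets later steps pick a summary statistic, test and visualisation from a single value.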
creates a 'survey' design object from the data
map_to_design(data, cluster_variable_name = NULL, weighting_function = NULL)
data |
the dataset as a data frame. Must match the sampling frame provided to create the 'weighting_function' produced with 'map_to_weighting()' |
cluster_variable_name |
if cluster sampling was used, what's the name of the column in 'data' that identifies the cluster? |
weighting_function |
optional: the weighting function created with map_to_weighting() |
create a 'survey' package design object from the data and information on the sampling strategy
a 'survey' package design object
## Not run: map_to_design(data,cluster_variable_name="cluster_id")
Save outputs to files
map_to_file(object, filename, ...)
object |
The object you want to save as a file |
filename |
The name of the file that is produced. The extension needs to match the type of object you want to save (csv for tables, jpg/pdf for images) |
the object that was given as input (unchanged).
## Not run:
# some table:
mytable <- data.frame(a = 1:10, b = 1:10)
map_to_file(mytable, "mytable.csv")
# some graphic made with ggplot:
mygraphic <- ggplot(mytable, aes(a, b)) + geom_point()
map_to_file(mygraphic, "visualisation.jpg")
map_to_file(mygraphic, "visualisation.pdf")
## End(Not run)
html from resultlist with results in specified hierarchical order based on analysisplan
map_to_generic_hierarchical_html(resultlist, render_result_with, by_analysisplan_columns = c("dependent.var"), by_prefix = c("", "subset:", "variable:"), level = 2, questionnaire = NULL, label_varnames = TRUE, dir = "./", filename)
resultlist |
structure like the output from from_analysisplan_map_to_output: A list with two items "analysisplan" and "results": The "analysisplan" as a data frame, where each row must match a result in a list of "results" |
render_result_with |
a function that takes a single result as input and returns an rmarkdown formatted string |
by_analysisplan_columns |
vector of strings matching column names of the analysisplan. The first element becomes the main heading, the second element the sub-heading etc. |
by_prefix |
a prefix added at the beginning of the headline; same length as 'by_analysisplan_columns' |
level |
the markdown header level to start with; defaults to 2 which leads to "## heading", i.e. the second header level. |
questionnaire |
optional; the questionnaire (koboquest::load_questionnaire()) |
label_varnames |
whether variable names should be labeled in headings |
dir |
the directory in which to save the output file (absolute path or relative to current working directory) |
filename |
the name of the file. must end in '.html' |
type |
the type of report template to use. Currently one of "full", "visual" or "summary" |
selects an appropriate hypothesis test function based on the analysis case
map_to_hypothesis_test(design, dependent.var, independent.var, case, questionnaire = NULL, limit = NULL)
case |
a string uniquely identifying the analysis case. output of map_to_case(). |
a _function_ that computes the relevant hypothesis test
Add labels to results
map_to_labeled(result, questionnaire)
result |
hypegrammaR 'result' object; output from map_to_result(). |
questionnaire |
koboquest 'questionnaire' object; output from load_questionnaire() |
if the variable wasn't found in the questionnaire, or the choice wasn't found in the corresponding list of choices, the affected values will remain unchanged.
same as 'result' input, but with all variable values labeled
Make the master table of summary stats and hypothesis tests
map_to_master_table(results_object, filename, questionnaire = NULL)
results_object |
a list containing one or more hypegrammaR result objects: the output of map_to_result |
filename |
The name of the file that is produced. The extension needs to be ".csv". |
questionnaire |
optional: the questionnaire obtained by load_questionnaire. Necessary if you want labeled results |
a dataframe containing the summary statistics and p values for each element in results.
Produce summary statistics, hypothesis tests and plot objects for a hypothesis
map_to_result(data, dependent.var, independent.var = NULL, case, cluster.variable.name = NULL, weighting = function(df) { rep(1, nrow(df)) }, questionnaire = NULL, confidence_level = 0.95)
data |
the data as a data.frame. Must match the sampling frame used to produce the 'weighting' as well as the questionnaire if applicable. |
dependent.var |
string with the column name in "data" of the dependent variable |
case |
the analysis case, created with map_to_case(). |
cluster.variable.name |
if cluster sampling, provide the name of the variable in the dataset that denotes the cluster |
weighting |
A function that generates weights from a dataframe. You can create it with surveyweights::weighting_fun_from_samplingframe() |
questionnaire |
output from load_questionnaire() |
confidence_level |
the confidence level to be used for confidence intervals (default: 0.95) |
independent.var |
string with the column name in 'data' of the independent variable |
- takes as parameters outputs from
  - load_data()
  - map_to_case()
  - load_samplingframe()
  - load_questionnaire()
- output can be processed by:
  - map_to_labeled()
  - map_to_visualisation()
  - map_to_table()
  - map_to_master_table()
  - map_to_visualisation_heatmap()
A list with the summary.statistic and the hypothesis.test result
selects an appropriate summary statistic function based on the analysis case
map_to_summary_statistic(design, dependent.var, independent.var, case, questionnaire = NULL, confidence_level = 0.95)
design |
the design object (map_to_design()) |
dependent.var |
the name of the dependent variable |
independent.var |
the name of the independent variable |
case |
a string uniquely identifying the analysis case. output of map_to_case(). |
questionnaire |
the questionnaire (from load_questionnaire()) |
confidence_level |
the confidence level to be used for confidence intervals (default: 0.95) |
a _function_ that computes the relevant summary statistic
## Not run:
map_to_summary_statistic("group_difference_categorical_categorical")

my_case <- map_to_case( ... )
my_sumstat <- map_to_summary_statistic(my_case)
my_sumstat( ... )
## End(Not run)
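The pattern of selecting a function from a case string can be sketched in base R with `switch()`. Everything here is hypothetical: the selector `sketch_select_summary_statistic`, the two case strings, and the stand-in statistics are assumptions chosen for illustration and do not reflect the package's internal table of cases.

```r
# Hypothetical selector: returns a *function* chosen by the case string.
sketch_select_summary_statistic <- function(case) {
  switch(as.character(case),
    # Numerical variable, no grouping: weighted mean.
    direct_reporting_numerical = function(x, w) weighted.mean(x, w),
    # Categorical variable, no grouping: weighted share of each category.
    direct_reporting_categorical = function(x, w) tapply(w, x, sum) / sum(w),
    # Any other case string is rejected explicitly.
    stop("no summary statistic known for case: ", case)
  )
}

my_sumstat <- sketch_select_summary_statistic("direct_reporting_numerical")
my_sumstat(c(1, 3), c(1, 3))
```

Returning a function rather than a value keeps the choice of method separate from its execution, so the same selection step can serve many variables.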
Make the master table of summary stats
map_to_summary_table(results_object, filename, questionnaire = NULL)
results_object |
a list containing one or more hypegrammaR result objects: the output of map_to_result |
filename |
The name of the file that is produced. The extension needs to be ".csv". |
questionnaire |
optional: the questionnaire obtained by load_questionnaire. Necessary if you want labeled results |
a dataframe containing the summary statistics for each element in results.
results as a table
map_to_table(result)
result |
a hypegrammaR 'result' object produced by map_to_result |
a data frame with only the summary statistics
Map results to an output template
map_to_template(x, questionnaire = NULL, dir, type = NULL, filename, custom_template = NULL)
x |
hypegrammaR result or list of results (created with map_to_result() or from_analysisplan_map_to_output()) |
questionnaire |
optional: the questionnaire (load_questionnaire()) |
dir |
the directory in which to save the output file (absolute path or relative to current working directory) |
type |
the type of report template to use, as a string. Currently one of "full", "visual" or "summary". Can be omitted if custom template is used |
filename |
the name of the file. must end in '.html' |
custom_template |
optional: the full path to the custom template to use (must be an RMD file in the templates folder) |
selects an appropriate visualisation function based on the analysis case
map_to_visualisation(result)
result |
a result object containing the summary statistics and hypothesis tests for the case. |
a _function_ that creates the relevant ggplot object
## Not run:
map_to_visualisation("result_var1")

result_var1 <- map_to_result( ... )
my_vis_fun <- map_to_visualisation(result_var1)
my_ggplot_obj <- my_vis_fun( ... )
my_ggplot_obj # plots the object
## End(Not run)
Heatmaps from 'result' objects
map_to_visualisation_heatmap(result)
result |
a hypegrammaR result object (can be made with map_to_result()) |
to add labels, use 'myresult
A hypegrammaR visualisation object, which is a list with two elements, 1) a ggplot object and 2) recommended parameters to pass to ggsave.
creates a weighting function from a sampling frame
map_to_weighting(sampling.frame, data.stratum.column, sampling.frame.population.column = "population", sampling.frame.stratum.column = "stratum", data = NULL)
data.stratum.column |
data column name that holds the record's strata names |
sampling.frame.population.column |
sampling frame name of column holding population counts. defaults to "population" |
sampling.frame.stratum.column |
sampling frame name of column holding stratum names. defaults to "stratum". Stratum names must match exactly values in: |
data |
optional but recommended: you can provide an example data frame of data supposed to match the sampling frame to check if the provided variable names match and whether all strata in the data appear in the sampling frame. |
sampling.frame |
data frame containing the sampling frame. should contain columns "stratum" and "population", otherwise column names must be specified. |
returns a new function that takes a data frame as input and returns a vector of weights corresponding to each row in the data frame.
## Not run:
# load data and sampling frames:
mydata <- read.csv("mydata.csv")
mysamplingframe <- read.csv("mysamplingframe.csv")
# create weighting function:
weighting <- weighting_fun_from_samplingframe(sampling.frame = mysamplingframe,
                                              data.stratum.column = "strata_names",
                                              sampling.frame.population.column = "pop",
                                              sampling.frame.stratum.column = "strat_name")
# use weighting function:
mydata$weights <- weighting(mydata)
# this also works on subsets of the data:
mydata_subset <- mydata[1:100, ]
subset_weights <- weighting(mydata_subset)
## End(Not run)
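The idea of a function that returns a weighting function can be sketched in base R as a closure. The helper `sketch_weighting` is hypothetical, and the weight formula (a stratum's population share divided by its sample share, so that weights sum to the number of records) is a common convention assumed here, not a statement of how the package computes or normalises its weights.

```r
# Hypothetical closure-based sketch; not the package implementation.
sketch_weighting <- function(sampling.frame, data.stratum.column,
                             sampling.frame.population.column = "population",
                             sampling.frame.stratum.column = "stratum") {
  # Population per stratum, as a named numeric vector.
  pop <- setNames(sampling.frame[[sampling.frame.population.column]],
                  as.character(sampling.frame[[sampling.frame.stratum.column]]))
  function(df) {
    strata <- as.character(df[[data.stratum.column]])
    # Every stratum in the data must appear in the sampling frame.
    unknown <- setdiff(unique(strata), names(pop))
    if (length(unknown) > 0) {
      stop("strata missing from sampling frame: ", paste(unknown, collapse = ", "))
    }
    n_sampled <- table(strata)
    # Share of the population and of the sample held by each record's stratum.
    pop_share <- pop[strata] / sum(pop[names(n_sampled)])
    sample_share <- as.numeric(n_sampled[strata]) / length(strata)
    unname(pop_share / sample_share)
  }
}

# Made-up sampling frame and data.
mysamplingframe <- data.frame(stratum = c("a", "b"), population = c(900, 100))
mydata <- data.frame(strata_names = rep(c("a", "b"), each = 5))

weighting <- sketch_weighting(mysamplingframe, data.stratum.column = "strata_names")
mydata$weights <- weighting(mydata)
mydata$weights
```

Because the weights are computed from whatever data frame is passed in, calling the same function on a subset recalculates them for that subset.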
Weighted means with confidence intervals
mean_with_confints(dependent.var, independent.var = NULL, design, confidence_level = 0.95)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be a numerical variable. |
independent.var |
should be NULL. For other functions: string with the column name in 'data' of the independent variable |
design |
the svy design object created using map_to_design or directly with svydesign |
confidence_level |
the confidence level to be used for confidence intervals (default: 0.95) |
This function takes the design object and the name of your dependent variable when the latter is numerical. It calculates the weighted mean for your variable.
A table in long format of the results, with the column names dependent.var, dependent.var.value (=NA), independent.var (= NA), independent.var.value (= NA), numbers (= mean), se, min and max.
Weighted means with confidence intervals for groups
mean_with_confints_groups(dependent.var, independent.var, design, confidence_level = 0.95)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be a numerical variable. |
independent.var |
string with the column name in 'data' of the independent (group) variable. Should be a 'select one' |
design |
the svy design object created using map_to_design or directly with svydesign |
confidence_level |
the confidence level to be used for confidence intervals (default: 0.95) |
This function takes the design object and the name of your dependent variable when the latter is numerical. It calculates the weighted mean for your variable within each group of the independent variable.
A table in long format of the results, with the column names dependent.var, dependent.var.value (=NA), independent.var, independent.var.value, numbers (= mean), se, min and max.
Weighted percentages with confidence intervals for select multiple questions
percent_with_confints_select_multiple(dependent.var, dependent.var.sm.cols, design, na.rm = TRUE, confidence_level = 0.95)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be a 'select multiple'. |
dependent.var.sm.cols |
a vector with the columns indices of the choices for the select multiple question. Can be obtained by calling choices_for_select_multiple(question.name, data) |
design |
the svy design object created using map_to_design or directly with svydesign |
confidence_level |
the confidence level to be used for confidence intervals (default: 0.95) |
this function takes the design object and the name of your dependent variable when this one is a select multiple. It calculates the weighted percentage for each category.
A table in long format of the results, with the column names dependent.var, dependent.var.value, independent.var (= NA), independent.var.value (= NA), numbers, se, min and max.
Weighted percentages with confidence intervals for groups (select multiple questions)
percent_with_confints_select_multiple_groups(dependent.var, dependent.var.sm.cols, independent.var, design, na.rm = TRUE, confidence_level = 0.95)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be a 'select multiple'. |
dependent.var.sm.cols |
a vector with the columns indices of the choices for the select multiple question. Can be obtained by calling choices_for_select_multiple(question.name, data) |
independent.var |
string with the column name in 'data' of the independent (group) variable. Should be a 'select one' |
design |
the svy design object created using map_to_design or directly with svydesign |
confidence_level |
the confidence level to be used for confidence intervals (default: 0.95) |
this function takes the design object and the name of your dependent variable when this one is a select multiple. It calculates the weighted percentage for each category.
A table in long format of the results, with the column names dependent.var, dependent.var.value, independent.var (= NA), independent.var.value (= NA), numbers, se, min and max.
Weighted percentages with confidence intervals
percent_with_confints_select_one(dependent.var, independent.var = NULL, design, na.rm = TRUE, confidence_level = 0.95)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be a 'select one' |
independent.var |
should be NULL. For other functions: string with the column name in 'data' of the independent variable |
design |
the svy design object created using map_to_design or directly with svydesign |
confidence_level |
the confidence level to be used for confidence intervals (default: 0.95) |
this function takes the design object and the name of your dependent variable when this one is a select one. It calculates the weighted percentage for each category.
A table in long format of the results, with the column names dependent.var, dependent.var.value, independent.var, independent.var.value, numbers, se, min and max.
## Not run: percent_with_confints_select_one("population_group", design = design)
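The core calculation, a weighted share for each category of a select one variable, can be sketched in base R. The helper `sketch_weighted_percent` and the toy data are hypothetical. The sketch returns proportions in the `numbers` column and leaves out `se`, `min` and `max`, since standard errors and confidence intervals depend on the survey design and are not reproduced here.

```r
# Hypothetical helper: weighted share of each category, in long format.
sketch_weighted_percent <- function(data, dependent.var, weights) {
  # Sum the weights within each category of the dependent variable.
  totals <- tapply(weights, data[[dependent.var]], sum)
  data.frame(
    dependent.var         = dependent.var,
    dependent.var.value   = names(totals),
    independent.var       = NA,
    independent.var.value = NA,
    # Share of the total weight held by each category.
    numbers               = as.numeric(totals) / sum(totals),
    stringsAsFactors      = FALSE
  )
}

# Made-up records and weights.
toy <- data.frame(population_group = c("idp", "idp", "host", "returnee"))
toy_weights <- c(1, 1, 2, 4)

percent_table <- sketch_weighted_percent(toy, "population_group", toy_weights)
percent_table
```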
Weighted percentages with confidence intervals for groups
percent_with_confints_select_one_groups(dependent.var, independent.var, design, na.rm = TRUE, confidence_level = 0.95)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be a 'select one' |
independent.var |
string with the column name in 'data' of the independent (group) variable. Should be a 'select one' |
design |
the svy design object created using map_to_design or directly with svydesign |
confidence_level |
the confidence level to be used for confidence intervals (default: 0.95) |
this function takes the design object and the name of your dependent variable when this one is a select one. It calculates the weighted percentage for each category in each group of the independent variable.
A table in long format of the results, with the column names dependent.var, dependent.var.value, independent.var, independent.var.value, numbers, se, min and max.
## Not run: percent_with_confints_select_one_groups("population_group", "resp_gender", design)
not used
reach_style_barchart(group, percent, error_min = NULL, error_max = NULL, horizontal = T)
reach brand beiges
reach_style_color_beige(lightness = 1)
Reach brand beige triples
reach_style_color_beiges()
Reach brand dark greys
reach_style_color_darkgrey(lightness = 1)
Reach brand dark grey triples
reach_style_color_darkgreys()
reach brand light greys
reach_style_color_lightgrey(lightness = 1)
Reach brand light greys triples
reach_style_color_lightgreys()
Reach brand reds
reach_style_color_red(lightness = 1)
Reach brand red triples
reach_style_color_reds()
loading function with automatic default
read.csv.auto.sep(file, stringsAsFactors = F, ...)
file |
path to a csv file with the assessment data |
The file is loaded with stringsAsFactors = F and with column names in lowercase alphanumeric form.
The data from the csv file as a data frame. Column header characters are converted to lowercase; anything that is not alphanumeric or an underscore is replaced with a ".".
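The behaviour described above can be approximated in base R. The sketch below is a hypothetical helper (`read_csv_guess_sep` and its `candidates` argument are invented here), not the package's implementation. It assumes the separator can be guessed by counting candidate characters in the header line, then applies the column name cleaning described in the return value.

```r
# Hypothetical helper, not the package's implementation: pick the candidate
# separator that appears most often in the header line, then clean the names.
read_csv_guess_sep <- function(file, candidates = c(",", ";", "\t")) {
  header <- readLines(file, n = 1, warn = FALSE)
  header_chars <- strsplit(header, "", fixed = TRUE)[[1]]
  # Count occurrences of each candidate separator in the header line.
  counts <- vapply(
    candidates,
    function(sep) sum(header_chars == sep),
    numeric(1)
  )
  sep <- candidates[which.max(counts)]
  data <- read.csv(file, sep = sep, stringsAsFactors = FALSE, check.names = FALSE)
  # Lowercase the names and replace anything that is not a lowercase
  # letter, digit or underscore with ".".
  names(data) <- gsub("[^a-z0-9_]", ".", tolower(names(data)))
  data
}

# Usage on a temporary semicolon-separated file.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Resp Gender;Household-Size", "female;4", "male;6"), tmp)
assessment <- read_csv_guess_sep(tmp)
names(assessment)
```

Guessing from the header line alone is a simplification; it fails when a header contains the separator character inside quoted text.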
Rmarkdown from resultlist in specified hierarchical order
resultlist_recursive_markdown(resultlist, by_analysisplan_columns = c("dependent.var"), by_prefix = c("", "subset:", "variable:"), level = 2, render_result_with, questionnaire = NULL, label_varnames = TRUE)
resultlist |
structure like the output from from_analysisplan_map_to_output: A list with two items "analysisplan" and "results": The "analysisplan" as a data frame, where each row must match a result in a list of "results" |
by_analysisplan_columns |
vector of strings matching column names of the analysisplan. The first element becomes the main heading, the second element the sub-heading etc. |
by_prefix |
a prefix added at the beginning of the headline; same length as 'by_analysisplan_columns' |
level |
the markdown header level to start with; defaults to 2 which leads to "## heading", i.e. the second header level. |
render_result_with |
a function that takes a single result as input and returns an rmarkdown formatted string |
questionnaire |
optional; the questionnaire (koboquest::load_questionnaire()) |
label_varnames |
whether variable names should be labeled in headings |
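The way 'by_analysisplan_columns', 'by_prefix' and 'level' interact is easier to see in a small example. The sketch below is a simplified, hypothetical re-creation of the nesting idea in base R (`nested_markdown` and the toy resultlist are invented here). It assumes one heading level per grouping column and omits questionnaire labelling; the real function may differ in its details.

```r
# Hypothetical, simplified re-creation of the nesting logic.
nested_markdown <- function(analysisplan, results, by, prefix, level, render) {
  if (length(by) == 0) {
    # No grouping columns left: render every remaining result.
    return(paste(vapply(results, render, character(1)), collapse = "\n\n"))
  }
  values <- unique(analysisplan[[by[1]]])
  sections <- vapply(values, function(value) {
    keep <- analysisplan[[by[1]]] == value
    # One "#" per header level, followed by the prefix and the value.
    heading <- paste(strrep("#", level), trimws(paste(prefix[1], value)))
    body <- nested_markdown(
      analysisplan[keep, , drop = FALSE], results[keep],
      by[-1], prefix[-1], level + 1, render
    )
    paste(heading, body, sep = "\n\n")
  }, character(1))
  paste(sections, collapse = "\n\n")
}

# Toy resultlist: one analysisplan row per result, as described above.
toy_resultlist <- list(
  analysisplan = data.frame(
    repeat.var.value = c("north", "north", "south"),
    dependent.var    = c("water_source", "shelter_type", "water_source"),
    stringsAsFactors = FALSE
  ),
  results = list("result A", "result B", "result C")
)

md <- nested_markdown(
  toy_resultlist$analysisplan, toy_resultlist$results,
  by = c("repeat.var.value", "dependent.var"),
  prefix = c("subset:", "variable:"),
  level = 2,
  render = function(result) paste("Rendered:", result)
)
cat(md)
```

With `level = 2`, the first grouping column produces "##" headings and the second produces "###" headings, with each rendered result placed under its innermost heading.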
subset a list of results based on analysis parameters
results_subset(results, repeat.vars = NULL, repeat.var.values = NULL, dependent.vars = NULL, logical = NULL)
results |
list of results (output from 'from_analysisplan_map_to_output()') |
repeat.vars |
optional: vector of character strings: keeps only results where repeat.var in this list |
repeat.var.values |
optional: vector of character strings: keeps only results where repeat.var.value in this list |
dependent.vars |
optional: vector of character strings: keeps only results where dependent.var in this list |
logical |
optional: subset by a logical vector (same length as list of results) |
If multiple subsetting parameters are given, only those results are kept for which all conditions apply.
a resultlist in same format as from_analysisplan_map_to_output() only including those results with matching analysis parameters
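The "all conditions apply" rule amounts to combining the individual filters with a logical AND. The base R sketch below illustrates that rule on an invented resultlist. `subset_resultlist` is a hypothetical stand-in, not the package's code, and it assumes the analysisplan has 'repeat.var.value' and 'dependent.var' columns with one row per result.

```r
# Hypothetical sketch of AND-combined filtering; not the package's own code.
subset_resultlist <- function(resultlist, repeat.var.values = NULL,
                              dependent.vars = NULL, logical = NULL) {
  plan <- resultlist$analysisplan
  keep <- rep(TRUE, nrow(plan))
  # Each supplied condition narrows the selection further.
  if (!is.null(repeat.var.values)) {
    keep <- keep & plan$repeat.var.value %in% repeat.var.values
  }
  if (!is.null(dependent.vars)) {
    keep <- keep & plan$dependent.var %in% dependent.vars
  }
  if (!is.null(logical)) {
    keep <- keep & logical
  }
  list(
    analysisplan = plan[keep, , drop = FALSE],
    results = resultlist$results[keep]
  )
}

# Toy resultlist with three results.
toy_resultlist <- list(
  analysisplan = data.frame(
    repeat.var.value = c("north", "north", "south"),
    dependent.var    = c("water_source", "shelter_type", "water_source"),
    stringsAsFactors = FALSE
  ),
  results = list("result A", "result B", "result C")
)

# Only the result that is both in "north" and about "water_source" remains.
north_water <- subset_resultlist(
  toy_resultlist,
  repeat.var.values = "north",
  dependent.vars = "water_source"
)
length(north_water$results)
```

Filtering the analysisplan and the results with the same logical vector keeps the two aligned, so the output has the same two-item structure as the input.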
Most frequent answer (mode) of a select one for groups
summary_statistic_mode_select_one(dependent.var, independent.var, design, confidence_level = 0.95)
dependent.var |
string with the column name in 'data' of the dependent variable. Should be a select_one or a select_multiple. |
independent.var |
string with the column name in 'data' of the independent (group) variable. Should be a 'select one' |
design |
the svy design object created using map_to_design or directly with svydesign |
confidence_level |
the confidence level to be used for confidence intervals (default: 0.95) |
This function takes the design object and the name of your dependent variable, and returns the most frequent answer for each category in independent.var.
A table in long format of the results, with the column names dependent.var, dependent.var.value (=NA), independent.var, independent.var.value, numbers (= mean), se, min and max.
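As an illustration of "most frequent answer for each category", the base R sketch below computes a per-group mode on invented data. It assumes that frequency is measured by summed weights (a hypothetical `weight` column) and that ties go to the first category in alphabetical order; the package function works on the survey design object and may resolve both points differently.

```r
# Toy data; column names and weights are invented for illustration only.
toy <- data.frame(
  water_source = c("well", "tap", "tap", "well", "truck", "well"),
  district     = c("north", "north", "north", "south", "south", "south"),
  weight       = c(5, 1, 2, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Sum of weights per answer (rows) and group (columns).
weighted_counts <- xtabs(weight ~ water_source + district, data = toy)

# For each group, pick the answer with the largest summed weight.
weighted_mode <- apply(weighted_counts, 2, function(w) names(w)[which.max(w)])
print(weighted_mode)
```

In the toy data "well" is the weighted mode in the north even though "tap" was answered more often there, which shows why weights matter for this statistic.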