Create a Stacked Bar Chart
This function creates a stacked bar chart for survey data. It handles raw (unaggregated) data by counting the occurrences of categories, supports ordered factors, allows numeric x-axis and stack variables to be binned into custom groups, and lets categorical values be renamed for display. It also handles columns imported from SPSS (.sav) files automatically.
Usage
create_stackedbar(
data,
x_var,
y_var = NULL,
stack_var,
title = NULL,
subtitle = NULL,
x_label = NULL,
y_label = NULL,
stack_label = NULL,
stacked_type = c("counts", "percent"),
tooltip_prefix = "",
tooltip_suffix = "",
x_tooltip_suffix = "",
color_palette = NULL,
stack_order = NULL,
x_order = NULL,
include_na = FALSE,
na_label_x = "(Missing)",
na_label_stack = "(Missing)",
x_breaks = NULL,
x_bin_labels = NULL,
x_map_values = NULL,
stack_breaks = NULL,
stack_bin_labels = NULL,
stack_map_values = NULL,
horizontal = FALSE,
weight_var = NULL
)
Arguments
- data
A data frame containing the raw survey data (one row per respondent).
- x_var
String. Name of the column for the X-axis categories.
- y_var
Optional string. Name of a pre-computed count column. If NULL (default), the function counts occurrences.
- stack_var
String. Name of the column whose values define the stacks.
- title
Optional string. Main chart title.
- subtitle
Optional string. Chart subtitle.
- x_label
Optional string. X-axis label. Defaults to x_var.
- y_label
Optional string. Y-axis label. Defaults to "Number of Respondents" or "Percentage of Respondents".
- stack_label
Optional string. Title for the stack legend. Defaults to stack_var.
- stacked_type
One of "counts" (stacked counts) or "percent" (100% stacked). Default "counts".
- tooltip_prefix
Optional string prepended to tooltip values.
- tooltip_suffix
Optional string appended to tooltip values.
- x_tooltip_suffix
Optional string appended to x-axis values in tooltips.
- color_palette
Optional character vector of colors for the stacks.
- stack_order
Optional character vector specifying order of stack_var levels.
- x_order
Optional character vector specifying order of x_var levels.
- include_na
Logical. If TRUE, NA values in both x_var and stack_var are shown as explicit categories. If FALSE (default), rows with NA in either variable are excluded. Default FALSE.
- na_label_x
String. Label for NA values in x_var when include_na = TRUE. Default "(Missing)".
- na_label_stack
String. Label for NA values in stack_var when include_na = TRUE. Default "(Missing)".
- x_breaks
Optional numeric vector of cut points for binning x_var.
- x_bin_labels
Optional character vector of labels for x_breaks bins.
- x_map_values
Optional named list to remap x_var values for display.
- stack_breaks
Optional numeric vector of cut points for binning stack_var.
- stack_bin_labels
Optional character vector of labels for stack_breaks bins.
- stack_map_values
Optional named list to remap stack_var values for display.
- horizontal
Logical. If TRUE, creates horizontal bars. Default FALSE.
- weight_var
Optional string. Name of a weight variable to use for weighted aggregation. When provided, counts are replaced with weighted sums using this variable.
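As a rough illustration of how these arguments fit together, the sketch below builds a small made-up data frame and an argument list in the shapes described above. The data, the column names (age, gender, wt), and the bin boundaries are all hypothetical. It also assumes that stack_order refers to the labels produced by stack_map_values and that x_breaks are treated as right-closed intervals, as base::cut() does by default. The call itself is guarded so the snippet runs even when the package providing create_stackedbar is not loaded.

```r
# Hypothetical survey data: one row per respondent (all names are illustrative).
survey <- data.frame(
  age    = c(19, 23, 31, 42, 58, 67, NA, 35),
  gender = c(1, 2, 2, 1, 2, 1, 1, NA),
  wt     = c(1.2, 0.8, 1.0, 1.1, 0.9, 1.0, 1.3, 0.7)
)

# Arguments in the shapes documented above.
args <- list(
  data             = survey,
  x_var            = "age",
  stack_var        = "gender",
  stacked_type     = "percent",
  x_breaks         = c(17, 24, 44, 64, Inf),               # 5 cut points -> 4 bins
  x_bin_labels     = c("18-24", "25-44", "45-64", "65+"),  # one label per bin
  stack_map_values = list("1" = "Male", "2" = "Female"),   # code -> display label
  stack_order      = c("Female", "Male"),                  # assumed to use mapped labels
  include_na       = TRUE,
  weight_var       = "wt"
)

# Only call the function if the package that provides it is loaded.
if (exists("create_stackedbar")) {
  chart <- do.call(create_stackedbar, args)
  chart
}
```

Note that x_bin_labels needs exactly one label fewer than the number of cut points in x_breaks, since n cut points delimit n - 1 bins.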
Details
This function performs the following steps:
1. Input Validation: Checks that data is a data frame and that the x_var and stack_var columns exist.
2. Data Copy: Creates a mutable copy of the input data so that transformations do not affect the original.
3. Handle 'haven_labelled' Columns: If the haven package is available, the function detects whether x_var or stack_var are of class haven_labelled (common for data imported from SPSS/Stata/SAS). If so, it converts them to standard R factors, using their underlying numeric values as levels (e.g., a '1' that was labeled "Male" becomes the factor level "1"). This ensures recode can operate correctly.
4. Apply Value Mapping (x_map_values, stack_map_values): If provided, x_map_values and stack_map_values (named lists, e.g., list("1" = "Male")) are used to rename the values in x_var and stack_var respectively. This is useful for converting numeric codes or abbreviations into descriptive labels. If the column is a factor, it is temporarily converted to character to ensure dplyr::recode works reliably on the values.
5. Handle Binning (x_breaks, x_bin_labels, stack_breaks, stack_bin_labels):
   - If x_var (or stack_var) is numeric and the corresponding _breaks are provided, the function uses base::cut() to discretize the numeric variable into bins. _bin_labels can be supplied to give custom names to these bins (e.g., "18-24" instead of "(17,25]"). If not provided, cut() generates default labels.
   - A temporary column (e.g., .x_var_binned) is created to hold the binned values, and this temporary column is then used for plotting.
6. Data Aggregation and Final Factor Handling:
   - The data is transformed using dplyr::mutate to ensure x_var and stack_var (or their binned versions) are treated as factors. If include_na = TRUE, missing values are converted into an explicit factor level labeled with na_label_x or na_label_stack ("(Missing)" by default).
   - If weight_var is provided, weighted sums are calculated for each combination of x_var and stack_var using sum(weight_var, na.rm = TRUE). Otherwise, dplyr::count() is used to count occurrences for each unique combination. This creates the n column required for highcharter.
7. Apply Custom Ordering (x_order, stack_order): If provided, x_order and stack_order are used to set the display order of the factor levels for the X-axis and stack categories, respectively. This is essential for ordinal scales (e.g., Likert scales) or custom desired sorting. Levels not found in the order vector are appended at the end.
8. Highcharter Chart Generation: The aggregated plot_data is passed to highcharter::hchart() to create the base stacked column chart.
9. Chart Customization: Titles, subtitles, axis labels, stacking type (counts vs. percent), data labels, legend titles, tooltips, and custom color palettes are applied based on the function's arguments.
10. Return Value: The function returns a highcharter plot object, which can be printed directly to display the interactive chart.
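The data-preparation steps above (value mapping, binning, NA handling, aggregation, and ordering) can be approximated in base R as follows. This is only a sketch of the described logic, not the function's actual implementation, which according to the steps above relies on dplyr and highcharter. The helper name sketch_aggregate, its single na_label argument, and the example data are hypothetical, and only the stack variable is mapped and ordered to keep the example short.

```r
# Illustrative sketch only: base R stand-ins for the dplyr steps described above.
# sketch_aggregate() and every column name used here are hypothetical.
sketch_aggregate <- function(data, x_var, stack_var,
                             x_breaks = NULL, x_bin_labels = NULL,
                             stack_map_values = NULL, stack_order = NULL,
                             include_na = FALSE, na_label = "(Missing)",
                             weight_var = NULL) {
  # Input validation, then work on a copy
  stopifnot(is.data.frame(data), all(c(x_var, stack_var) %in% names(data)))
  df <- data

  # Value mapping: go through character so numeric codes and factors behave alike
  if (!is.null(stack_map_values)) {
    map <- unlist(stack_map_values)
    old <- as.character(df[[stack_var]])
    hit <- !is.na(old) & old %in% names(map)
    old[hit] <- unname(map[old[hit]])
    df[[stack_var]] <- old
  }

  # Binning: cut() builds default labels when labels = NULL
  if (!is.null(x_breaks) && is.numeric(df[[x_var]])) {
    df$.x_binned <- cut(df[[x_var]], breaks = x_breaks, labels = x_bin_labels)
    x_var <- ".x_binned"
  }

  # Factor conversion; optionally turn NA into an explicit level
  for (v in c(x_var, stack_var)) {
    f <- factor(df[[v]])
    if (include_na && anyNA(f)) {
      f <- addNA(f)
      levels(f)[is.na(levels(f))] <- na_label
    }
    df[[v]] <- f
  }
  if (!include_na) {
    df <- df[!is.na(df[[x_var]]) & !is.na(df[[stack_var]]), , drop = FALSE]
  }

  # Aggregation: plain counts are weighted sums with a weight of 1 per row
  w <- if (is.null(weight_var)) rep(1, nrow(df)) else df[[weight_var]]
  out <- aggregate(w, by = list(x = df[[x_var]], stack = df[[stack_var]]),
                   FUN = sum, na.rm = TRUE)
  names(out)[3] <- "n"

  # Custom ordering: requested levels first, any remaining levels appended
  if (!is.null(stack_order)) {
    lv <- levels(df[[stack_var]])
    out$stack <- factor(as.character(out$stack),
                        levels = c(intersect(stack_order, lv), setdiff(lv, stack_order)))
  }
  out
}

# Made-up example data
survey <- data.frame(
  age    = c(19, 23, 31, 42, 58, 67, NA, 35),
  gender = c(1, 2, 2, 1, 2, 1, 1, NA),
  wt     = c(1.2, 0.8, 1.0, 1.1, 0.9, 1.0, 1.3, 0.7)
)

plot_data <- sketch_aggregate(
  survey, "age", "gender",
  x_breaks = c(17, 24, 44, 64, Inf),
  x_bin_labels = c("18-24", "25-44", "45-64", "65+"),
  stack_map_values = list("1" = "Male", "2" = "Female"),
  stack_order = c("Female", "Male")
)
plot_data
```

Treating unweighted counts as weighted sums with unit weights keeps a single aggregation path. The resulting data frame has one row per observed combination of x and stack category, with the total in the n column.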