
📖 Introduction

The viz_boxplot() function creates interactive box plots (also known as box-and-whisker plots) using highcharter. A box plot displays the five-number summary of a distribution (minimum, first quartile (Q1), median, third quartile (Q3), and maximum) and marks outliers as individual points.

Box plots are particularly useful for:

  • Comparing distributions across groups
  • Identifying outliers
  • Visualizing the spread and skewness of data
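
To make the five-number summary concrete, here is a minimal base R sketch on a small made-up vector (the values are illustrative, not GSS data). It also calls boxplot.stats(), which flags points more than 1.5 times the interquartile range beyond the hinges as outliers. Whether viz_boxplot() uses exactly the same rule is an assumption, not something this article states.

```r
# Illustrative values only (not taken from the GSS)
values <- c(2, 4, 4, 5, 6, 7, 8, 9, 30)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max
five <- fivenum(values)

# boxplot.stats() returns the whisker ends and hinges in $stats and
# the points beyond 1.5 * IQR from the hinges in $out
stats <- boxplot.stats(values)

five         # summary over all values, including the extreme one
stats$stats  # whisker ends exclude points flagged as outliers
stats$out    # the flagged points themselves
```

Note that the upper whisker in stats$stats stops at the largest value that is not flagged, so it can differ from the raw maximum in five.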

library(dashboardr)
library(dplyr)
library(gssr)

# Load GSS data
data(gss_all)
gss <- gss_all %>%
  select(year, age, sex, race, degree) %>%
  filter(year == max(year, na.rm = TRUE), !is.na(age))

📊 Basic Box Plot

Create a simple box plot showing the overall distribution of age:

plot <- viz_boxplot(
  data = gss,
  y_var = "age",
  title = "Age Distribution",
  y_label = "Age (years)"
)

plot

📊 Grouped Box Plots

Compare distributions across categories by adding an x_var:

gss_sex <- gss %>%
  filter(!is.na(sex)) %>%
  mutate(sex = as.character(haven::as_factor(sex)))

plot <- viz_boxplot(
  data = gss_sex,
  y_var = "age",
  x_var = "sex",
  title = "Age Distribution by Sex",
  x_label = "Sex",
  y_label = "Age (years)"
)

plot
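
As a numeric counterpart to a grouped box plot, the sketch below computes a five-number summary per group in base R. The toy data frame and its values are made up to stand in for gss_sex, and the row names are labels chosen for this example.

```r
# Toy data standing in for gss_sex (values are made up)
toy <- data.frame(
  age = c(21, 35, 48, 62, 79, 25, 30, 44, 51, 68),
  sex = rep(c("male", "female"), each = 5)
)

# split() groups the ages by sex; fivenum() summarizes each group.
# The result is a matrix with one column per group.
group_summary <- sapply(split(toy$age, toy$sex), fivenum)
rownames(group_summary) <- c("min", "lower_hinge", "median", "upper_hinge", "max")

group_summary
```

Comparing the columns of group_summary side by side mirrors what the eye does when comparing boxes in the chart.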

🎓 Box Plot by Education Level

Examine how age varies across education levels:

gss_degree <- gss %>%
  filter(!is.na(degree)) %>%
  mutate(degree = as.character(haven::as_factor(degree)))

plot <- viz_boxplot(
  data = gss_degree,
  y_var = "age",
  x_var = "degree",
  title = "Age Distribution by Education",
  x_label = "Highest Degree",
  y_label = "Age (years)"
)

plot

βš™οΈ Controlling Outlier Display

By default, outliers are shown as individual points. Use show_outliers = FALSE to hide them:

plot <- viz_boxplot(
  data = gss_sex,
  y_var = "age",
  x_var = "sex",
  title = "Age by Sex (No Outliers)",
  show_outliers = FALSE
)

plot
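
Before hiding outliers, it can help to know how many points would disappear from the chart. The sketch below counts them per group with base R's boxplot.stats(), using made-up data. It assumes the common 1.5 × IQR rule, and viz_boxplot() may define outliers differently.

```r
# Toy data (made up): one extreme age in the "male" group
toy <- data.frame(
  age = c(30, 32, 34, 36, 38, 95, 40, 42, 44, 46, 48, 50),
  sex = rep(c("male", "female"), each = 6)
)

# Count the points beyond 1.5 * IQR from the hinges in each group
outlier_counts <- sapply(
  split(toy$age, toy$sex),
  function(x) length(boxplot.stats(x)$out)
)

outlier_counts
```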

β†”οΈŽοΈ Horizontal Box Plots

Flip the orientation for better readability with many categories:

plot <- viz_boxplot(
  data = gss_degree,
  y_var = "age",
  x_var = "degree",
  title = "Age by Education (Horizontal)",
  horizontal = TRUE
)

plot

🏷️ Custom Category Labels

Use x_map_values to rename category labels:

gss_sex_raw <- gss %>%
  filter(!is.na(sex))

plot <- viz_boxplot(
  data = gss_sex_raw,
  y_var = "age",
  x_var = "sex",
  title = "Age by Sex",
  x_map_values = list("1" = "Male", "2" = "Female")
)

plot
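
The idea behind a value map can be sketched in base R: each raw code is converted to a string and looked up by name in the list. This illustrates the general technique only. How viz_boxplot() applies x_map_values internally is not shown in this article, and the codes below are made up.

```r
# Made-up raw codes, as they might appear in an unlabelled sex column
codes <- c(1, 2, 2, 1, 2)

# Same shape as the x_map_values argument above: names are codes, values are labels
label_map <- list("1" = "Male", "2" = "Female")

# Look each code up by name; unlist() flattens the result to a character vector
labels <- unlist(label_map[as.character(codes)], use.names = FALSE)

labels
```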

🔒 Custom Category Order

Control the order of categories with x_order:

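# Labels are assumed to match the values of gss_degree$degree exactly;
# check them with unique(gss_degree$degree), since they can differ between gssr versions
education_order <- c(
  "graduate", "bachelor", "junior college", "high school", "lt high school"
)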

plot <- viz_boxplot(
  data = gss_degree,
  y_var = "age",
  x_var = "degree",
  title = "Age by Education (Ordered)",
  x_order = education_order
)

plot
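
A custom order is only useful if its labels match the data. The base R sketch below, on made-up values, checks for mismatches in both directions and shows the resulting sequence with a factor. That x_order requires exact matches is an assumption here.

```r
# Made-up category values standing in for gss_degree$degree
observed <- c("high school", "graduate", "bachelor", "high school", "junior college")

# Desired display order
wanted <- c("graduate", "bachelor", "junior college", "high school", "lt high school")

# Labels requested in the order but absent from the data
not_in_data <- setdiff(wanted, unique(observed))

# Labels present in the data but missing from the order
not_in_order <- setdiff(unique(observed), wanted)

# A factor with explicit levels shows the sequence the categories would follow
ordered_degree <- factor(observed, levels = wanted)

not_in_data
not_in_order
levels(ordered_degree)
```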

🎨 Custom Color Palette

Apply custom colors to the boxes:

plot <- viz_boxplot(
  data = gss_sex,
  y_var = "age",
  x_var = "sex",
  title = "Age by Sex",
  color_palette = c("#3498DB", "#E74C3C")
)

plot

πŸ” Handling Missing Values

Include NA as an explicit category:

# Blank out every 10th value to simulate missing responses
gss_with_na <- gss %>%
  mutate(
    sex_with_na = if_else(
      row_number() %% 10 == 0,
      NA_character_,
      as.character(haven::as_factor(sex))
    )
  )

plot <- viz_boxplot(
  data = gss_with_na,
  y_var = "age",
  x_var = "sex_with_na",
  title = "Age by Sex (Including Missing)",
  include_na = TRUE,
  na_label = "Not Reported"
)

plot
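
Conceptually, treating NA as an explicit category means replacing missing values with a label so they form their own group. The base R sketch below shows that idea on made-up values. It does not describe how viz_boxplot() implements include_na.

```r
# Made-up grouping values with two missing entries
sex <- c("male", NA, "female", "male", NA)

# Replace NA with a visible label so it becomes a category of its own
sex_explicit <- ifelse(is.na(sex), "Not Reported", sex)

# Each label, including "Not Reported", now has its own count
table(sex_explicit)
```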

📊 Comparing Multiple Groups

Box plots excel at comparing distributions across many groups:

gss_race <- gss %>%
  filter(!is.na(race)) %>%
  mutate(race = as.character(haven::as_factor(race)))

plot <- viz_boxplot(
  data = gss_race,
  y_var = "age",
  x_var = "race",
  title = "Age Distribution by Race",
  x_label = "Race",
  y_label = "Age (years)"
)

plot

📚 Summary

The viz_boxplot() function visualizes distributions with these key features:

  • Basic boxplot: Just specify data and y_var
  • Grouped comparison: Add x_var to compare across categories
  • Outliers: Control display with show_outliers
  • Orientation: Use horizontal = TRUE for horizontal boxes
  • Labels and order: Rename categories with x_map_values and reorder them with x_order
  • Missing values: Handle with include_na and na_label
  • Styling: Apply custom colors with color_palette