
📖 Introduction

The viz_histogram() function visualizes the distribution of continuous numeric variables. Unlike bar charts (which count categories), histograms show how values are spread across a range by grouping them into bins.

library(dashboardr)
library(dplyr)
library(gssr)
library(haven)

# Load GSS data
data(gss_all)
gss <- gss_all %>%
  select(year, age, sex, race, degree, happy, polviews) %>%
  filter(year == max(year, na.rm = TRUE), !is.na(age))

📊 Basic Histograms

Create a simple histogram showing the distribution of age:

plot <- viz_histogram(
  data = gss,
  x_var = "age",
  title = "Age Distribution",
  x_label = "Age (years)",
  y_label = "Frequency"
)

plot

βš™οΈ Controlling Bin Size

The bins parameter controls granularity. More bins = more detail but noisier; fewer bins = smoother but less detail.

Few Bins (Smooth)

plot <- viz_histogram(
  data = gss,
  x_var = "age",
  bins = 10,
  title = "Age Distribution (10 bins)"
)

plot

Many Bins (Detailed)

plot <- viz_histogram(
  data = gss,
  x_var = "age",
  bins = 40,
  title = "Age Distribution (40 bins)"
)

plot
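
The trade-off between few and many bins can be sketched in base R, independent of dashboardr. The example below uses simulated ages in place of gss$age, and bin_counts() is a hypothetical helper written for illustration; it assumes equal-width bins spanning the data range, which may differ from how viz_histogram() places its bin edges.

```r
# Simulated ages stand in for gss$age so the example runs without the GSS data.
set.seed(42)
age <- round(runif(500, min = 18, max = 89))

# Hypothetical helper: count observations in `n_bins` equal-width bins
# that span the full range of x.
bin_counts <- function(x, n_bins) {
  breaks <- seq(min(x), max(x), length.out = n_bins + 1)
  h <- hist(x, breaks = breaks, include.lowest = TRUE, plot = FALSE)
  data.frame(
    lower = head(h$breaks, -1),
    upper = tail(h$breaks, -1),
    count = h$counts
  )
}

coarse <- bin_counts(age, 10)
fine   <- bin_counts(age, 40)

# Each bin gets narrower as the number of bins grows,
# so each bar is based on fewer observations.
diff(range(age)) / nrow(coarse)
diff(range(age)) / nrow(fine)
mean(coarse$count)
mean(fine$count)
```

With the same 500 observations, the 40-bin version averages a quarter as many observations per bar as the 10-bin version, which is why its bar heights fluctuate more.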

🔢 Count vs. Percent

By default, histograms show counts. Use histogram_type = "percent" to show percentages:

plot <- viz_histogram(
  data = gss,
  x_var = "age",
  histogram_type = "percent",
  title = "Age Distribution (%)",
  y_label = "Percentage"
)

plot
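
The relationship between the two scales is simple arithmetic, sketched here in base R with simulated data. It assumes each percentage is that bin's share of all non-missing observations; whether viz_histogram() uses exactly this denominator is an assumption, not something confirmed here.

```r
# Simulated ages stand in for gss$age.
set.seed(1)
age <- round(rnorm(300, mean = 48, sd = 17))

# Bin the data without drawing a plot.
h <- hist(age, breaks = 10, plot = FALSE)

counts  <- h$counts
# Convert each bin count to its share of the total, in percent.
percent <- 100 * counts / sum(counts)

data.frame(bin_mid = h$mids, count = counts, percent = round(percent, 1))
```

The shape of the histogram is identical on both scales; only the y-axis changes. Percentages make it easier to compare samples of different sizes.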

🎨 Custom Colors

Set the bar color by passing a hex code to color_palette:

plot <- viz_histogram(
  data = gss,
  x_var = "age",
  bins = 25,
  color_palette = c("#9B59B6"),
  title = "Custom Colored Histogram"
)

plot

πŸ‘οΈ Hiding Data Labels

By default, histograms show count/percentage labels on each bar. Use data_labels_enabled = FALSE to hide them for a cleaner look:

plot <- viz_histogram(
  data = gss,
  x_var = "age",
  bins = 25,
  title = "Histogram Without Data Labels",
  data_labels_enabled = FALSE
)

plot

This is useful when you have many bins or want a simpler visualization.

🏷️ Labels and Tooltips

Customize axis labels and tooltip text for a polished presentation:

plot <- viz_histogram(
  data = gss,
  x_var = "age",
  bins = 25,
  title = "Age Distribution",
  x_label = "Age (years)",
  y_label = "Number of Respondents",
  tooltip_suffix = " people",
  x_tooltip_suffix = " years old"
)

plot

| Parameter | Description | Example |
|---|---|---|
| x_label | Custom x-axis label | "Age (years)" |
| y_label | Custom y-axis label | "Frequency", "Percentage" |
| tooltip_prefix | Text before tooltip value | "Count: " |
| tooltip_suffix | Text after tooltip value | " respondents" |
| x_tooltip_suffix | Text after x value in tooltip | " years" |

πŸ“ Using with create_content()

Integrate histograms into dashboards using type = "histogram":

content <- create_content(data = gss, type = "histogram") %>%
  add_viz(
    x_var = "age",
    bins = 25,
    title = "Age Distribution"
  )

content %>% preview()

Multiple Histograms with Filters

Compare distributions across groups using filters:

content <- create_content(data = gss, type = "histogram", bins = 20) %>%
  add_viz(
    x_var = "age",
    title = "Male",
    filter = ~ sex == "male",
    tabgroup = "By Sex"
  ) %>%
  add_viz(
    x_var = "age",
    title = "Female",
    filter = ~ sex == "female",
    tabgroup = "By Sex"
  )

content %>% preview()
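
When comparing groups, it helps if every group is binned on the same edges. The base R sketch below illustrates the idea with simulated data; the data frame df and the fixed 10-year breaks are illustrative assumptions, not part of dashboardr.

```r
# Simulated respondents stand in for the gss data frame.
set.seed(3)
df <- data.frame(
  age = round(runif(200, min = 18, max = 89)),
  sex = sample(c("male", "female"), 200, replace = TRUE)
)

# Shared bin edges so the bars line up across groups.
breaks <- seq(10, 90, by = 10)

# Split ages by group, then count observations per bin in each group.
by_sex <- lapply(
  split(df$age, df$sex),
  function(a) hist(a, breaks = breaks, plot = FALSE)$counts
)

by_sex
```

Each element of by_sex holds the bin counts for one group, so corresponding positions can be compared directly.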

Multiple Variables

# Prepare data with year as numeric
gss_numeric <- gss %>%
  mutate(year_num = as.numeric(year))

content <- create_content(data = gss_numeric, type = "histogram", bins = 20) %>%
  add_viz(x_var = "age", title = "Age Distribution", tabgroup = "Distributions")

content %>% preview()

πŸ” Interpreting Histograms

Distribution Shapes

| Shape | Meaning | Example |
|---|---|---|
| Normal (bell) | Symmetric around mean | Test scores, heights |
| Right-skewed | Long tail to the right | Income, response times |
| Left-skewed | Long tail to the left | Age at retirement |
| Bimodal | Two peaks | Mixed populations |
| Uniform | Flat, equal frequencies | Random numbers |
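
To get a feel for these shapes, one option is to simulate each of them in base R and compare the mean with the median. The distributions and parameters below are arbitrary choices for illustration, not values taken from the GSS data.

```r
set.seed(123)
n <- 2000

# One simulated sample per shape; all parameters are illustrative.
shapes <- list(
  normal       = rnorm(n, mean = 170, sd = 10),
  right_skewed = rlnorm(n, meanlog = 10, sdlog = 0.8),
  left_skewed  = 70 - rexp(n, rate = 1 / 4),
  bimodal      = c(rnorm(n / 2, mean = 30, sd = 5),
                   rnorm(n / 2, mean = 60, sd = 5)),
  uniform      = runif(n)
)

# A long right tail pulls the mean above the median;
# a long left tail pulls it below.
summary_stats <- sapply(shapes, function(x) c(mean = mean(x), median = median(x)))
summary_stats
```

Any of these vectors can be placed in a data frame column and passed to a histogram function to see the corresponding shape.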

What to Look For

  1. Center - Where is the middle of the distribution?
  2. Spread - How wide is the distribution?
  3. Shape - Is it symmetric, skewed, or multimodal?
  4. Outliers - Are there unusual values far from the center?
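
The four checks above can also be put into numbers alongside the plot. The following is a minimal base R sketch on a simulated right-skewed sample; using the median, the IQR, moment-based skewness, and the 1.5 × IQR rule is one common set of choices, not the only one.

```r
# Simulated right-skewed sample, standing in for a variable such as income.
set.seed(7)
x <- rexp(1000, rate = 1 / 40000)

# 1. Center: the median is robust to long tails.
center <- median(x)

# 2. Spread: interquartile range (width of the middle 50% of values).
spread <- IQR(x)

# 3. Shape: moment-based skewness; > 0 suggests a right tail, < 0 a left tail.
skewness <- mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

# 4. Outliers: values more than 1.5 * IQR beyond the quartiles.
q <- quantile(x, c(0.25, 0.75))
outliers <- x[x < q[1] - 1.5 * spread | x > q[2] + 1.5 * spread]

c(center = center, spread = spread, skewness = skewness,
  n_outliers = length(outliers))
```

For skewed data the 1.5 × IQR rule flags many points in the long tail, so flagged values are candidates for inspection rather than errors.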

💡 When to Use Histograms

Use viz_histogram() when:

- Showing distribution of a continuous variable
- Analyzing spread and shape of data
- Looking for outliers or unusual patterns

Use viz_bar() when:

- Counting categorical values
- Comparing groups side-by-side

📚 See Also