Creating Scatter Plots with viz_scatter()
Introduction
The viz_scatter() function visualizes the relationship
between two numeric variables. Each point represents one observation,
positioned by its x and y values. Scatter plots are useful for exploring
correlations, clusters, and outliers.
library(dashboardr)
library(dplyr)
library(gssr)
library(haven)
# Load GSS data - we need numeric variables for scatter plots
data(gss_all)
gss <- gss_all %>%
  select(year, age, sex, race, degree, happy, polviews, educ, realinc) %>%
  filter(year == 2022, # Use 2022 which has realinc data
         !is.na(age), !is.na(educ), !is.na(realinc),
         realinc > 0, educ > 0) %>%
  mutate(
    sex = droplevels(as_factor(sex)),
    degree = droplevels(as_factor(degree))
  )

Basic Scatter Plots
Create a simple scatter plot showing the relationship between education (years) and income:
plot <- viz_scatter(
  data = gss,
  x_var = "educ",
  y_var = "realinc",
  title = "Education vs Income",
  x_label = "Years of Education",
  y_label = "Real Income ($)"
)
plot

Adding Trend Lines
Use show_trend = TRUE to add a regression line:
plot <- viz_scatter(
  data = gss,
  x_var = "educ",
  y_var = "realinc",
  show_trend = TRUE,
  title = "Education vs Income (with trend)"
)
plot

Coloring by Groups
Use color_var to color points by a categorical
variable:
plot <- viz_scatter(
  data = gss,
  x_var = "educ",
  y_var = "realinc",
  color_var = "sex",
  title = "Education vs Income by Sex",
  color_palette = c("#3498DB", "#E74C3C")
)
plot

Age vs Education
Another relationship to explore - age and years of education:
plot <- viz_scatter(
  data = gss,
  x_var = "age",
  y_var = "educ",
  color_var = "degree",
  title = "Age vs Education by Degree",
  x_label = "Age (years)",
  y_label = "Years of Education",
  alpha = 0.5,
  color_palette = c("#E74C3C", "#F39C12", "#27AE60", "#3498DB", "#9B59B6")
)
plot

Handling Overlap with Transparency
For dense data, lower alpha to make points semi-transparent, so that areas where many observations overlap appear darker:
plot <- viz_scatter(
  data = gss,
  x_var = "age",
  y_var = "realinc",
  alpha = 0.3,
  point_size = 3,
  title = "Age vs Income (with transparency)"
)
plot

Labels and Tooltips
Customize axis labels and tooltip formatting for better readability:
plot <- viz_scatter(
  data = gss,
  x_var = "educ",
  y_var = "realinc",
  title = "Education vs Income",
  x_label = "Years of Education",
  y_label = "Annual Income (USD)",
  tooltip_format = "Education: {x} years, Income: ${y}"
)
plot

| Parameter | Description | Example |
|---|---|---|
| x_label | Custom x-axis label | "Years of Education" |
| y_label | Custom y-axis label | "Income (USD)" |
| tooltip_format | Custom tooltip template | "x: {x}, y: {y}" |
The tooltip_format parameter supports placeholders:
{x} for x-value, {y} for y-value, and
{color} for the color group.
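To make the placeholder idea concrete, here is a minimal sketch of how such a template could be filled in for a single point. The helper fill_tooltip() is hypothetical and is not part of dashboardr; how the package actually performs the substitution (and how it formats numbers) is not described here, so treat this only as an illustration of the concept.

```r
# Hypothetical helper: NOT part of dashboardr. It only illustrates how
# {x}, {y}, and {color} placeholders could be replaced for one point.
fill_tooltip <- function(template, x, y, color = "") {
  values <- c(
    "{x}" = as.character(x),
    "{y}" = as.character(y),
    "{color}" = as.character(color)
  )
  out <- template
  for (placeholder in names(values)) {
    # fixed = TRUE matches the braces literally instead of as a regex
    out <- gsub(placeholder, values[[placeholder]], out, fixed = TRUE)
  }
  out
}

# Fill the template used in the example above for one observation
fill_tooltip("Education: {x} years, Income: ${y}", x = 16, y = 45000)
```

Placeholders that do not appear in the template are simply ignored, which is why {color} can be left out when no color_var is set.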
Using with create_content()
Integrate scatter plots into dashboards:
content <- create_content(data = gss, type = "scatter") %>%
  add_viz(
    x_var = "educ",
    y_var = "realinc",
    show_trend = TRUE,
    title = "Education vs Income"
  )
content %>% preview()

With Filters
Compare relationships across groups:
content <- create_content(data = gss, type = "scatter", alpha = 0.5) %>%
  add_viz(
    x_var = "educ",
    y_var = "realinc",
    title = "Male",
    filter = ~ sex == "male",
    tabgroup = "By Sex"
  ) %>%
  add_viz(
    x_var = "educ",
    y_var = "realinc",
    title = "Female",
    filter = ~ sex == "female",
    tabgroup = "By Sex"
  )
content %>% preview()

Multiple Relationships
content <- create_content(data = gss, type = "scatter", alpha = 0.4, show_trend = TRUE) %>%
  add_viz(
    x_var = "educ",
    y_var = "realinc",
    title = "Education vs Income",
    tabgroup = "Relationships"
  ) %>%
  add_viz(
    x_var = "age",
    y_var = "realinc",
    title = "Age vs Income",
    tabgroup = "Relationships"
  )
content %>% preview()

Interpreting Scatter Plots
When to Use Scatter Plots
Use viz_scatter() when:
- Exploring the relationship between two numeric variables
- Looking for correlations
- Identifying outliers
- Showing individual-level data

Use viz_histogram() when:
- Showing the distribution of a single variable

Use viz_heatmap() when:
- Data is aggregated (means, counts)
- There are many overlapping points
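The guidance above can be backed by a few quick numeric checks before choosing a chart. The sketch below uses base R on synthetic data (not the GSS sample) to compute a correlation, a straight-line fit of the kind a regression trend line represents, the share of exactly overlapping points, and an aggregated summary. The variable names and the simulated relationship are assumptions made for illustration, and no particular threshold for "too much overlap" is implied.

```r
# Synthetic data for illustration only (not the GSS sample used above)
set.seed(42)
n <- 500
educ <- sample(8:20, n, replace = TRUE)            # whole years, so many ties
income <- 5000 + 2500 * educ + rnorm(n, sd = 15000)
income <- round(income, -3)                         # rounded to the nearest 1000
d <- data.frame(educ = educ, income = income)

# Looking for correlations: strength of the linear relationship (Pearson)
r <- cor(d$educ, d$income)

# A straight-line fit; the slope is the change in income per extra year
trend <- coef(lm(income ~ educ, data = d))

# Many overlapping points: share of rows that duplicate an earlier (x, y) pair
overlap_share <- mean(duplicated(d))

# Aggregated view: mean income per year of education
mean_by_educ <- aggregate(income ~ educ, data = d, FUN = mean)

r
trend
overlap_share
head(mean_by_educ)
```

A high overlap share suggests that individual points hide one another, in which case transparency or an aggregated view of something like mean_by_educ may communicate the pattern more clearly.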
See Also
- ?viz_scatter - Full function documentation
- vignette("histogram_vignette") - For single-variable distributions
- vignette("heatmap_vignette") - For aggregated relationships
- vignette("content-collections") - For dashboard integration