Retrieves snapshot data (images, videos, cards, body text, etc.) for a Facebook ad from the Ad Library. Uses headless Chrome via chromote to bypass Facebook's JavaScript-based bot detection.

Usage

get_ad_snapshots(
  ad_id,
  download = FALSE,
  mediadir = "data/media",
  hashing = FALSE,
  wait_sec = 6,
  max_retries = 1
)

Arguments

ad_id

Character string. The Facebook ad ID.

download

Logical. If TRUE, download media files to mediadir.

mediadir

Character. Directory to save downloaded media files.

hashing

Logical. If TRUE, hash downloaded files for deduplication. Recommended for large-scale data collection.

wait_sec

Numeric. Seconds to wait for the page to load (default 6). Increase if you are getting empty results.

max_retries

Integer. Number of retry attempts if data is not found on the first try (default 1). Each retry waits progressively longer.

Value

A tibble with one row containing ad snapshot fields (images, videos, body, title, display_format, page_name, etc.) and the ad ID. If retrieval fails, returns a single-column tibble with just the id.
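Because a failed retrieval still returns a tibble rather than raising an error, callers may want to check for the fallback shape explicitly. The following is a minimal sketch, assuming the fallback's only column is named id; the helper snapshot_failed() is hypothetical and not part of the package.

```r
# Hypothetical helper: TRUE when the result looks like the failure fallback,
# i.e. a table whose only column is the ad id.
# Assumption: the fallback column is named "id".
snapshot_failed <- function(snap) {
  is.data.frame(snap) && ncol(snap) == 1 && identical(names(snap), "id")
}

# Plain data frames stand in for the returned tibbles here
# (a tibble is also a data frame, so the same check applies).
ok   <- data.frame(id = "123", page_name = "Example Page")
fail <- data.frame(id = "123")

snapshot_failed(ok)
snapshot_failed(fail)
```

In a batch run, a check like this could be used to collect the IDs that need a second pass with a larger wait_sec.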

Details

For best results when processing multiple ads, call browser_session_start() before and browser_session_close() after. If no persistent session exists, a temporary one is created and warmed up automatically (adds ~10 seconds on the first call).
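One way to make sure the session is always closed, even when a call fails partway through a batch, is to wrap the work in a function that registers the cleanup with on.exit(). This is a sketch only: with_browser_session() and its start_session and stop_session parameters are hypothetical names introduced for illustration, and the stand-in functions below replace the real session calls so the pattern can be tried without a browser.

```r
# Hypothetical wrapper: run `work()` inside a browser session and always
# close the session afterwards, even if `work()` throws an error.
# The defaults point at the package's session functions; they are exposed
# as parameters only so the sketch can be exercised with stand-ins.
with_browser_session <- function(work,
                                 start_session = browser_session_start,
                                 stop_session = browser_session_close) {
  start_session()
  on.exit(stop_session(), add = TRUE)  # runs on normal exit and on error
  work()
}

# Stand-ins that record the order of calls instead of driving a browser.
events <- character(0)
res <- with_browser_session(
  work          = function() { events <<- c(events, "work"); 42 },
  start_session = function() events <<- c(events, "start"),
  stop_session  = function() events <<- c(events, "close")
)
```

With the package loaded, the batch call from the Examples section could be passed as work, leaving the defaults for the other two arguments.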

Includes built-in retry logic: if the page loads but snapshot data is not yet available (e.g., JS challenge still completing), the function retries with a longer wait before giving up.
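The general pattern of retrying with a progressively longer wait can be sketched as below. This is not the package's internal code: fetch is a hypothetical function that takes a wait time in seconds and returns NULL when no data was found, and the fixed increment step_sec is an assumption, since the documentation does not state how much longer each retry waits.

```r
# Generic retry loop with a progressively longer wait (illustrative only).
# `fetch` is a hypothetical function: wait time in seconds -> data or NULL.
retry_with_longer_wait <- function(fetch, wait_sec = 6, max_retries = 1,
                                   step_sec = 2) {
  # Attempt 0 is the first try; attempts 1..max_retries are the retries.
  for (attempt in 0:max_retries) {
    # Assumption: each retry adds `step_sec` seconds to the wait.
    result <- fetch(wait_sec + attempt * step_sec)
    if (!is.null(result)) return(result)
  }
  NULL  # all attempts exhausted
}

# Stub that only "succeeds" once the wait reaches 10 seconds,
# recording every wait time it was called with.
waits <- numeric(0)
fetch_stub <- function(wait) {
  waits <<- c(waits, wait)
  if (wait >= 10) "snapshot" else NULL
}

out <- retry_with_longer_wait(fetch_stub, wait_sec = 6, max_retries = 2)
```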

Examples

if (FALSE) { # \dontrun{
# Single ad
snap <- get_ad_snapshots("1536277920797773")

# Batch processing (recommended); ad_ids is a character vector of ad IDs
browser_session_start()
results <- map_dfr_progress(ad_ids, ~get_ad_snapshots(.x))
browser_session_close()
} # }