Google Ads Transparency Scraper: pull any competitor's ads for $1.20/1K

Quick answer: The Google Ads Transparency Center is a public registry of every ad Google runs, but it ships no API and no bulk export. To get the data programmatically you scrape it. A Google Ads Transparency scraper sends the same RPC call the website uses and returns every ad creative for an advertiser as structured JSON. The Apify Actor below does it for $0.0012 per ad (~$1.20 per 1,000), with the TLS fingerprinting, proxy rotation, and pagination handled for you.

Google's Ads Transparency Center is one of the most underused datasets in marketing. Launched in 2023 under the EU Digital Services Act and parallel US pressure, it indexes every ad campaign currently running on Search, YouTube, Display, Shopping, Maps, and Play, keyed by advertiser. Google's own counter lists 300,000+ active creatives for a brand like Nike. For your nearest competitor, it's usually 50 to 500.

The catch: there's no download button. Just an interactive UI that paginates 40 creatives at a time. If you want this as a CSV (for a competitor sweep, a trademark audit, or a RAG corpus) you have to extract it yourself. Here's what that actually takes, and how I shortened it to one API call.

The Google Ads Transparency Center is a public, Google-operated registry that shows the ad creatives any verified advertiser is running, the date range each ad was shown, and roughly where. Google built it to comply with ad-disclosure regulation, so the data is public by design: you're reading the same registry a regulator would.

What it gives you per advertiser:

- Every ad creative currently or recently live (text, image, video)
- The landing domain each ad clicks through to
- First-shown / last-shown timestamps and a rough impression count
- A deep link to each creative inside the Transparency Center

What it does not give you: a search-by-keyword mode, region-filtered results from the server, or, crucially, an API.

Is there an official API? No. As of 2026 Google publishes no official API or bulk export for the Ads Transparency Center.
The only programmatic surface is the internal SearchService/SearchCreatives RPC that the website itself calls. That endpoint is undocumented, returns a positional protobuf-style array (not labeled JSON), and inspects your TLS fingerprint before it answers. Scraping it reliably is the whole job, which is why a hosted Actor exists instead of a three-line snippet.

Each ad creative comes back as one flat, typed row. Concrete beats abstract, so here's a real one:

    {
      "advertiser_id": "AR18378488041124659201",
      "advertiser_name": "Nike Retail BV",
      "creative_id": "CR15771942603307614209",
      "creative_url": "https://adstransparency.google.com/advertiser/AR18378488041124659201/creative/CR15771942603307614209?region=anywhere",
      "landing_domain": "nike.com",
      "format_type": 1,
      "first_shown_ts": 1761145807,
      "last_shown_ts": 1778871417,
      "impressions": 205,
      "preview_image_url": "https://tpc.googlesyndication.com/archive/simgad/12774179880874022668",
      "preview_content_js_url": null,
      "region": "anywhere",
      "scraped_at": "2026-05-15T19:17:59+00:00"
    }

Thirteen fields, the same shape every time, validated with Pydantic before it's written. It drops straight into Pandas, BigQuery, or a vector store, with no positional-array wrangling on your side.

The first thing every scraper-aware person tries:

- Open Chrome DevTools, find the XHR call to SearchCreatives
- Replay it with requests.post()
- Parse the JSON, paginate, done

It breaks on the first request. Three reasons, and they're the reasons a hosted Actor earns its keep:

1. TLS fingerprinting. Google's endpoint inspects the JA3/JA4 signature of your TLS handshake. Python's stdlib SSL doesn't match any real browser, so the server returns 403 before it even reads your payload. We get around it by impersonating a real Firefox 147 TLS + HTTP/2 fingerprint via curl-cffi, so the handshake looks like a browser, because functionally it is one.

2. Cookie continuity across pagination. The pagination cursor is bound to a session cookie.
Rotate IPs naively between pages and the server invalidates your cursor mid-scrape. We thread Apify residential proxies with sticky sessions so each advertiser's pagination keeps one stable exit IP and cookie jar, and we pace requests at ~1/sec to stay polite.

3. A positional, protobuf-flavored response. The reply isn't keyed JSON: it's nested arrays where meaning depends on position. One Google A/B rotation and a naive parser silently emits garbage. We pin the parser against four captured creative shapes (still image, rich video, minimal, malformed) and run live wire-validation to catch contract drift before it reaches your dataset. On 408/429/5xx we retry with exponential backoff and fail loud on partial success rather than handing you a half-empty file.

None of that is glamorous. All of it is the difference between a script that worked once on your laptop and a feed that survives Google's quarterly cipher rotation.

I packaged the result as an Apify Actor: Google Ads Transparency Scraper. Paste a domain in the Apify Console and click Start, or run it programmatically:

    from apify_client import ApifyClient

    client = ApifyClient("APIFY_TOKEN")  # replace with your Apify API token
    run = client.actor("DevilScrapes/google-ads-transparency").call(
        run_input={
            "searchDomains": ["nike.com", "adidas.com"],
            "maxResults": 5000,
        }
    )

    # Stream every scraped creative from the run's default dataset
    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        print(item)

You search by landing domain (returns every ad pointing at that domain, including ones bought by resellers and affiliates) or by advertiser ID when you already know the exact advertiser. Multiple targets per run, deduplicated automatically.

Four concrete patterns, not generic "competitive intelligence":

- Weekly competitor sweep. Schedule a run on your top 5 competitors, diff this week's creative IDs against last week's, and alert when a new product line launches. Five competitors × ~200 ads each = roughly $1.20/week of data.
- Trademark enforcement.
Sweep your own domain and you'll see ads other people bought against your brand keyword: resellers, affiliates, competitors. Cross-reference advertiser IDs against your trademark portfolio and flag the unlicensed ones.
- Affiliate-fraud detection. Pull every advertiser whose landing_domain doesn't match the advertiser_name. Mismatches are common in crypto, nutra, and supplement verticals. As a rough first pass, compare the domain's first label rather than the full domain, because a value like nike.com never appears verbatim inside a name like "Nike Retail BV": [c for c in creatives if c["landing_domain"].lower().split(".")[0] not in c["advertiser_name"].lower()].
- AI / RAG ingestion. Feed creative metadata plus image URLs into a vector store for image-grounded competitive analysis.

Pay-per-event. You pay for ads you get, nothing for ads you ask for. No data, no charge.

- $0.005 per run (covers warm-up + cookie handshake)
- $0.0012 per ad written to the dataset

Pull                            Cost
100 ads                         $0.13
1,000 ads                       $1.21
10,000 ads                      $12.01
100,000 ads (monthly sweep)     $120.05

Apify's $5 free trial credit covers your first ~4,000 ads with no credit card. For comparison, the nearest SaaS substitutes (Adbeat, SpyFu) start around $249/month for a slice of the same Google data.

- Region filtering doesn't work, and we say so. The region parameter on Google's SearchCreatives RPC is server-ignored. We tested every plausible request-body shape and none of them returned a region-narrowed result set; the browser UI shows a region selector, but the server hands back the same creative set regardless of what you pass. So we expose region only as a metadata tag: useful for labeling exports by intended market when you run parallel campaigns, useless as a filter. No public Actor offers real region-narrowed scraping, because Google's endpoint doesn't support it. We'd rather under-promise than ship a filter that silently does nothing.
- No keyword search. You search by advertiser/domain, not by ad copy. Google's RPC exposes no keyword mode.
- Video creatives return a JS bundle, not an MP4. You get a preview_content_js_url; rendering the actual frame needs a headless browser and is out of scope for v1.
- ~12 months of history. Google purges older creatives, so a wider date range just clips to what they retain.
- Big brands hit a cap. Google stops paginating past ~1,000 ads per query, so full-history pulls on a Nike-sized advertiser need maxPages raised deliberately.

Is scraping the Google Ads Transparency Center legal?

How is this different from the Facebook Ad Library?

Can I export to Google Sheets or a warehouse?
... ACTOR.RUN.SUCCEEDED into Make/Zapier/n8n, or pull it via the Apify API.

Why are some preview_image_url values null?
... content.js URL instead of a static image.

The Actor is on the Apify Store: apify.com/DevilScrapes/google-ads-transparency. Free $5 trial credit, no credit card. Run it on nike.com and you'll have ~1,000 creatives in your dataset in under a minute.

Find a use case I missed, or a field you wish it returned? Drop it in the comments. I ship based on what people actually need.

Built by Devil Scrapes: Apify Actors with attitude. Pay-per-event, transparent pricing, no junk fields.
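
For readers who want to work with the exported rows, here is a minimal post-processing sketch covering three things discussed above: the pay-per-event arithmetic, the weekly creative-ID diff, and the advertiser/domain mismatch check. The function names are hypothetical and not part of the Actor; the only assumptions taken from the article are the two prices and the field names in the sample row. The first-label comparison is a crude heuristic that will produce false positives, not a fraud verdict.

```python
from datetime import datetime, timezone

PER_RUN_USD = 0.005   # per-run fee quoted in the pricing section
PER_AD_USD = 0.0012   # per-ad fee quoted in the pricing section


def estimate_cost(ads, runs=1):
    """Estimated spend in USD for `ads` rows spread over `runs` Actor runs."""
    return round(runs * PER_RUN_USD + ads * PER_AD_USD, 4)


def new_creative_ids(last_week, this_week):
    """Creative IDs present in this week's rows but absent from last week's."""
    seen = {row["creative_id"] for row in last_week}
    return sorted({row["creative_id"] for row in this_week} - seen)


def name_domain_mismatch(row):
    """True when the domain's first label is missing from the advertiser name."""
    label = row["landing_domain"].lower().split(".")[0]
    return label not in row["advertiser_name"].lower()


def shown_range(row):
    """Convert the two Unix timestamps into timezone-aware UTC datetimes."""
    first = datetime.fromtimestamp(row["first_shown_ts"], tz=timezone.utc)
    last = datetime.fromtimestamp(row["last_shown_ts"], tz=timezone.utc)
    return first, last


# Trimmed copy of the sample row shown earlier in the article
sample = {
    "advertiser_name": "Nike Retail BV",
    "creative_id": "CR15771942603307614209",
    "landing_domain": "nike.com",
    "first_shown_ts": 1761145807,
    "last_shown_ts": 1778871417,
}

if __name__ == "__main__":
    print(estimate_cost(1000))           # one run, 1,000 ads
    print(name_domain_mismatch(sample))  # "nike" appears in "nike retail bv"
    print(shown_range(sample))
```

In practice you would load last week's and this week's datasets as lists of dicts (for example via the dataset iteration shown earlier) and pass them to new_creative_ids to drive an alert.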
Originally published on Dev.to.



