To find broken links automatically, run a recurring crawler that checks every internal and external URL on your site against HTTP status codes (404, 410, 500, etc.) and exports the failures to a structured report. The cheapest reliable approach in 2026 is a scheduled cloud actor — under $1 per 1,000 links — instead of a $99/month SEO suite. Below is the exact setup, what status codes to flag, and how to wire it into a weekly cron.
Quick Answer
To find broken links automatically, point a crawler at your sitemap or homepage, configure it to follow links to a depth of 3–5 levels, flag any response in the 4xx or 5xx range, and schedule it to run weekly. Tools like Ahrefs and Semrush work but cost $99–$199/month; a pay-per-crawl actor like Dead-Link Watchdog does the same job for $0.0008 per link checked. A 5,000-link scan costs $4, with results exported as JSON, CSV, or XLSX. The output gives you the broken URL, the source page linking to it, and the HTTP status: everything you need to fix or redirect.
Why broken links matter for SEO and UX
Google's crawler treats 404s on internal links as wasted crawl budget. Every dead outbound link drains link equity and signals an unmaintained site. Real numbers from publisher audits:
- Sites with >2% broken internal links see an average 8–12% drop in organic traffic within 90 days.
- 70% of broken external links are caused by the destination changing URL structure, not deletion.
- E-commerce sites lose 5–9% of conversions on category pages with broken product links.
Manual checking does not scale. A 10,000-URL site has roughly 80,000–120,000 outbound link instances. You need automation.
What does "automatically" mean here?
Three components have to be in place:
- Scheduled execution — runs without you clicking anything (daily, weekly, monthly).
- Configurable scope — limits depth, max requests, or specific sections so you don't burn budget on tag archives.
- Structured output — a CSV/JSON file or webhook you can pipe into Slack, Google Sheets, or a Jira ticket.
If a tool needs a human to press "Start scan," it is not automatic.
How do I check all the links on a website at once?
You have three practical paths in 2026:
1. Desktop crawlers (Screaming Frog, Sitebulb)
Fast and detailed, but tied to your machine. You have to open the app and run it manually. Free tier caps at 500 URLs (Screaming Frog).
2. SaaS audits (Ahrefs, Semrush, SE Ranking, Seobility)
Run on a schedule, but the broken-link feature is bundled with $99–$199/month plans. Overkill if all you want is link health.
3. Pay-per-crawl cloud actors
Run on Apify or similar platforms. You pay per link checked, typically $0.0005–$0.001, with no subscription. Schedule via cron. At $0.0008 per link, four weekly scans stay cheaper than a $199/month plan until each scan exceeds roughly 62,000 links.
Here's how the cost shakes out for a 10,000-link scan run weekly:
| Tool | Monthly Cost |
|---|---|
| Ahrefs Standard | $199 |
| Sitebulb Standard | $35 (plus desktop time) |
| Semrush Pro | $129.95 |
| Dead-Link Watchdog (4 scans × $8) | $32 |
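As a rough check on the table, the point where per-link pricing stops being cheaper than a flat subscription can be computed directly. The sketch below uses only figures quoted in this article ($0.0008 per link, four scans a month, $199 for Ahrefs Standard); the function name is illustrative.

```python
def break_even_links(subscription_price, scans_per_month, price_per_link=0.0008):
    """Largest scan size (in links) at which paying per link still costs
    no more than the flat monthly subscription."""
    return int(subscription_price / (scans_per_month * price_per_link))


# Four weekly scans compared against a $199/month plan.
links = break_even_links(199, scans_per_month=4)
print(f"Pay-per-link stays cheaper up to {links:,} links per scan")
```

With these inputs the break-even lands at 62,187 links per scan, so the 10,000-link example in the table sits well below it.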
How do I find broken links automatically with an actor?
Using Dead-Link Watchdog on Apify, the setup takes about 5 minutes:
Step 1: Input your start URL or sitemap
```json
{
  "startUrls": ["https://yoursite.com/sitemap.xml"],
  "maxRequestsPerCrawl": 10000,
  "maxDepth": 5
}
```
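To see how the two limits in that input interact, here is a minimal sketch of a breadth-first crawl over an in-memory link graph. It is not the actor's code: it only models which pages get parsed for further links, and it assumes maxDepth counts hops from the start URL and maxRequestsPerCrawl caps the total number of pages. The paths in the sample graph are made up.

```python
from collections import deque


def bounded_crawl(link_graph, start_url, max_depth, max_requests):
    """Return the pages a breadth-first crawl would visit, in order.

    link_graph maps each URL to the list of URLs it links to.
    """
    visited = [start_url]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # page is at the depth limit: do not follow its links
        for target in link_graph.get(url, []):
            if target in seen:
                continue
            if len(visited) >= max_requests:
                return visited  # request cap reached: stop the whole crawl
            seen.add(target)
            visited.append(target)
            queue.append((target, depth + 1))
    return visited


site = {
    "/": ["/blog", "/shop"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/blog/post-2"],
    "/shop": ["/"],
}
print(bounded_crawl(site, "/", max_depth=1, max_requests=100))
print(bounded_crawl(site, "/", max_depth=5, max_requests=2))
```

The first call stops at the homepage and the two pages it links to; the second is cut off by the request cap after two pages, which is what keeps a crawl from running away on very large sites.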
Step 2: Pick which status codes to flag
The actor lets you toggle these codes individually:
- 404: Not Found (always flag)
- 410: Gone (always flag)
- 403: Forbidden (often a soft block, optional)
- 500/502/503/504: Server errors (always flag)
- 301/302: Redirects (flag if you want to update internal links to point at final URLs)
- 429: Rate limited (flag if scraping third-party content)
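The same flagging rules can be written as a small predicate, which is handy if you post-process the report yourself. This is a sketch of the rules as listed above, not the actor's implementation; the names ALWAYS_FLAG, OPTIONAL_FLAG, and should_flag are hypothetical.

```python
ALWAYS_FLAG = {404, 410, 500, 502, 503, 504}
OPTIONAL_FLAG = {403, 301, 302, 429}


def should_flag(status_code, enabled_optional=frozenset()):
    """True if a response with this status belongs in the report.

    enabled_optional holds the optional codes the user has switched on.
    """
    if status_code in ALWAYS_FLAG:
        return True
    return status_code in OPTIONAL_FLAG and status_code in enabled_optional


print(should_flag(404))                # always flagged
print(should_flag(301))                # optional and not enabled
print(should_flag(301, {301, 302}))    # optional and enabled
```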
Step 3: Set the schedule
Use Apify Scheduler. A cron like 0 6 * * 1 runs the scan every Monday at 06:00 UTC. Results land in a dataset.
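If cron syntax is unfamiliar, the fields in 0 6 * * 1 read as minute 0, hour 6, any day of the month, any month, weekday 1 (Monday). The sketch below computes the next time that expression would fire using only the standard library; it handles this one expression rather than cron in general, and the function name is illustrative.

```python
from datetime import datetime, timedelta, timezone


def next_monday_0600_utc(now):
    """Next firing time of `0 6 * * 1` strictly after `now`.

    `now` must be a timezone-aware datetime.
    """
    now = now.astimezone(timezone.utc)
    candidate = now.replace(hour=6, minute=0, second=0, microsecond=0)
    days_ahead = (0 - candidate.weekday()) % 7  # Monday is weekday 0
    candidate += timedelta(days=days_ahead)
    if candidate <= now:
        candidate += timedelta(days=7)  # this week's slot has already passed
    return candidate


wednesday_noon = datetime(2026, 1, 7, 12, 0, tzinfo=timezone.utc)
print(next_monday_0600_utc(wednesday_noon).isoformat())
```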
Step 4: Export and route
Download as CSV, XLSX, or JSON. Pipe to Google Sheets via the Apify integration, post failures to a Slack channel via webhook, or open the dataset URL directly.
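For the Slack route, one possible shape is to turn the exported CSV into the JSON body a Slack incoming webhook accepts (an object with a text field). The sketch assumes the export has the sourceUrl, brokenUrl, and statusCode columns described later in this article; the function name and sample rows are made up, and the actual POST to your webhook URL is left out.

```python
import csv
import io


def slack_payload_from_csv(csv_text, max_items=10):
    """Build a Slack incoming-webhook payload from an exported CSV report."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    lines = [
        f"{row['statusCode']} {row['brokenUrl']} (linked from {row['sourceUrl']})"
        for row in rows[:max_items]  # keep the message short
    ]
    header = f"{len(rows)} broken link(s) found"
    return {"text": "\n".join([header, *lines])}


sample_report = (
    "sourceUrl,brokenUrl,statusCode\n"
    "https://yoursite.com/blog/a,https://example.com/old,404\n"
    "https://yoursite.com/blog/b,https://example.org/api,503\n"
)
payload = slack_payload_from_csv(sample_report)
print(payload["text"])
```

Sending it is a single JSON POST of payload to the webhook URL Slack gives you.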
How often should I scan for broken links?
It depends on how fast your content and outbound references change:
- Static brochure site (50–500 pages): monthly is fine.
- Blog or content site (500–5,000 pages): weekly. Posts often link to news articles that 404 within months.
- E-commerce (5,000+ SKUs): daily on category and bestseller pages, weekly full site. Product 404s lose money the moment they appear.
- Affiliate sites: weekly is the minimum. Affiliate URLs and merchant landing pages break constantly.
For a 5,000-link weekly scan: 4 runs × 5,000 links × $0.0008 = $16/month. Compare that to one undetected 404 on a high-traffic page sitting there for 3 weeks.
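The same arithmetic extends to mixed schedules, such as a weekly full scan plus a daily check on key pages. This sketch uses the article's $0.0008 per-link price; the 500-link daily subset and 30 runs per month are hypothetical numbers chosen for illustration.

```python
PRICE_PER_LINK = 0.0008  # per-link price quoted in this article


def monthly_cost(schedules):
    """schedules: iterable of (links_per_run, runs_per_month) pairs."""
    return sum(links * runs * PRICE_PER_LINK for links, runs in schedules)


blog = monthly_cost([(5_000, 4)])              # weekly full scan
shop = monthly_cost([(5_000, 4), (500, 30)])   # weekly full scan + daily subset
print(f"Blog: ${blog:.2f}/month")
print(f"Shop: ${shop:.2f}/month")
```

The blog case reproduces the $16/month figure above; adding a daily 500-link check brings the total to $28/month.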
What's the difference between a broken link and a redirect chain?
A broken link returns 4xx or 5xx — the destination doesn't work. A redirect chain returns 301/302 then eventually 200 — it works, but with extra hops that slow page load and dilute link equity.
Both hurt SEO, but the fixes differ:
- Broken link: remove, replace, or 301-redirect the destination.
- Redirect chain: update the source link to point at the final URL directly.
Dead-Link Watchdog flags both if you enable the 301/302 toggles. The report shows the redirect target so you can search-and-replace across your CMS.
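The distinction is easy to express in code. The sketch below walks a lookup table of pre-recorded responses rather than making live requests, so it shows the logic only; the table format, paths, and function names are assumptions for illustration.

```python
def resolve_chain(responses, url, max_hops=10):
    """Follow redirects through a {url: (status, location)} lookup table.

    Returns (hops, final_status), where hops lists every URL visited.
    """
    hops = [url]
    while True:
        status, location = responses[url]
        if status not in (301, 302) or location is None:
            return hops, status
        if location in hops or len(hops) > max_hops:
            return hops, status  # redirect loop or overly long chain
        url = location
        hops.append(url)


def classify(responses, url):
    hops, status = resolve_chain(responses, url)
    if status >= 400 or status in (301, 302):
        return "broken"    # error, or a redirect that never resolves
    if len(hops) > 1:
        return "redirect"  # works, but the source link should be updated
    return "ok"


responses = {
    "/old": (301, "/older"),
    "/older": (302, "/new"),
    "/new": (200, None),
    "/moved": (301, "/gone"),
    "/gone": (404, None),
}
for path in ("/old", "/new", "/moved"):
    print(path, classify(responses, path), resolve_chain(responses, path)[0])
```

The last element of hops is the final URL, which is the value to use when rewriting the source link.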
Can I find broken links without code?
Yes. The actor runs entirely from the Apify UI — paste your URL, click Start, download the CSV. No JavaScript, no Python, no headless browser config.
If you do want code, the same actor exposes a REST API:
```shell
curl -X POST "https://api.apify.com/v2/acts/dead-link-watchdog/runs?token=YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://yoursite.com"}],"maxRequestsPerCrawl":5000}'
```
Run that from a GitHub Action, Zapier, or Make scenario.
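If you would rather trigger the run from a script, the curl call translates to Python's standard library fairly directly. This sketch mirrors the endpoint, actor name, and body from the curl example as written; it only builds the request, and YOUR_TOKEN is a placeholder.

```python
import json
import urllib.request


def build_run_request(token, start_url, max_requests=5000):
    """Build (but do not send) the POST request from the curl example."""
    endpoint = f"https://api.apify.com/v2/acts/dead-link-watchdog/runs?token={token}"
    body = json.dumps({
        "startUrls": [{"url": start_url}],
        "maxRequestsPerCrawl": max_requests,
    }).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request("YOUR_TOKEN", "https://yoursite.com")
print(request.get_method(), request.full_url)
# To actually start a run: urllib.request.urlopen(request, timeout=30)
```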
How do I fix broken links once I find them?
The export gives you three columns that matter: sourceUrl, brokenUrl, statusCode. Triage in this order:
- Internal 404s: fix first. Either restore the page or 301 the old URL. Use the sourceUrl to find and update the link.
- External 404s on high-traffic pages: replace with archive.org snapshots or equivalent live sources.
- 5xx errors: re-test after 24 hours; transient server issues often resolve themselves.
- 301/302 chains: update source links to point at the final URL. Low priority but easy wins.
- 403s: usually bot-blocking. Verify manually in a browser before "fixing."
Most teams clear a backlog of 200 broken links in 2–3 hours using find-and-replace in their CMS database.
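The triage order can be applied to the export with a short sort. This is a sketch under a few assumptions: rows are dictionaries with the three columns named above, 410 is grouped with 404, and page traffic is ignored because the export does not include it. The function names are illustrative.

```python
from urllib.parse import urlparse


def triage_priority(row, site_host):
    """Lower number = fix sooner, following the triage order above."""
    status = int(row["statusCode"])
    internal = urlparse(row["brokenUrl"]).hostname == site_host
    if status in (404, 410):
        return 1 if internal else 2
    if 500 <= status <= 599:
        return 3
    if status in (301, 302):
        return 4
    if status == 403:
        return 5
    return 6  # anything the list above does not cover


def triage(rows, site_host):
    return sorted(rows, key=lambda row: triage_priority(row, site_host))


rows = [
    {"sourceUrl": "https://yoursite.com/a", "brokenUrl": "https://example.com/x", "statusCode": "403"},
    {"sourceUrl": "https://yoursite.com/b", "brokenUrl": "https://example.com/y", "statusCode": "404"},
    {"sourceUrl": "https://yoursite.com/c", "brokenUrl": "https://yoursite.com/old", "statusCode": "404"},
    {"sourceUrl": "https://yoursite.com/d", "brokenUrl": "https://example.org/z", "statusCode": "502"},
]
for row in triage(rows, "yoursite.com"):
    print(row["statusCode"], row["brokenUrl"])
```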
Common pitfalls when automating link checks
- Crawling JS-rendered pages without rendering: if your links are injected client-side, use an actor with headless browser support, not plain HTTP requests.
- Hitting rate limits on external domains: if a site returns 429, that's not your broken link — that's you being too aggressive. Lower concurrency.
- Ignoring soft 404s: pages that return 200 but show "Not found" content. No automated tool catches these reliably; spot-check manually.
- Not setting maxRequestsPerCrawl: runaway crawls on sites with infinite calendar pagination will cost you. Always cap.
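For the soft-404 pitfall, a crude heuristic can at least shortlist pages for the manual spot-check. The sketch below looks for "not found" wording in the title and first h1 of a page that returned 200. It will produce false positives (a product page with "404" in its name, for example) and false negatives, so treat its output as candidates for review. The pattern and function name are assumptions.

```python
import re

NOT_FOUND_PATTERN = re.compile(
    r"\b(page not found|404|no longer available|does not exist)\b",
    re.IGNORECASE,
)
TITLE_PATTERN = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)
H1_PATTERN = re.compile(r"<h1[^>]*>(.*?)</h1>", re.IGNORECASE | re.DOTALL)


def looks_like_soft_404(status_code, html):
    """True if a 200 response has 'not found' wording in its title or h1."""
    if status_code != 200:
        return False  # real error codes are already caught by the crawler
    matches = (TITLE_PATTERN.search(html), H1_PATTERN.search(html))
    headline = " ".join(match.group(1) for match in matches if match)
    return bool(NOT_FOUND_PATTERN.search(headline))


print(looks_like_soft_404(200, "<title>Page Not Found</title><h1>Oops</h1>"))
print(looks_like_soft_404(200, "<title>Pricing</title><h1>Plans</h1>"))
```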
FAQ
Q: How much does it cost to scan a 10,000-page site for broken links?
At $0.0008 per link checked, the price depends on link count rather than page count. A 10,000-link scan costs $8. If each of the 10,000 pages has 10 outbound links, that's 100,000 link checks for $80 per scan, which is still less than one month of Ahrefs Standard ($199) if you scan once or twice a month.
Q: Will an automated broken-link checker hurt my site's performance?
Not if you throttle concurrency. Dead-Link Watchdog defaults to a respectful crawl rate. Run scans during off-peak hours (e.g., 04:00 local time) to be safe on shared hosting.
Q: Can I check broken links on a single page only?
Yes — set maxDepth: 0 and pass the page URL as the start URL. The actor will check every outbound link on that one page and stop. Cost: typically under $0.10 per page.
Q: What's better for broken links: Ahrefs or a dedicated actor?
Ahrefs is better if you already pay for it and want backlinks, keywords, and audits in one dashboard. A dedicated actor is better if broken links are your only need: it's 80–90% cheaper and runs on your schedule, not theirs.
Q: Does the actor follow nofollow and noindex links?
Yes by default. Broken is broken regardless of follow attributes, and a 404 on a nofollow link still creates a bad user experience. You can configure it to skip nofollow if you only care about SEO-relevant links.
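To make the nofollow option concrete, here is a minimal link extractor built on Python's html.parser that can optionally skip anchors marked rel="nofollow". It illustrates the idea only and is not how the actor is implemented; the class, function, and skip_nofollow parameter are hypothetical names.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, optionally skipping nofollow links."""

    def __init__(self, skip_nofollow=False):
        super().__init__()
        self.skip_nofollow = skip_nofollow
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if self.skip_nofollow and "nofollow" in rel_tokens:
            return
        self.links.append(href)


def extract_links(html, skip_nofollow=False):
    collector = LinkCollector(skip_nofollow)
    collector.feed(html)
    return collector.links


page = (
    '<a href="/pricing">Pricing</a>'
    '<a href="https://example.com/ad" rel="sponsored nofollow">Ad</a>'
)
print(extract_links(page))
print(extract_links(page, skip_nofollow=True))
```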