Filtering

`--match=PATTERN` / `--ignore=PATTERN`

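Both options take a regular expression that is applied to every discovered URL before it is fetched: `--match` keeps only the URLs that match, and `--ignore` drops the ones that do. Each pattern is evaluated with a 1-second timeout to guard against ReDoS.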

```sh
# Only check internal links
deadfinder sitemap https://www.example.com/sitemap.xml \
  --match='^https://(www\.)?example\.com/'

# Skip media files
deadfinder url https://www.example.com \
  --ignore='\.(png|jpg|gif|webp|mp4)$'
```

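When both options are given, `--match` is applied first, then `--ignore`: a URL is fetched only if it matches the first pattern and does not match the second.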
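To preview which URLs a pair of patterns would let through, the match-then-ignore order can be approximated locally with `grep`. This is only a sketch: `preview_filter` and the sample URLs are hypothetical, and `grep -E` uses POSIX extended regex, which may not behave identically to deadfinder's own regex engine for more elaborate patterns.

```sh
# preview_filter MATCH IGNORE
# Reads URLs on stdin and prints the ones that would survive both filters.
# Hypothetical helper for illustration; it is not part of deadfinder.
preview_filter() {
  # Keep URLs matching the first pattern, then drop those matching the second.
  grep -E -- "$1" | grep -Ev -- "$2"
}

# Made-up sample input: one internal page, one internal image, one external asset.
printf '%s\n' \
  'https://www.example.com/about' \
  'https://www.example.com/logo.png' \
  'https://cdn.other.net/app.js' |
  preview_filter '^https://(www\.)?example\.com/' '\.(png|jpg|gif|webp|mp4)$'
# prints: https://www.example.com/about
```

The image is dropped by the ignore pattern even though it passes the match pattern, and the external asset never reaches the ignore step.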

`--include30x`

By default, 3xx redirects are treated as healthy (the destination is what matters). Enable this flag to mark them as dead too:

```sh
deadfinder url https://www.example.com --include30x
```

Use this when your policy is "redirects are technical debt" rather than "follow the redirect chain".
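The difference between the two policies can be written out as a small status classifier. This is a sketch of the rule as described here, not deadfinder's code: `is_dead` is a hypothetical helper, and treating every 4xx/5xx response as dead is an assumption.

```sh
# is_dead STATUS INCLUDE30X
# Exits 0 when the status should be reported as dead, 1 otherwise.
# INCLUDE30X is "yes" or "no". Hypothetical helper for illustration.
is_dead() {
  case "$1" in
    3??) [ "$2" = "yes" ] ;;  # redirects are dead only when the flag is on
    4??|5??) return 0 ;;      # assumption: client and server errors are dead
    *) return 1 ;;            # everything else is treated as healthy
  esac
}

for code in 200 301 404; do
  if is_dead "$code" no;  then default=dead; else default=ok; fi
  if is_dead "$code" yes; then strict=dead;  else strict=ok;  fi
  echo "$code default=$default include30x=$strict"
done
# prints:
# 200 default=ok include30x=ok
# 301 default=ok include30x=dead
# 404 default=dead include30x=dead
```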

`--limit=N`

Cap the number of URLs scanned per invocation (useful for quick smoke tests of a large sitemap):

```sh
deadfinder sitemap https://www.example.com/sitemap.xml --limit=50
```

The limit applies to the input list (file lines, STDIN lines, or sitemap `<loc>` entries), not to the child links discovered on each page.
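For a line-based input, the effect is roughly that of truncating the list before scanning starts. The sketch below assumes the cap keeps the first N entries, which the text above does not state explicitly; `limit_input` and the generated URLs are hypothetical.

```sh
# limit_input N
# Passes through at most the first N lines of stdin.
# Hypothetical stand-in for what --limit does to the input list.
limit_input() {
  awk -v n="$1" 'NR <= n'
}

# 200 made-up URLs go in; only the first 50 come out and are counted.
seq 1 200 | sed 's|^|https://www.example.com/page/|' | limit_input 50 | wc -l
```

Each of those 50 pages may still yield any number of child links, since the cap does not apply to them.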

`--concurrency=N` / `--timeout=N`

Not filters per se, but the other knobs you'll reach for:

- `--concurrency=50` (default): number of parallel workers.
- `--timeout=10` (default, in seconds): per-request connect + read timeout.

Ramp concurrency down on rate-limited targets; up on fast internal scans.
