santifer/career-ops

Fallback URL search for tracked companies on 410/404

Open

#266 opened on Apr 13, 2026

2 comments · 0 reactions · 0 assignees · JavaScript · 44,756 stars · 9,398 forks
batch import · enhancement · good first issue · help wanted

Description

Existing issues

  • I searched existing issues and this hasn't been requested yet

Problem

When Playwright gets a 410/404 on a job URL for a company in tracked_companies, the current behaviour marks the role as skipped_expired immediately. But companies frequently move roles to new URLs (especially on Workday/Greenhouse) without closing them. The result is false negatives: active roles get dropped from the pipeline.

Scoped to tracked_companies only: these are companies we're actively monitoring, so the extra search is justified. For broad discovery results, the cost/signal ratio doesn't warrant it.

Proposed solution

If a URL returns 410/404 and the company is in tracked_companies, run one WebSearch before marking the role expired: "{role title}" "{company}" site:{careers_url_domain}. If the role appears in the results, use the new URL. If it does not, mark the role as skipped_expired. This fallback should NOT apply to roles found via Level 3 broad discovery; for those, accept a 410/404 as expired directly.
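A rough sketch of the proposed flow, in case it helps whoever picks this up. Everything named here is hypothetical and not taken from the career-ops codebase: the `buildFallbackQuery` and `resolveDeadUrl` helpers, the `fromBroadDiscovery` flag, the `careers_url` field on a tracked company, and the injected `webSearch` function (assumed to return an array of `{ url, title }` results). The rule for "the role appears" is also an assumption: a result counts only if it sits on the careers domain and its title contains the role title.

```javascript
// Hypothetical helpers illustrating the proposal; not the project's actual API.

// Return the hostname of a URL, or null if the string is not a valid URL.
function hostOf(url) {
  try {
    return new URL(url).hostname;
  } catch {
    return null;
  }
}

// Build the quoted query from the template in the proposal.
function buildFallbackQuery(roleTitle, company, careersUrl) {
  return `"${roleTitle}" "${company}" site:${hostOf(careersUrl)}`;
}

// Decide what to do with a job whose URL returned `status`.
// `webSearch` is injected so the search backend can be swapped or mocked.
async function resolveDeadUrl(job, status, trackedCompanies, webSearch) {
  if (status !== 404 && status !== 410) {
    return { action: 'keep', url: job.url };
  }

  const tracked = trackedCompanies.find((c) => c.name === job.company);

  // Untracked companies and broad-discovery roles: accept as expired, no search.
  if (!tracked || job.fromBroadDiscovery) {
    return { action: 'skipped_expired' };
  }

  // Exactly one search per dead URL, restricted to the careers domain.
  const domain = hostOf(tracked.careers_url);
  const results = await webSearch(
    buildFallbackQuery(job.title, job.company, tracked.careers_url)
  );

  // Assumed match rule: same host, and the result title contains the role title.
  const wanted = job.title.toLowerCase();
  const hit = results.find(
    (r) => hostOf(r.url) === domain && (r.title || '').toLowerCase().includes(wanted)
  );

  return hit
    ? { action: 'relocated', url: hit.url }
    : { action: 'skipped_expired' };
}
```

Injecting `webSearch` keeps the decision logic testable without network access, and makes the "one search only" cost bound easy to check.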

Area

Portal Scanner
