Thin wrapper around jobspy.scrape_jobs() for generating a daily list of fresh
job-posting leads across configured tech verticals and locations. The posting's
company is the lead — a company actively hiring is a prospect for your
recruitment services.
daily_leads.py— runner scriptleads_config.yaml— edit this to change verticals, locations, sites, etc.
From the repo root:
pip install -e .
pip install pyyaml
python examples/daily_leads.pyOutputs:
leads/leads_YYYY-MM-DD.csv— today's new leadsleads/seen_jobs.csv— running history of every job URL ever emitted; used to dedupe subsequent runs
Run it a second time immediately — the summary should report 0 new leads,
confirming dedupe works.
Linux/macOS cron (06:00 local):
0 6 * * * cd /path/to/JobSpy && /usr/bin/python examples/daily_leads.py >> leads/run.log 2>&1Windows: use Task Scheduler with the same command.
- Verticals / locations: edit
leads_config.yaml. Start small (1-2 verticals x 1-2 locations) to sanity-check, then expand. - results_wanted: per board, per query. 50 is a good starting point.
- hours_old: keep at 24 for a daily schedule. Increase if you run less
often. Note LinkedIn/Indeed restrict combining
hours_oldwith some other filters — the script avoids those combinations by default. - Proxies: if you hit rate limits, add a
proxieskey to the config and wire it through toscrape_jobs(proxies=...)indaily_leads.py.
The CSV leads with recruitment-relevant fields: site, company, title, location, job_url, company_url, date_posted, job_type, is_remote, min_amount, max_amount, currency, interval, description, search_term, search_location.
All other JobSpy fields are retained after these.