Broken link checker that crawls websites and validates links. Find broken links, dead links, and invalid URLs in websites, documentation, and local files. Perfect for SEO audits and CI/CD.
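This first entry describes the core loop of any link checker: crawl pages, then probe each discovered URL. A minimal sketch of the validation half, written in Python rather than the project's TypeScript and using placeholder URLs, might look like this:

```python
# Illustrative sketch only (not the project's code): send a HEAD request for
# each URL and flag anything that errors out or returns a 4xx/5xx status.
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen

def check(urls):
    for url in urls:
        try:
            status = urlopen(Request(url, method="HEAD"), timeout=10).status
            print(f"OK      {status} {url}")
        except HTTPError as err:   # 4xx/5xx responses land here
            print(f"BROKEN  {err.code} {url}")
        except URLError as err:    # DNS failures, refused connections, ...
            print(f"DEAD    {url} ({err.reason})")

if __name__ == "__main__":
    check(["https://example.com", "https://example.com/no-such-page"])
```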
It allows you to download a website from the Internet to a local directory, recursively building all directories and getting HTML, images, and other files from the server onto your computer.
Python-based web crawling script with randomized intervals, user-agent rotation, and proxy server IP rotation to outsmart website bots and prevent blocking.
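The techniques named here (randomized request intervals, User-Agent rotation, proxy rotation) can be sketched in a few lines. The delays, user agents, and proxy addresses below are illustrative placeholders, not taken from the project:

```python
# Hedged sketch of the anti-blocking techniques described above: randomized
# delays, a rotating User-Agent header, and a rotating proxy pool.
import random
import time
import requests

USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]
PROXIES = [  # placeholder proxy servers
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
]

def polite_fetch(url: str) -> str:
    # Wait a random interval so request timing does not look automated.
    time.sleep(random.uniform(2.0, 8.0))
    proxy = random.choice(PROXIES)
    response = requests.get(
        url,
        headers={"User-Agent": random.choice(USER_AGENTS)},
        proxies={"http": proxy, "https": proxy},
        timeout=10,
    )
    response.raise_for_status()
    return response.text

if __name__ == "__main__":
    print(len(polite_fetch("https://example.com")))
```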
Sneakpeek is a framework that helps you quickly and conveniently develop scrapers. It's the best choice for scrapers with specific, complex scraping logic that needs to run on a regular basis.
Scraper for https://marvelsnapzone.com to retrieve metadata of Marvel SNAP cards.
A universal and local phishing toolkit for audit purposes
An almost generic web crawler built using Scrapy and Python 3.7 to recursively crawl entire websites.
WebKnoGraph is an open research project that uses data processing, vector embeddings, and graph algorithms to optimize internal linking at scale. Built for both academic and industry use, it offers the first fully transparent, AI-driven framework for improving SEO and site navigation through reproducible methods.
A powerful Bash script for extracting URLs and API endpoints from HTML, JavaScript, and JSON content of web pages. Designed for security researchers, bug bounty hunters, and developers to streamline endpoint discovery. Simple to use, supports single or multiple URLs, and offers file-saving capabilities.
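The project itself is a Bash script, but the underlying idea (pulling URL- and endpoint-looking strings out of fetched HTML, JavaScript, and JSON with a regular expression) translates directly; here is a rough Python rendering, with a pattern chosen for illustration only:

```python
# Illustrative endpoint extraction: fetch a page and collect URL-like strings
# and API-path-like strings from its body with a regular expression.
import re
import sys
from urllib.request import urlopen

URL_PATTERN = re.compile(r"""https?://[^\s"'<>)]+|/(?:api|v\d+)/[A-Za-z0-9_./-]+""")

def extract_endpoints(page_url: str) -> list[str]:
    body = urlopen(page_url).read().decode("utf-8", "replace")
    return sorted(set(URL_PATTERN.findall(body)))

if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "https://example.com"
    for endpoint in extract_endpoints(target):
        print(endpoint)
```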
A tutorial and code samples of web scraping with PHP
A simple script to scrape DuckDuckGo search results using Python and Selenium WebDriver.
🕷️ | ReconX is a Live-Website Crawler made to gather critical information with an option to take a picture of each site crawled!
💫 Crawl URLs from a webpage and provide a DomCrawler with the Scraper library
This is a project demonstrating the use of standard Python libraries such as os, urllib, and HTMLParser to create a minimalist crawler that walks the pages of a website to gather hyperlinks (URLs).
Web Link Crawler: A Python script to crawl websites and collect links based on a regex pattern. Efficient and customizable.
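The last two entries describe essentially the same recipe: walk a site's pages with the standard library, parse out anchor tags, and keep the links that match a pattern. A compact sketch of that recipe, assuming urllib and html.parser (the start URL and regex are placeholders):

```python
# Breadth-first crawl within one host, collecting every link whose URL
# matches a caller-supplied regex. Illustrative only.
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class AnchorParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

def crawl(start_url: str, pattern: str, max_pages: int = 20) -> set[str]:
    host = urlparse(start_url).netloc
    wanted, seen, queue = set(), {start_url}, deque([start_url])
    while queue and len(seen) <= max_pages:
        page = queue.popleft()
        try:
            html = urlopen(page).read().decode("utf-8", "replace")
        except OSError:
            continue  # unreachable page; skip it
        parser = AnchorParser()
        parser.feed(html)
        for href in parser.hrefs:
            url = urljoin(page, href)
            if re.search(pattern, url):
                wanted.add(url)
            # Only follow links that stay on the starting host.
            if urlparse(url).netloc == host and url not in seen:
                seen.add(url)
                queue.append(url)
    return wanted

if __name__ == "__main__":
    print(crawl("https://example.com", r"\.pdf$"))
```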
Crawls a website to generate insights
Crawls a website and collects SEO-relevant data
sponge is a website crawler and link downloader command-line tool
The most advanced Lightshot (or prnt.sc) scraper ever!
Java website crawler: a library for analyzing and testing websites