pip install scrapfly-sdk
You can also install extra dependencies:
- pip install "scrapfly-sdk[seepdup]" for performance improvement
- pip install "scrapfly-sdk[concurrency]" for concurrency out of the box (asyncio / thread)
- pip install "scrapfly-sdk[scrapy]" for Scrapy integration
- pip install "scrapfly-sdk[all]" for everything
You can create a free account on Scrapfly to get your API Key.
asyncio-pool dependency has been dropped
scrapfly.concurrent_scrape is now an async generator. If the concurrency is None or not defined, the max concurrency allowed by
your current subscription is used.
async for result in scrapfly.concurrent_scrape(concurrency=10, scrape_configs=[ScrapeConfig(...), ...]):
    print(result)
The brotli argument is deprecated and will be removed in the next minor release. In most cases it offers no benefit over gzip in response size, and it uses more CPU.
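For readers new to async generators, the consumption pattern used by concurrent_scrape can be imitated with the standard library alone. The sketch below is not the SDK's implementation: `fake_scrape`, `concurrent_map`, and `collect` are hypothetical names invented for illustration, and the fallback used when `concurrency` is None (run everything at once) is an assumption, whereas the SDK falls back to the limit of your subscription.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Iterable, List, Optional


async def fake_scrape(url: str) -> str:
    """Hypothetical stand-in for a network call; not part of the SDK."""
    await asyncio.sleep(0.01)
    return f"scraped:{url}"


async def concurrent_map(
    worker: Callable[[str], Awaitable[str]],
    items: Iterable[str],
    concurrency: Optional[int] = None,
) -> AsyncIterator[str]:
    """Yield results as they complete, with at most `concurrency` workers running."""
    pending = list(items)
    # Assumption: with no limit given, allow every item to run at once.
    limit = asyncio.Semaphore(concurrency or max(len(pending), 1))

    async def bounded(item: str) -> str:
        # The semaphore caps how many workers are in flight at any moment.
        async with limit:
            return await worker(item)

    tasks = [asyncio.ensure_future(bounded(item)) for item in pending]
    try:
        for finished in asyncio.as_completed(tasks):
            yield await finished
    finally:
        # If the consumer stops early, cancel whatever is still running.
        for task in tasks:
            task.cancel()


async def collect(urls: Iterable[str], concurrency: Optional[int] = None) -> List[str]:
    """Consume the async generator with `async for`, as in the snippet above."""
    results = []
    async for result in concurrent_map(fake_scrape, urls, concurrency=concurrency):
        results.append(result)
    return results


if __name__ == "__main__":
    print(asyncio.run(collect(["https://example.com/a", "https://example.com/b"], concurrency=2)))
```

Results arrive in completion order, not input order, so a caller that needs to match results to requests should carry an identifier in each result.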
- Better error log
- Async improvements for concurrent scraping with asyncio
- Scrapy media pipelines are now supported out of the box