fix: Respect enqueue_strategy after redirects in enqueue_links #1607
vdusek merged 5 commits into apify:master
Hi, I am not sure what is the desired behavior here. You are overriding the strategy that is in the request. Imagine we have (line from the test you added):
What is the desired behavior in such a case? With your change, we will override the explicitly set […]
I am not really sure about the current behavior. I would expect all requests to inherit […]
My point is that in this scenario, there are 3 sources of information for the […]
Excellent point. Thank you.
@janbuchar @vdusek Perhaps you will have some ideas.
vdusek left a comment
Just typos, otherwise LGTM
tests/unit/crawlers/_beautifulsoup/test_beautifulsoup_crawler.py
Co-authored-by: Vlada Dusek <v.dusek96@gmail.com>
Description
Set the enqueue_strategy attribute in Request during enqueue_links processing, so that requests that have completed a redirect are checked correctly.
Issues
enqueue_links does not set the corresponding strategy in the Request.enqueue_strategy attribute #1606
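
For readers following the discussion, here is a minimal, self-contained sketch of why the strategy has to travel with the request. It is not Crawlee's implementation: `url_matches_strategy` is a hypothetical helper, it assumes the redirect target is compared against the originally requested URL, and it leaves out `same-domain` because that check needs a public suffix list.

```python
from urllib.parse import urlsplit


def url_matches_strategy(strategy: str, origin_url: str, target_url: str) -> bool:
    """Hypothetical helper: may target_url be processed, given origin_url and a strategy?"""
    origin = urlsplit(origin_url)
    target = urlsplit(target_url)

    if strategy == 'all':
        return True
    if strategy == 'same-hostname':
        return origin.hostname == target.hostname
    if strategy == 'same-origin':
        # Simplification: an explicit default port (e.g. :443) is treated as
        # different from an omitted one.
        return (origin.scheme, origin.hostname, origin.port) == (
            target.scheme,
            target.hostname,
            target.port,
        )
    raise ValueError(f'Unsupported strategy in this sketch: {strategy}')


# At enqueue time the link looks fine under 'same-hostname'.
page_url = 'https://example.com/'
link_url = 'https://example.com/go'
allowed_at_enqueue = url_matches_strategy('same-hostname', page_url, link_url)

# After the request is fetched, the link turns out to redirect elsewhere.
# The check can only be repeated if the strategy was stored on the request.
loaded_url = 'https://other.test/landing'
allowed_after_redirect = url_matches_strategy('same-hostname', link_url, loaded_url)
```

In this sketch the link passes at enqueue time but fails once the redirect target is known. If the request carried a default strategy instead of the one passed to enqueue_links, the second check would use the wrong rule.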