@bionascu

Updated relevant_pages() in nlp.py to perform an intersection query: it now intersects the hit lists of the individual query terms, as described in the textbook. Previously the function incorrectly returned only pages that contain the entire query as a phrase.
Also updated the corresponding test. This fixes issue #392.
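To illustrate the difference between the two behaviors described above, here is a minimal sketch. The toy `pages` dictionary and the helper names `phrase_match` and `intersection_query` are hypothetical and are not taken from nlp.py; the sketch only mirrors the substring-style matching shown in the diff.

```python
# Hypothetical toy corpus: page address -> page text (not from the repository).
pages = {
    "a": "the quick brown fox",
    "b": "brown bears and a quick river",
    "c": "slow green turtle",
}


def phrase_match(query, pages):
    """Old behavior: keep pages that contain the whole query as one substring."""
    return {addr for addr, text in pages.items() if query.lower() in text.lower()}


def intersection_query(query, pages):
    """New behavior: keep pages that contain every query term, in any order."""
    hit_lists = [
        {addr for addr, text in pages.items() if word in text.lower()}
        for word in query.lower().split()
    ]
    # An empty query has no terms, so return an empty result in that case.
    return set.intersection(*hit_lists) if hit_lists else set()


print(phrase_match("quick brown", pages))        # only page "a"
print(intersection_query("quick brown", pages))  # pages "a" and "b"
```

Page "b" contains both terms but not the exact phrase, so only the intersection query returns it.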

    for addr in pagesIndex:
        if query_word.lower() in pagesContent[addr].lower():
            hit_list.add(addr)
    intersection = hit_list if not intersection else intersection.intersection(hit_list)
Contributor

This causes a problem when the intersection is an empty set: once the set becomes empty, the next iteration replaces it with hit_list instead of leaving it empty. I've fixed this and the merge conflict in #509.
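The problem raised in this review comment can be reproduced with a small sketch. The function names and the sample hit lists below are hypothetical, and the corrected version is only one possible approach; it is not necessarily the fix applied in #509.

```python
def buggy_intersection(hit_lists):
    """Mirrors the reviewed line: an empty set is falsy, so it looks 'not started'."""
    intersection = set()
    for hit_list in hit_lists:
        intersection = hit_list if not intersection else intersection.intersection(hit_list)
    return intersection


def fixed_intersection(hit_lists):
    """Uses None as the 'no term processed yet' marker, so an empty set stays empty."""
    intersection = None
    for hit_list in hit_lists:
        if intersection is None:
            intersection = set(hit_list)
        else:
            intersection &= hit_list
    return intersection if intersection is not None else set()


# Hypothetical hit lists for three query terms; no page contains all three terms.
hit_lists = [{"a", "b"}, {"c"}, {"c"}]

print(buggy_intersection(hit_lists))  # {'c'}: the empty set was replaced by the third hit list
print(fixed_intersection(hit_lists))  # set()
```

The bug only shows up with three or more terms: the intersection must first become empty and then meet another non-empty hit list.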

@norvig
Collaborator

norvig commented May 31, 2017

Because of @Chipe1's comment about #509, I'm going to close this for now ... we could bring back the test later.

@norvig norvig closed this May 31, 2017
3 participants