Steps to replicate the issue (include links if applicable):
- Search for deepcategory:"Audio files of music" -deepcategory:"Audio files of music by genre" (link) in the Wikimedia Commons search
What happens?:
No search results are shown. (No error message is shown either, so the user is left clueless as to why; that is a separate issue, T376439.)
What should have happened instead?:
Deepcategory searches should not fail outright but should show results to the extent possible, and the error message should state which categories have been trimmed off.
- For example, instead of displaying no files, it would display many / probably most files and an info message in the box like "Deep category query returned too many categories so MIDI files of melody settings by Peter Gerloff and Chill-out music from Free Music Archive have been excluded". This is especially useful for category branches that have just one very deeply nested branch that does not get considered in the search results.
- I could add better examples if that helps, but you can probably find very large or deeply nested categories in Videos in English, for example. The user can then use the results knowing that files in the named categories have not been included, and can later run a separate deepcategory search on each trimmed-off category. The full set may have 563 results while only 480 are shown, but that is usually far better than showing no results at all. It often already shows what the user wanted to see, since a deeply nested category is likely too far from the specified category to contain results that are very relevant to it. (Note that one can go through the results and exclude categories using -deepcategory or -incategory, so showing too many results isn't a problem, although showing the source of categorization per file would be very useful.)
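To illustrate the requested behavior, here is a minimal Python sketch. It is not how the search backend actually works; the function names, the `subcats` mapping and the toy category graph are all hypothetical. It expands a category breadth-first, stops adding categories once a limit is reached, and returns both the categories that were included and the ones that were trimmed off, so they can be named in the info message.

```python
from collections import deque


def expand_categories(root, subcats, max_categories):
    """Breadth-first expansion of `root`, capped at `max_categories`.

    `subcats` is a hypothetical mapping: category title -> list of direct
    subcategory titles. Returns (included, trimmed), where `trimmed` holds
    the categories that were discovered but did not fit within the limit.
    """
    included = [root]
    seen = {root}
    trimmed = []
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in subcats.get(current, []):
            if child in seen:
                continue  # already handled via another parent
            seen.add(child)
            if len(included) < max_categories:
                included.append(child)
                queue.append(child)
            else:
                # Limit reached: remember the name, do not expand further.
                trimmed.append(child)
    return included, trimmed


def trimmed_notice(trimmed):
    """Build the info message; an empty string means nothing was trimmed."""
    if not trimmed:
        return ""
    return ("Deep category query returned too many categories so "
            + " and ".join(trimmed) + " have been excluded")


# Toy graph (invented for illustration only).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "D": ["F"]}
included, trimmed = expand_categories("A", graph, max_categories=4)
print(included)
print(trimmed_notice(trimmed))
```

Because the traversal is breadth-first, the categories closest to the searched one are kept and only the deepest branches are dropped. Only the top of each trimmed branch is reported, which is enough for the user to run a separate deepcategory search on it.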
Software version (on Special:Version page; skip for WMF-hosted wikis like Wikipedia):
Other information (browser name/version, screenshots, etc.):
By the way, some info about which categories in the scan rank highest by the number of files directly contained in them would also be very useful. This could serve all sorts of purposes, for example excluding a category that is not of interest but adds many files to the results, e.g. "Videos from studies uploaded with Open Access Media Importer" (about 10 k files) in "Videos of science". It would be great if the current limits could be increased at some point, but far more needed is seeing some results instead of none at all per scan.
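As a rough sketch of what such a summary could look like (Python, with a hypothetical `top_categories` helper; the file counts below are invented for illustration, apart from the roughly 10 k figure mentioned above):

```python
def top_categories(direct_file_counts, scanned, n=3):
    """Return the `n` scanned categories with the most directly contained files.

    `direct_file_counts` is a hypothetical mapping: category title -> number
    of files directly in that category (not counting subcategories).
    Ties are broken alphabetically so the output is stable.
    """
    counts = [(cat, direct_file_counts.get(cat, 0)) for cat in scanned]
    counts.sort(key=lambda pair: (-pair[1], pair[0]))
    return counts[:n]


# Invented numbers, for illustration only.
counts = {
    "Videos of science": 120,
    "Videos from studies uploaded with Open Access Media Importer": 10000,
    "Some other scanned category": 40,
}
for title, files in top_categories(counts, list(counts), n=2):
    print(f"{files:>6}  {title}")
```

A list like this would make it obvious which category to exclude with -deepcategory or -incategory when one branch dominates the results.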
It's unclear whether self-categorizations also cause the deepcategory search to fail. If so, this could be prevented by comparing each category title against the titles in the array of already scanned categories, or by checking for any duplicate items. That may be a separate issue.
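The check suggested above could be sketched as follows in Python. This is only an illustration under the assumption that the scan walks subcategories one by one; `collect_subcategories` and the `subcats` mapping are hypothetical names, not part of the actual search code.

```python
def collect_subcategories(root, subcats):
    """Collect `root` and all its descendants, visiting each category once.

    `subcats` is a hypothetical mapping: category title -> list of direct
    subcategory titles. Titles that were already scanned are skipped, so a
    category that contains itself (directly or through a loop) cannot make
    the scan run forever or be counted twice.
    """
    visited = set()
    order = []
    stack = [root]
    while stack:
        cat = stack.pop()
        if cat in visited:
            continue  # self-categorization or duplicate: already scanned
        visited.add(cat)
        order.append(cat)
        # Reverse so that subcategories are visited in their listed order.
        stack.extend(reversed(subcats.get(cat, [])))
    return order


# "A" contains itself, and "C" loops back to "A" (toy data).
looped = {"A": ["A", "B"], "B": ["C"], "C": ["A"]}
print(collect_subcategories("A", looped))
```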