Instead of deciding whether a test should pass or fail from its file or directory name, all tests could be treated the same way. A test would fail if an expected diagnostic is not reported (false negative) or if a diagnostic is reported without a corresponding comment marking it as expected (false positive).