This repository contains a curated set of logical, mathematical, and reasoning-based questions designed to evaluate the accuracy and reasoning capabilities of large language models (LLMs). The questions span logical puzzles, numerical comparisons, calculations, and general knowledge. The goal is to provide a standardized set of challenges for testing and benchmarking AI models.
The purpose of this repository is to:
- Assess the reasoning accuracy of LLMs.
- Identify strengths and weaknesses in logical and mathematical understanding.
- Share a structured set of test cases with the community for reproducible evaluation.

The repository includes the following questions:
- How many months have 28 days?
- Which is greater: the square root of 16 or the cube root of 27?
- If a man boils 2 eggs in 1 minute, how much time will it take to boil 10 eggs?
- Which number is greater: 9.9 or 9.11?
- Which number is greater: 3.14 or π?
- Name a country whose name ends with 'lia' and tell me its capital city.
- What is the capital of France?
- What is the largest planet in our solar system?
- Solve this question: 8 + (6 × 2) − (3 + 5) ÷ 4
- If a car travels 60 miles per hour, how far will it travel in 2 hours?
- If a clock shows 3:15, what is the angle between the hour hand and minute hand?
- How many r's are in "strawberry"?
- What number rhymes with the word used to describe a tall plant?
- How many letters are there in the word "Mississippi"?
- Which mission was launched to explore the outer planets and study planetary atmospheres, moons, and interstellar space?
- What is the speed of Voyager 1?
- What is the speed of Voyager 2?
- How many galaxies are there in the universe?
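
Several of the questions above have answers that can be computed directly, which makes them convenient for checking a model's output. The following is a minimal sketch of how reference answers for those items might be derived in Python. The `clock_angle` helper and the `reference` dictionary and its key names are illustrative assumptions, not part of this repository.

```python
import math


def clock_angle(hour: int, minute: int) -> float:
    """Smaller angle, in degrees, between the hour and minute hands."""
    # The hour hand moves 30 degrees per hour plus 0.5 degrees per minute.
    hour_deg = (hour % 12) * 30 + minute * 0.5
    # The minute hand moves 6 degrees per minute.
    minute_deg = minute * 6
    diff = abs(hour_deg - minute_deg)
    return min(diff, 360 - diff)


# Hypothetical mapping from a short label to a computed reference answer.
reference = {
    "sqrt(16) > cbrt(27)": math.sqrt(16) > 27 ** (1 / 3),
    "9.9 > 9.11": 9.9 > 9.11,
    "pi > 3.14": math.pi > 3.14,
    "8 + (6 * 2) - (3 + 5) / 4": 8 + (6 * 2) - (3 + 5) / 4,
    "miles at 60 mph for 2 hours": 60 * 2,
    "clock angle at 3:15": clock_angle(3, 15),
    "r's in strawberry": "strawberry".count("r"),
    "letters in Mississippi": len("Mississippi"),
}

for label, answer in reference.items():
    print(f"{label}: {answer}")
```

Questions that depend on wordplay or on facts that change over time (for example, the Voyager speeds) are not covered by this sketch and would need manually curated answers.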

To use the questions:
- Clone the repository: `git clone https://github.com/thehsansaeed/Questions-for-AI-Model-Testing.git`
- Use these questions to prompt AI models and record their responses.
- Analyze the responses to evaluate the accuracy and reasoning capabilities of the models.
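
The prompt, record, and analyze steps above can be sketched as a small Python harness. This is only an illustration under stated assumptions: `run_questions`, `score`, and `fake_model` are hypothetical names, `fake_model` stands in for whatever client you use to call a real model, and the substring check is a deliberately crude way to grade answers.

```python
from typing import Callable


def run_questions(questions: list[str], ask_model: Callable[[str], str]) -> list[dict]:
    """Prompt the model with each question and record its response."""
    return [{"question": q, "response": ask_model(q)} for q in questions]


def score(records: list[dict], expected: dict[str, str]) -> float:
    """Fraction of graded responses that contain the expected answer."""
    graded = [r for r in records if r["question"] in expected]
    if not graded:
        return 0.0
    correct = sum(
        expected[r["question"]].lower() in r["response"].lower() for r in graded
    )
    return correct / len(graded)


# Hypothetical stand-in for a real model call; replace with your own client.
def fake_model(prompt: str) -> str:
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "What is the largest planet in our solar system?": "Jupiter.",
        'How many r\'s are in "strawberry"?': "There are 2 r's.",
    }
    return canned.get(prompt, "I don't know.")


# Expected answers for a small subset of the questions.
expected = {
    "What is the capital of France?": "Paris",
    "What is the largest planet in our solar system?": "Jupiter",
    'How many r\'s are in "strawberry"?': "3",
}

records = run_questions(list(expected), fake_model)
accuracy = score(records, expected)
print(f"accuracy: {accuracy:.2f}")
```

Substring matching can produce false positives (a response that mentions the right answer while concluding something else), so manual review of the recorded responses is still advisable.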
We welcome contributions! If you have additional questions that can challenge AI models or improve this repository, feel free to submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
For any questions or suggestions, please contact Ahsan Saeed on LinkedIn.