| 1 | +> **⚠️ Disclaimer: Open Analysis** |
| 2 | +> |
| 3 | +> This post explores game data using statistical analysis. Please note that while I am an experienced engineer, |
| 4 | +> I am not a specialized Data Scientist. I have made the code and data available [on GitHub](https://github.com/fezcode/Project-Touch-Grass) for transparency.
| 5 | +> If you find errors in the methodology or want to improve the model, I welcome your feedback and pull requests. |
| 6 | + |
| 7 | + |
| 8 | +# A Data Autopsy of the 'Casual' Gamer |
| 9 | + |
| 10 | +**Imagine a product.** |
| 11 | + |
| 12 | +It costs $70 a year. You don't just use it; you obsess over it. You spend nearly 15 hours, two full workdays, grinding away at it. |
| 13 | +It demands more of your time than any other form of entertainment you own. |
| 14 | + |
| 15 | +**And you hate almost every minute of it.** |
| 16 | + |
| 17 | +This isn't a theory. This is what the data says. |
| 18 | + |
| 19 | +We tend to dismiss Shooters and Sports games ("Gun and Ball") as the fast food of the gaming industry: soulless, low-quality |
| 20 | +trash consumed by "casuals" who don't know any better. But we don't trust stereotypes. We trust cold, hard numbers. (And that's a fact.)
| 21 | + |
| 22 | +So, I built a custom data pipeline to ingest over 24,000 video games, filter the noise, and force the industry's biggest genres to face the music. |
| 23 | + |
| 24 | +I expected to find that mainstream games were "dumb fun". Instead, I found a statistical tragedy. |
| 25 | +I found that while Shooter fans are having a blast, Sports fans are stuck in a proven, quantifiable loop of addiction |
| 26 | +and dissatisfaction I call **"The Misery Index"**. |
| 27 | + |
| 28 | +Here is the code, the charts, and the proof. |
| 29 | + |
| 30 | +--- |
| 31 | + |
| 32 | +## 1. The Engineering: Building the Ingestion Engine |
| 33 | + |
| 34 | +I didn't want a static Kaggle CSV snapshot. I wanted a live, reproducible look at the market. |
| 35 | +I built an [ETL](/vocab/etl) (Extract, Transform, Load) pipeline using modern Python tooling to handle the massive amount of data |
| 36 | +required for statistical significance. |
| 37 | + |
| 38 | +### The Stack |
| 39 | + |
| 40 | +* **Environment:** `uv` (for lightning-fast Rust-based dependency management). |
| 41 | +* **Source:** The RAWG.io API (the largest open video game database). |
| 42 | +* **Storage:** Local CSV Data Lake (treating network calls as expensive and disk as cheap). |
| 43 | + |
| 44 | +### The Pipeline Strategy |
| 45 | + |
| 46 | +I wrote an ingestion script designed for robustness and respect for API quotas.
| 47 | + |
| 48 | +1. **Targeting the "Big 8" Archetypes:** Instead of vague genres like "Action," I targeted specific behavioral archetypes: Shooter (Gun), Sports (Ball), RPG (Sword), Strategy (Brain), Indie (Soul), etc. |
| 49 | +2. **Rate-Limited Pagination:** The script iterates through thousands of pages of API results, using sleep timers to handle rate limits gracefully.
| 50 | +3. **Idempotent Storage:** The system checks against existing records by ID to prevent duplicate entries, allowing me to stop and restart ingestion without corrupting the dataset. |
| 51 | + |
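The pagination and idempotent-storage steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: `fetch_page` is a hypothetical callable standing in for the real API request, the `FIELDS` column names are assumptions, and a fixed delay is only one of several ways to respect a rate limit.

```python
import csv
import os
import time
from typing import Callable

# Assumed column names; the real dataset may use different ones.
FIELDS = ["id", "name", "playtime", "metacritic", "rating"]


def load_seen_ids(path: str) -> set[str]:
    """Return the IDs already stored in the CSV (empty set if there is no file yet)."""
    if not os.path.exists(path):
        return set()
    with open(path, newline="", encoding="utf-8") as f:
        return {row["id"] for row in csv.DictReader(f)}


def ingest(fetch_page: Callable[[int], list[dict]], path: str,
           max_pages: int, delay_s: float = 1.0) -> int:
    """Append unseen games to the CSV, pausing between pages. Returns rows added."""
    seen = load_seen_ids(path)
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    added = 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS, extrasaction="ignore")
        if needs_header:
            writer.writeheader()
        for page in range(1, max_pages + 1):
            games = fetch_page(page)
            if not games:  # an empty page means the results ran out
                break
            for game in games:
                game_id = str(game["id"])
                if game_id in seen:  # idempotency: skip records already on disk
                    continue
                writer.writerow({**game, "id": game_id})
                seen.add(game_id)
                added += 1
            time.sleep(delay_s)  # fixed pause between pages to stay under the quota
    return added
```

Because the set of known IDs is rebuilt from disk on every run, calling `ingest` a second time with the same pages adds nothing, which is what makes stopping and restarting safe.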
| 52 | +The result was a 3MB raw dataset containing over **24,000 unique games**, ready for analysis. |
| 53 | + |
| 54 | +--- |
| 55 | + |
| 56 | +## 2. The Data Science: Filtering the Noise |
| 57 | + |
| 58 | +Raw data is never ready for insights. My initial ingestion included thousands of unreleased games, |
| 59 | +prototypes, and "shovelware" with zero playtime. Including these would pollute our analysis of actual gamer behavior. |
| 60 | + |
| 61 | +### The "True Gamer" Filter |
| 62 | + |
| 63 | +I applied a filter to create our analysis cohort. I included only games where:
| 64 | + |
| 65 | +1. `Playtime > 0` (Someone has actually played it). |
| 66 | +2. `Metacritic is not Null` (There is critical consensus on its quality). |
| 67 | + |
| 68 | +This reduced the dataset to the games that actually matter, the ones people are spending their lives on. |
| 69 | + |
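As a rough sketch of the two filter conditions above, in plain Python with no dependencies: the `playtime` and `metacritic` keys are assumed column names, the sample rows are made up, and an empty string is treated as null because that is how a missing value typically reads back from a CSV.

```python
def is_true_gamer_row(game: dict) -> bool:
    """Keep a game only if it has playtime > 0 and a non-null Metacritic score."""
    playtime = float(game.get("playtime") or 0)
    metacritic = game.get("metacritic")
    return playtime > 0 and metacritic not in (None, "")


# Hypothetical rows, not taken from the dataset.
games = [
    {"name": "Played and reviewed", "playtime": 12, "metacritic": 81},
    {"name": "Unreleased prototype", "playtime": 0, "metacritic": None},
    {"name": "Played, never reviewed", "playtime": 3, "metacritic": ""},
]
cohort = [g for g in games if is_true_gamer_row(g)]
print([g["name"] for g in cohort])  # ['Played and reviewed']
```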
| 70 | +### Feature Engineering: The "Misery Index" |
| 71 | + |
| 72 | +Beyond standard metrics like Metascore and Playtime, I needed a way to quantify the relationship between engagement and satisfaction. |
| 73 | + |
| 74 | +So I engineered a new feature: **The Misery Index**.
| 75 | + |
| 76 | +$$ |
| 77 | +Misery\ Index = |
| 78 | +\begin{cases} |
| 79 | +\frac{Average\ Hours\ Played}{User\ Rating} & \text{if } Rating > 0 \\ |
| 80 | +\text{Excluded (No Data)} & \text{if } Rating = 0 |
| 81 | +\end{cases} |
| 82 | +$$ |
| 83 | + |
| 84 | +* **Low Index:** You play a moderate amount and love it (Healthy). |
| 85 | +* **High Index:** You play a massive amount but rate it poorly (Toxic/Addictive). |
| 86 | + |
| 87 | +> **Note:** *Games with a User Rating of 0 (Unrated) were excluded from the Misery Index to prevent division-by-zero errors. |
| 88 | +> I calculate the index only for games that have both active players and active user sentiment.* |
| 89 | + |
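The definition above translates into a few lines of Python. This is only a sketch of the formula as written; the function name is hypothetical. The example calls plug in the genre-level averages from the Phase B table further down, so they show a ratio of averages for intuition, which is not necessarily equal to the average of per-game indices.

```python
from typing import Optional


def misery_index(hours_played: float, user_rating: float) -> Optional[float]:
    """Hours played divided by user rating; None when the game is unrated."""
    if user_rating <= 0:
        return None  # excluded: no user sentiment, and this avoids division by zero
    return hours_played / user_rating


# Genre averages quoted in the post (ratio of averages, for intuition only).
print(round(misery_index(5.50, 3.38), 2))   # 1.63 (Shooters)
print(round(misery_index(14.23, 2.93), 2))  # 4.86 (Sports)
print(misery_index(10.0, 0))                # None (unrated game is excluded)
```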
| 90 | +--- |
| 91 | + |
| 92 | +## 3. Findings, Deductions, and Results |
| 93 | + |
| 94 | +I broke my analysis into two distinct phases: |
| 95 | + |
| 96 | +* **Phase A: The Mainstream Myth** (Comparing Shooters & Sports against the rest of the gaming world). |
| 97 | +* **Phase B: The Civil War** (Comparing Shooters directly against Sports). |
| 98 | + |
| 99 | +### Phase A: The "Jock vs. Nerd" Stereotype Is Dead |
| 100 | + |
| 101 | +The data immediately shattered the two biggest myths about mainstream gaming. |
| 102 | + |
| 103 | +#### Myth 1: Critics hate "dumb" mainstream games. |
| 104 | + |
| 105 | +**False**. The average critical scores differ by less than one point (73.22 vs. 73.89), a negligible gap. Critics judge execution, not genre.
| 106 | + |
| 107 | +| Cohort | Average Metascore | |
| 108 | +|-----------------------------------|-------------------| |
| 109 | +| **Gun & Ball** | 73.22 | |
| 110 | +| **The Rest (RPGs, Indie, etc.)** | 73.89 | |
| 111 | + |
| 112 | +As the boxplot below shows, the distribution of quality is almost identical. The "quality ceiling" for a great shooter is just as high as a great RPG. |
| 113 | + |
| 114 | + |
| 115 | + |
| 116 | +#### Myth 2: Mainstream gamers are "Casuals." |
| 117 | + |
| 118 | +**False**. This was the biggest shock. "Gun & Ball" players are not casuals; they are the most dedicated grinders in the industry. |
| 119 | + |
| 120 | +| Cohort | Average Hours Played | |
| 121 | +|----------------|----------------------| |
| 122 | +| **Gun & Ball** | **7.88 Hours** | |
| 123 | +| **The Rest** | 4.91 Hours | |
| 124 | + |
| 125 | +Mainstream gamers play roughly **60% longer** per game than players in the rest of the dataset. The industry pivot to "Live Service" wasn't an accident; it was a response to this data.
| 126 | + |
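For reference, the playtime gap follows directly from the two averages in the table above:

$$
\frac{7.88 - 4.91}{4.91} = \frac{2.97}{4.91} \approx 0.605
$$

That is an increase of about 60% in average hours played per game.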
| 127 | + |
| 128 | + |
| 129 | +--- |
| 130 | + |
| 131 | +### Phase B: The Civil War (The FIFA Paradox) |
| 132 | + |
| 133 | +When I grouped Shooters and Sports together, they looked healthy. But when I split them apart, a tragic story emerged. They are not the same. |
| 134 | + |
| 135 | +#### The Tale of the Tape |
| 136 | + |
| 137 | +| Metric | Gun (Shooter) | Ball (Sports) | The Winner | |
| 138 | +|------------------------------------|---------------|-----------------|------------------------------| |
| 139 | +| **Popularity** (Avg Ratings Count) | **514** | 117 | **Gun** (Cultural Dominance) | |
| 140 | +| **Quality** (Metascore) | 72.98 | **73.85** | **Ball** (Marginally) | |
| 141 | +| **Addiction** (Avg Playtime) | 5.50 Hours | **14.23 Hours** | **Ball** (Massive Grind) | |
| 142 | +| **Happiness** (User Rating 0-5) | **3.38** | 2.93 | **Gun** (Soul intact) | |
| 143 | + |
| 144 | +#### The Deduction: The Misery Loop |
| 145 | + |
| 146 | +Shooter players act like "Tourists": they come in huge numbers, play a moderate amount (5.5 hours), have fun (3.38 rating), and leave.
| 147 | + |
| 148 | +Sports players act like "Hostages". They have the highest retention in the industry (**14.23 hours** average) but the lowest satisfaction (**2.93 rating**). |
| 149 | +It is the only major genre with an average user rating below 3.0. |
| 150 | + |
| 151 | +This is visualized perfectly by the **Misery Index**: |
| 152 | + |
| 153 | + |
| 154 | + |
| 155 | +### The "Quadrant of Misery" |
| 156 | + |
| 157 | +To fully understand the scale of this anomaly, I plotted every game in the dataset on a 2D plane. |
| 158 | + |
| 159 | +* **X-Axis (Addiction):** Hours Played. Further right means more grinding. |
| 160 | +* **Y-Axis (Quality):** Metacritic Score. Higher means better critical reception. |
| 161 | + |
| 162 | +Most genres cluster in the top-left (High Quality, Moderate Playtime). |
| 163 | +Great RPGs and Strategy games sit high up, respected but finished in a reasonable time. |
| 164 | + |
| 165 | +But look at the **Orange Xs (Sports Games)**. |
| 166 | + |
| 167 | + |
| 168 | + |
| 169 | +They form a distinct tail stretching into the bottom-right corner. This is the **Quadrant of Misery**. |
| 170 | + |
| 171 | +While other genres usually see a correlation between quality and playtime (people play good games longer), Sports games break the correlation. |
| 172 | +You can see titles with mediocre scores (60–70) that command massive playtime (40+ hours). This visualizes the "captive audience" effect: |
| 173 | +players aren't staying because the game is a masterpiece; they are staying because there is no alternative. |
| 174 | + |
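One way to make the quadrant concrete is a simple threshold check. The sketch below borrows its cutoffs (a score of 70 or lower, 40 or more hours) from the figures mentioned in the paragraph above. They are illustrative assumptions, not thresholds used in the original analysis, and the sample points are hypothetical.

```python
def in_quadrant_of_misery(metascore: float, hours_played: float,
                          max_score: float = 70.0, min_hours: float = 40.0) -> bool:
    """True for games with a mediocre score and very high playtime (bottom-right)."""
    return metascore <= max_score and hours_played >= min_hours


# Hypothetical (metascore, hours played) points, not rows from the dataset.
sample = {
    "acclaimed_rpg": (92, 25),
    "annual_sports_title": (66, 55),
    "short_shooter": (74, 6),
}
flagged = [name for name, (score, hours) in sample.items()
           if in_quadrant_of_misery(score, hours)]
print(flagged)  # ['annual_sports_title']
```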
| 175 | +--- |
| 176 | + |
| 177 | +## 4. The Questions We Asked (And The Answers Data Gave) |
| 178 | + |
| 179 | +I started this project with 7 specific questions. Here are the definitive answers based on the data. |
| 180 | + |
| 181 | +**Q1: Do G&B games perform better than other genres (critically)?** |
| 182 | +**No.** As seen in **Figure 1**, they are statistically average. |
| 183 | +They perform roughly the same as RPGs and Strategy games on Metacritic (approx. 73/100). |
| 184 | + |
| 185 | +**Q2: Are G&B players casual players?** |
| 186 | +**Absolutely not**. As shown in **Figure 2**, they play significantly more hours per game than the rest of the gaming population. |
| 187 | +They are "Hardcore Grinders". |
| 188 | + |
| 189 | +**Q3: Are G&B games soulless? (User Rating)** |
| 190 | +**Sports games are; Shooters are not**. |
| 191 | +Shooters maintain a healthy user rating (3.38/5). Sports games have the lowest user satisfaction in the industry (2.93/5). |
| 192 | + |
| 193 | +**Q4: Do G&B games make more money than others?** |
| 194 | +**Yes**, judging by rating counts as a proxy. Shooters average 514 ratings per game and Sports 117, vs ~60 for niche genres, indicating larger install bases and sales volume.
| 195 | +(Let's not forget about microtransactions, yuck.) |
| 196 | + |
| 197 | +**Q5: Are G&B games better than other genres?** |
| 198 | +**No**. They lose on user satisfaction and tie on critical quality. |
| 199 | +They are not "better" by any qualitative metric; they are simply "stickier". |
| 200 | + |
| 201 | +**Q6: Are G&B games played more than other genres?** |
| 202 | +**Yes, by a wide margin**. They are roughly 60% more effective at retaining player attention than the average game, |
| 203 | +as shown by the **Figure 2** playtime gap.
| 204 | + |
| 205 | +**Q7: What are common specs of people playing G&B games?** |
| 206 | +**Figure 4** reveals two distinct profiles: |
| 207 | +* The **Shooter** player is a "Grazer": high volume of games, moderate playtime per game (Blue circles in the top-left).
| 208 | +* The **Sports** player is a "Specialist": low volume of games, massive investment of time into a single annual title they hate (Orange Xs in the bottom-right).
| 209 | + |
| 210 | +---
| 211 | + |
| 212 | +You can check the project here: [Project Touch Grass](https://github.com/fezcode/Project-Touch-Grass) |