HAZOP Assistant

Inspiration

I've worked around process safety engineers, and the one thing that always struck me was how manual HAZOP studies still are. You've got a team of 6–8 senior engineers sitting in a room for 2–4 weeks, staring at P&ID diagrams, going through every single node and asking "what if the flow is too high? what if the pressure drops? what if the temperature reverses?" — systematically, for every piece of equipment.

The methodology itself (IEC 61882) is brilliant — it's saved countless lives since the 1960s. But the execution? It's a $\text{guide words} \times \text{parameters} \times \text{nodes}$ Cartesian product that humans have to evaluate one by one. For a distillation column with 16 nodes, that's potentially:

$$N_{\text{deviations}} = \sum_{i=1}^{n} |G_i| \times |P_i| \approx 16 \times 7 \times 5 = 560 \text{ combinations}$$

Most of those are physically meaningless. "REVERSE Composition" for a pipe? That doesn't even make sense. But someone still has to look at it and cross it out.

I thought — what if Gemini could read the P&ID, and a smart filter could throw out the nonsense before it ever reaches the table?


What It Does

HAZOP Assistant takes a P&ID diagram (the engineering blueprint of a chemical plant), feeds it to Gemini 2.5 Flash, and produces a complete HAZOP study in under 2 minutes. Here's the pipeline:

  1. Upload a P&ID — drag and drop any diagram image (PNG, JPG, PDF)
  2. AI reads the drawing — Gemini Vision identifies every piece of equipment, every instrument tag (TIC-101, PSV-102), every stream, and every control loop
  3. Smart deviation engine — instead of brute-forcing all $|G| \times |P|$ combinations, a deterministic pre-filter generates only the physically credible ones. A pipeline gets {NO Flow, MORE Pressure, LESS Temperature} — not {REVERSE Composition, AS WELL AS Level}
  4. Risk assessment — each deviation gets severity ($S \in [1,5]$) and likelihood ($L \in \{A, B, C, D, E\}$) scores based on equipment type. Risk is calculated as:

$$\text{Risk} = S \times L_{\text{numeric}}, \quad \text{where } L_A = 5,\; L_B = 4,\; \ldots,\; L_E = 1$$

  5. Interactive worksheet — filter by Critical/High/Medium, edit any cell, see risk colors
  6. Professional reports — Excel with color-coded risk cells, PDF with executive summary and risk matrix
  7. AI chat — ask "what are the top critical risks?" and it answers from your actual data, not a textbook
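The scoring rule above fits in a few lines. This is a minimal sketch; the band thresholds below are my own illustration, not the project's exact cutoffs:

```python
# Likelihood letters map to numbers, per the formula: L_A = 5 ... L_E = 1.
LIKELIHOOD_NUMERIC = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}

# (min_score, label) pairs; these thresholds are an assumption for illustration.
RISK_BANDS = [(16, "Critical"), (10, "High"), (5, "Medium"), (0, "Low")]

def risk_score(severity: int, likelihood: str) -> int:
    """Risk = S x L_numeric, with S in [1, 5] and L in {A..E}."""
    return severity * LIKELIHOOD_NUMERIC[likelihood]

def risk_band(score: int) -> str:
    """Bucket a numeric risk score into a worksheet filter category."""
    for threshold, label in RISK_BANDS:
        if score >= threshold:
            return label
    return "Low"
```

So a severity-5 deviation with likelihood B scores 20 and lands in the Critical band, which is what drives the worksheet's color coding.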

Performance benchmarks:

  • For a 3-phase separator: 5 nodes, 51 deviations, full report in ~30 seconds.
  • For a distillation system: 16 nodes, ~140 deviations, full report in ~90 seconds.

How I Built It

Backend: Python 3.13 + FastAPI. The analysis endpoint uses Server-Sent Events (SSE) so the frontend shows progress in real-time — you see nodes appearing as Gemini identifies them.
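The SSE wire format is just text frames, which is what makes the live progress cheap to stream. A minimal sketch of one progress event (the event name and JSON fields are illustrative, not the app's exact schema):

```python
import json

def sse_event(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame, the wire shape sse-starlette emits:
    an 'event:' line, a 'data:' line, and a blank-line terminator."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"

# e.g. one frame pushed to the frontend as each node is identified
frame = sse_event("node", {"node_id": "NODE-001", "name": "Feed Tank"})
```

The frontend subscribes with an EventSource and appends a row per frame, so nodes appear as they are found rather than all at once.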

AI Layer: Gemini 2.5 Flash via the google-genai SDK. Two key tricks:

  • response_mime_type: "application/json" forces structured output — no more parsing markdown-wrapped JSON
  • max_output_tokens: 32768 because complex P&IDs produce large JSON that gets silently truncated at 8192
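Both settings travel in the generation config. A hedged sketch, with the call shape left as a comment since it is illustrative of the google-genai SDK rather than a verbatim excerpt from the project:

```python
# The two settings from the bullets above, as a plain config mapping.
GEN_CONFIG = {
    "response_mime_type": "application/json",  # force raw JSON, no markdown wrapper
    "max_output_tokens": 32768,                # complex P&IDs overflow the 8192 default
}

# Illustrative call shape (names per the google-genai docs; pid_image and
# prompt are hypothetical variables):
# response = client.models.generate_content(
#     model="gemini-2.5-flash", contents=[pid_image, prompt], config=GEN_CONFIG)
```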

Deviation Engine: Pure Python — no AI involved here. I built a VALID_COMBOS dictionary that maps equipment types to their physically credible guide word / parameter pairs:

VALID_COMBOS = {
    "pipeline": {
        "NO": ["Flow"],
        "MORE": ["Flow", "Pressure", "Temperature"],
        ...
    },
    "separator": {
        "NO": ["Flow", "Level", "Pressure"],
        "MORE": ["Flow", "Level", "Pressure"],
        ...
    },
    "reactor": {
        "NO": ["Flow", "Level", "Pressure", "Temperature"],
        ...
    },
}
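Driving generation from that table is a plain dictionary walk. A minimal sketch, using a filled-in pipeline subset that is my assumption rather than the project's full table:

```python
# Illustrative subset; the real table covers more equipment types and guide words.
COMBOS = {
    "pipeline": {
        "NO": ["Flow"],
        "MORE": ["Flow", "Pressure", "Temperature"],
        "LESS": ["Flow", "Pressure", "Temperature"],
        "REVERSE": ["Flow"],
    },
}

def generate_deviations(equipment_type: str) -> list[tuple[str, str]]:
    """Return only the physically credible (guide word, parameter) pairs."""
    combos = COMBOS.get(equipment_type, {})
    return [(gw, param) for gw, params in combos.items() for param in params]

deviations = generate_deviations("pipeline")
```

"REVERSE Composition" can never appear for a pipeline because it was never in the table; nonsense pairs are excluded by construction rather than filtered after the fact.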

This alone cuts the deviation count by 60–70%. Then a blocklist post-filter removes any row where 3+ fields contain generic template phrases like "equipment malfunction affecting X."
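The blocklist pass might look like the following; the phrases and field names are illustrative, not the project's actual list:

```python
# Illustrative template phrases; matched case-insensitively.
GENERIC_PHRASES = [
    "equipment malfunction affecting",
    "operator training and procedures",
    "general process upset",
]

def is_generic(row: dict, threshold: int = 3) -> bool:
    """Drop a worksheet row when 3+ of its text fields are template filler."""
    hits = sum(
        1 for value in row.values()
        if isinstance(value, str)
        and any(phrase in value.lower() for phrase in GENERIC_PHRASES)
    )
    return hits >= threshold
```

A row like {"cause": "Equipment malfunction affecting flow", "consequence": "General process upset", "safeguard": "Operator training and procedures"} trips three phrases and is dropped; a row with one specific cause and real safeguards survives.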

Reports: openpyxl for Excel (multi-sheet workbook with risk-colored cells) and reportlab for PDF (executive summary, risk matrix visualization, action items table).

Frontend: React 18 + Vite + Tailwind CSS. The chat uses react-markdown for rendering Gemini's responses with proper bold, bullets, and formatting.

Data Store: In-memory Python singleton — one FirestoreService class that all routes read/write through. Not production-ready, but eliminates the fragmentation bugs I kept hitting when different parts of the app used different data stores.


Challenges I Ran Into

The "0 nodes" Saga

Gemini worked perfectly when I tested it directly in Python. But through uvicorn? Zero nodes, every time. I spent hours debugging this. Turned out the google-genai SDK's async client has event loop conflicts with uvicorn on Windows. The fix was embarrassingly simple: use the sync client wrapped in asyncio.to_thread().
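A minimal sketch of that fix, with a placeholder function standing in for the blocking SDK call:

```python
import asyncio

def identify_nodes_sync(pid_bytes: bytes) -> list[str]:
    """Placeholder for the blocking google-genai sync call."""
    return ["NODE-001", "NODE-002"]

async def identify_nodes(pid_bytes: bytes) -> list[str]:
    # Run the sync client in a worker thread, off the event loop. This sidesteps
    # the async-client conflict with uvicorn's loop on Windows while keeping
    # the FastAPI endpoint non-blocking.
    return await asyncio.to_thread(identify_nodes_sync, pid_bytes)

nodes = asyncio.run(identify_nodes(b"fake image bytes"))
```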

Truncated JSON

Complex P&IDs with 16+ nodes produce JSON responses that exceed 8192 tokens. Gemini silently truncates the output mid-object — the last node gets cut off like {"node_id": "NODE-016", "equipment — and json.loads() fails. I didn't get any error from the API; it just returned incomplete JSON with a 200 OK. Took me a while to figure out I needed max_output_tokens: 32768.
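A small guard that turns the silent failure into a loud one (the function name and error message are mine):

```python
import json

def parse_or_flag_truncation(raw: str) -> dict:
    """json.loads with an explicit truncation check, since the API can return
    200 OK with a body that was cut off mid-object."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"Gemini output looks truncated at char {exc.pos} of {len(raw)}; "
            "raise max_output_tokens or retry"
        ) from exc

# the failure mode from above: last node cut off mid-object
truncated = '{"node_id": "NODE-016", "equipment'
```

Raising here, instead of letting a generic except swallow the JSONDecodeError, is what surfaces the real cause as "truncated output" rather than "0 nodes".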

The Cartesian Explosion

My first version generated $7 \times 5 = 35$ deviations per node. For 16 nodes that's 560 rows — and 90% of them said "Equipment malfunction affecting flow" with "Operator training and procedures" as the safeguard. An engineer seeing 560 rows of that will close the tab. The pre-filter + blocklist approach was the breakthrough.

Three Data Stores

The original codebase had _studies_db in one file, _nodes_db and _deviations_db in another, and firestore_service._studies/_nodes/_deviations in a third. Analysis wrote to one, the worksheet read from another, and reports queried the empty third one. Everything looked like it worked until you tried to go from analysis → worksheet → report — then nothing connected. Unifying to a single store fixed everything.

.env on Windows

uvicorn --reload spawns a subprocess that changes the working directory. A relative path .env stops loading. My API key silently became empty. Gemini returned auth errors that got caught by a generic except Exception and fell through to "0 nodes." The fix: Path(__file__).parent / ".env" for an absolute path.
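The fix in a few lines; the load_dotenv call is left as a comment since it depends on python-dotenv being installed:

```python
from pathlib import Path

# Anchor .env to this source file rather than the current working directory,
# which uvicorn --reload subprocesses on Windows may have changed.
ENV_PATH = Path(__file__).resolve().parent / ".env"

# load_dotenv(ENV_PATH)  # python-dotenv accepts an explicit path argument
```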


Accomplishments I'm Proud Of

The pre-filter actually works. Going from 189 generic deviations to 51 specific ones — that's not just a number reduction, it's the difference between a report an engineer ignores and one they actually read.

Gemini reads real P&IDs. I fed it a hand-drawn distillation system diagram and it correctly identified T-101 (Feed Tank), C-101 (Distillation Column), V-101 (Reboiler), E-101 (Condenser), V-102 (Reflux Drum), and P-101 A/B (pumps) — with their instrument tags. That's genuinely impressive.

The chat answers from your data. When you ask "what are the top critical risks?", it doesn't give you a Wikipedia article about HAZOP methodology. It says "3-Phase Separator Vessel: NO Pressure (Sev=5, Lik=B)" — because it queries the actual deviations from your study.

End-to-end works. Upload → Analyze → Worksheet → Excel → PDF → Chat. All connected through one data store. No step breaks the next one.


What I Learned

Structured output (response_mime_type) is non-negotiable for production AI apps. Free-text responses wrapped in ```json fences are unreliable — sometimes Gemini adds a preamble, sometimes it doesn't close the fence. Forcing JSON mode eliminates an entire class of parsing bugs.

Deterministic logic > AI for systematic tasks. Asking Gemini to generate all HAZOP deviations produces creative but inconsistent output. A lookup table produces the exact right set every time, in milliseconds. Use AI where it shines (vision, language) and code where code is better (combinatorics, filtering).

Token limits are silent killers. The API doesn't error when output is truncated — it just returns what fits. You need to either set a high limit or build truncation recovery.

One data store. Always one data store. The moment you have two dictionaries storing the same kind of data, something will read from the wrong one. I learned this the hard way about five times.

Test through the actual server, not just direct Python. Half my bugs were invisible in direct tests and only appeared through uvicorn's event loop.


What's Next for HAZOP Assistant

The core analysis engine works. To make it production-ready:

  • Cloud Firestore — replace in-memory store so data persists across restarts
  • Cloud Storage — uploaded P&IDs stored in GCS, not local disk
  • Firebase Auth — user accounts, study ownership, team collaboration
  • Google ADK integration — multi-turn agent with function calling for interactive HAZOP sessions (the agent could call identify_nodes, generate_deviations, assess_risks as tools in a conversation)
  • Gemini Live API — real-time voice interface for HAZOP study meetings: "Analyze the next node" spoken aloud while the team discusses
  • Docker + Cloud Run — containerized deployment with auto-scaling
  • AI-powered deviation enrichment — use Gemini to generate specific causes and safeguards per deviation (currently uses knowledge base), but only for the ~50 that pass the pre-filter (not 560)

The goal: a HAZOP study that used to take 2 weeks now takes 2 hours — with the AI handling the systematic part and the engineers focusing on judgment calls.

Built With

  • Python 3.13 + FastAPI (uvicorn ASGI server)
  • google-genai SDK (Gemini 2.5 Flash vision and structured JSON output)
  • sse-starlette (SSE streaming via StreamingResponse)
  • pydantic + pydantic-settings (data validation and config)
  • python-multipart (file uploads)
  • openpyxl (Excel generation)
  • reportlab (PDF generation)
  • React 18 + Vite + Tailwind CSS (JavaScript/JSX)