A high-velocity, structured inference and evaluation pipeline for generating legally robust patent claims using Large Language Models. Built specifically to demonstrate production-grade AI engineering, strict data typing, and automated LLM R&D benchmarking.
- Advanced Prompt Engineering: Utilizes Chain-of-Thought (CoT) reasoning to deterministically generate independent and dependent patent claims from complex technical disclosures.
- Strict Data Typing (Full-Stack Ready): Enforces rigid data structures using pydantic. All LLM outputs are deterministically cast into schemas, ensuring seamless ingestion into downstream Postgres databases or React/TypeScript frontends.
- SOTA Evaluation Pipeline (LLM-as-a-Judge): Implements an automated benchmarking loop that scores generated IP on Novelty, Clarity, and Enablement, completely removing human-in-the-loop bottlenecks for R&D iteration.
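As an illustration of the strict-typing bullet above, here is a minimal sketch of how raw model output could be validated with pydantic v2. The class names (`Claim`, `ClaimSet`), the field names, and the helper `parse_claims` are assumptions made for this example and are not taken from the repository.

```python
import json
from typing import List, Literal, Optional

from pydantic import BaseModel, Field, ValidationError, model_validator


class Claim(BaseModel):
    """One patent claim. Field names are illustrative assumptions."""

    number: int = Field(ge=1)
    kind: Literal["independent", "dependent"]
    text: str = Field(min_length=1)
    depends_on: Optional[int] = None  # number of the claim this one refers back to

    @model_validator(mode="after")
    def check_dependency(self) -> "Claim":
        # A dependent claim must point at an earlier claim; an independent one must not.
        if self.kind == "dependent":
            if self.depends_on is None or self.depends_on >= self.number:
                raise ValueError("dependent claim must reference an earlier claim")
        elif self.depends_on is not None:
            raise ValueError("independent claim must not set depends_on")
        return self


class ClaimSet(BaseModel):
    invention_id: str
    claims: List[Claim] = Field(min_length=1)


def parse_claims(raw_json: str) -> ClaimSet:
    """Validate raw model output; raises ValidationError on malformed data."""
    return ClaimSet.model_validate_json(raw_json)


# Hypothetical model output used only to exercise the schema.
raw = json.dumps({
    "invention_id": "INV-003",
    "claims": [
        {"number": 1, "kind": "independent", "text": "A method comprising ..."},
        {"number": 2, "kind": "dependent", "text": "The method of claim 1, wherein ...", "depends_on": 1},
    ],
})
claim_set = parse_claims(raw)
print(len(claim_set.claims), "claims validated")
```

Rejecting malformed output at this boundary means a database writer or frontend only ever sees objects that match the schema. A caller could catch `ValidationError` and retry the generation step.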
To run this pipeline, you must have Python 3.10+ installed. The project relies on the following exact dependencies to ensure deterministic execution and schema validation. Save these in a requirements.txt file:
    openai==1.14.2
    pydantic==2.6.4
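Because the pins are exact, it can help to confirm that the active environment actually matches them before running anything. The sketch below uses only the standard library. The helper names `parse_pins` and `find_mismatches` are hypothetical and not part of the project.

```python
from importlib import metadata
from typing import Dict


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Map package name -> pinned version for lines of the form name==version."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins: Dict[str, str]) -> Dict[str, str]:
    """Return {package: installed version or 'not installed'} for each unsatisfied pin."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = "not installed"
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


# The two pins listed above; an empty result means the environment matches.
REQUIREMENTS = "openai==1.14.2\npydantic==2.6.4\n"
print(find_mismatches(parse_pins(REQUIREMENTS)))
```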
- Clone the Repository:
    git clone https://github.com/your-username/auto-ip-evaluator.git
    cd auto-ip-evaluator
- Install Dependencies: (It is highly recommended to use a virtual environment)
    python -m venv venv
    source venv/bin/activate  # On Windows use venv\Scripts\activate
    pip install -r requirements.txt
- Environment Setup: You must provide an OpenAI API key for the generation and evaluation loops to function.
    export OPENAI_API_KEY="your_api_key_here"
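A missing or placeholder key is easier to diagnose if the program fails before any request is sent. The following is a minimal sketch of such a check. The function name `require_api_key` is an assumption for illustration, and whether `main.py` performs a similar check is not stated in this README.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the configured key, or raise a clear error if it is absent.

    The placeholder value from the setup step is treated as "not set".
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == "your_api_key_here":
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running main.py"
        )
    return key
```

Calling `require_api_key()` at the top of an entry point would surface the problem immediately. Passing the environment as a parameter keeps the check testable without touching the real process environment.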
The pipeline is pre-configured to run against a mock dataset of 5 highly technical edge-case disclosures located in data/mock_inventions.json.
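The layout of data/mock_inventions.json is not documented here, so the loader below is only a sketch under stated assumptions: that the file holds a JSON array, and that each record carries `id`, `title`, and `disclosure` keys. The `Invention` dataclass and `parse_inventions` helper are hypothetical names.

```python
import json
from dataclasses import dataclass
from typing import List

REQUIRED_FIELDS = {"id", "title", "disclosure"}  # assumed record layout


@dataclass(frozen=True)
class Invention:
    invention_id: str  # e.g. "INV-003"
    title: str
    disclosure: str  # free-text technical disclosure


def parse_inventions(raw_json: str) -> List[Invention]:
    """Parse a JSON array of invention records, rejecting incomplete ones."""
    records = json.loads(raw_json)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of invention records")
    inventions = []
    for index, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"record {index} is not a JSON object")
        missing = REQUIRED_FIELDS - set(record)
        if missing:
            raise ValueError(f"record {index} is missing fields: {sorted(missing)}")
        inventions.append(
            Invention(record["id"], record["title"], record["disclosure"])
        )
    return inventions
```

With the real file, the call might look like `parse_inventions(open("data/mock_inventions.json", encoding="utf-8").read())`, adjusted to whatever keys the dataset actually uses.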
Run the main orchestrator to watch the generation and evaluation loop execute in real-time:
python main.py
The orchestrator handles both the structured generation and the automated critique, outputting a benchmarked report for each invention:
    --- Processing: INV-003 | Predictive DAG Scheduler for Distributed Inference ---
    [+] Generating Claims via GPT-4-Turbo...
    [+] Successfully generated 3 structured claims.
    [+] Running SOTA 'LLM-as-a-Judge' Evaluation...

    📊 Benchmarking Results:
    Novelty: 9.0/10
    Clarity: 9.5/10
    Enablement: 8.5/10

    Critique: The independent claim successfully isolates the predictive cycle detection as a novel mechanism. The dependency structure is legally sound and unambiguous.
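The generate-then-judge cycle behind a report like this can be sketched as follows. This is not the project's orchestrator: the `Evaluation` dataclass, the `process_invention` function, and the injected `generate` and `judge` callables are assumptions for illustration. The callables stand in for the actual LLM calls so the control flow can run without network access.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Evaluation:
    novelty: float
    clarity: float
    enablement: float
    critique: str

    def __post_init__(self) -> None:
        # Reject judge output that falls outside the 0-10 scale used in the report.
        for name in ("novelty", "clarity", "enablement"):
            score = getattr(self, name)
            if not 0.0 <= score <= 10.0:
                raise ValueError(f"{name} score {score} is outside 0-10")


def process_invention(
    invention_id: str,
    title: str,
    disclosure: str,
    generate: Callable[[str], List[str]],
    judge: Callable[[str, List[str]], Evaluation],
) -> str:
    """Run one generate-then-judge cycle and return the report text."""
    claims = generate(disclosure)
    evaluation = judge(disclosure, claims)
    return "\n".join([
        f"--- Processing: {invention_id} | {title} ---",
        f"[+] Successfully generated {len(claims)} structured claims.",
        "📊 Benchmarking Results:",
        f"Novelty: {evaluation.novelty:.1f}/10",
        f"Clarity: {evaluation.clarity:.1f}/10",
        f"Enablement: {evaluation.enablement:.1f}/10",
        f"Critique: {evaluation.critique}",
    ])


# Stub callables with placeholder values; a real run would call the LLM here.
def stub_generate(disclosure: str) -> List[str]:
    return ["claim 1 (stub)", "claim 2 (stub)"]


def stub_judge(disclosure: str, claims: List[str]) -> Evaluation:
    return Evaluation(5.0, 5.0, 5.0, "stub critique")


print(process_invention("INV-000", "Stub Invention", "stub disclosure", stub_generate, stub_judge))
```

Injecting the two steps as callables keeps the loop testable offline, and range-checking the scores guards against a judge response that drifts off the expected scale.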