@Chadha93
Collaborator

@Chadha93 Chadha93 commented Oct 10, 2025

Summary by cubic

Adds recursive JSON schema support to the Python SDK with OpenAI-compatible normalization/validation and automatic model selection. Improves reliability for extract/jsonOptions when using $ref/$defs.

  • New Features
    • Support recursive schemas ($ref, $defs) across extract and jsonOptions for all v1 methods (sync, async, batch, watch) and v2 validation.
    • Normalize schemas for OpenAI: keep $ref, remove conflicting additionalProperties on objects with properties, and clean invalid required fields.
    • Validate and raise a clear error for schema-less objects (additionalProperties: true without properties).
    • Auto-select model based on schema complexity: gpt-4o for recursive schemas, gpt-4o-mini otherwise; selection added to _model_info for reference.
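The model auto-selection described in the last bullet can be sketched roughly as follows. This is only an illustration, not the SDK's code: the helper names `uses_refs` and `select_model` are hypothetical, and it assumes that "recursive" simply means the schema contains a `$ref` or `$defs` key anywhere, including inside list-valued keywords such as `anyOf`.

```python
from typing import Any


def uses_refs(node: Any) -> bool:
    """Return True if any nested dict has a "$ref" or "$defs" key (assumed criterion)."""
    if isinstance(node, dict):
        if "$ref" in node or "$defs" in node:
            return True
        return any(uses_refs(value) for value in node.values())
    if isinstance(node, list):
        # Descend into list-valued keywords such as anyOf / oneOf.
        return any(uses_refs(item) for item in node)
    return False


def select_model(schema: dict) -> str:
    """Hypothetical helper: pick the larger model only for schemas that use references."""
    return "gpt-4o" if uses_refs(schema) else "gpt-4o-mini"


flat = {"type": "object", "properties": {"title": {"type": "string"}}}
tree = {"anyOf": [{"type": "string"}, {"$ref": "#/$defs/Node"}]}
print(select_model(flat), select_model(tree))
```

A real implementation might use a stricter test (for example, checking that a `$ref` actually forms a cycle), which the summary does not specify.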

@Chadha93 Chadha93 force-pushed the chadha93/py-recursion-fix branch from 3087745 to 944bbc3 Compare October 10, 2025 16:02
@Chadha93
Collaborator Author

Tested OK:
[screenshot]

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


4 issues found across 3 files

Prompt for AI agents (all 4 issues)

Understand the root cause of the following 4 issues and fix them.


<file name="apps/python-sdk/firecrawl/v2/utils/validation.py">

<violation number="1" location="apps/python-sdk/firecrawl/v2/utils/validation.py:5">
Importing WeakSet from typing raises an ImportError (WeakSet is provided by the weakref module, not typing), so this module will fail to load until the import is removed or corrected.</violation>

<violation number="2" location="apps/python-sdk/firecrawl/v2/utils/validation.py:93">
Normalization skips list-valued schema nodes (anyOf/oneOf/items), so conflicting properties inside those arrays are never cleaned, breaking the intended OpenAI normalization for common recursive schemas.</violation>

<violation number="3" location="apps/python-sdk/firecrawl/v2/utils/validation.py:137">
Validation only descends into direct dict values and skips dicts inside arrays like anyOf/oneOf, allowing unsupported schema-less objects to slip through and fail later against OpenAI.</violation>
</file>
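Violations 2 and 3 share a root cause: the traversal only follows dict values. A minimal sketch of a walk that also descends into lists is shown below. The function name `find_schemaless_objects` is hypothetical and the logic is an assumption based on the review text, not the SDK's actual validator.

```python
from typing import Any, List


def find_schemaless_objects(node: Any, path: str = "#") -> List[str]:
    """Collect paths of nodes with additionalProperties: true and no properties."""
    found: List[str] = []
    if isinstance(node, dict):
        if node.get("additionalProperties") is True and not node.get("properties"):
            found.append(path)
        for key, value in node.items():
            found.extend(find_schemaless_objects(value, f"{path}/{key}"))
    elif isinstance(node, list):
        # The step the review says was missing: visit dicts inside arrays
        # such as anyOf / oneOf instead of skipping them.
        for index, item in enumerate(node):
            found.extend(find_schemaless_objects(item, f"{path}/{index}"))
    return found


schema = {
    "anyOf": [
        {"type": "string"},
        {"type": "object", "additionalProperties": True},
    ]
}
offenders = find_schemaless_objects(schema)
if offenders:
    print("schema-less objects at:", offenders)
```

A validator built on this could raise a `ValueError` listing the offending paths, which would give callers the clear error the summary mentions.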

<file name="apps/python-sdk/firecrawl/v1/client.py">

<violation number="1" location="apps/python-sdk/firecrawl/v1/client.py:2869">
Removing additionalProperties for every object means callers who require additionalProperties: false lose that constraint, so extra fields slip through. Please only drop the flag when it is explicitly True to avoid regressing strict schemas.</violation>
</file>
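The narrower behavior requested here can be sketched as below. `drop_permissive_additional_properties` is a hypothetical name and this is an assumption about the intended fix, not the code in `client.py`: the flag is removed only when it is explicitly `True` on an object that declares `properties`, so `additionalProperties: false` survives.

```python
from typing import Any


def drop_permissive_additional_properties(node: Any) -> Any:
    """Return a cleaned copy; strict (False) flags are left untouched."""
    if isinstance(node, dict):
        cleaned = {
            key: drop_permissive_additional_properties(value)
            for key, value in node.items()
        }
        # Only the explicit True case is treated as conflicting.
        if cleaned.get("additionalProperties") is True and "properties" in cleaned:
            del cleaned["additionalProperties"]
        return cleaned
    if isinstance(node, list):
        return [drop_permissive_additional_properties(item) for item in node]
    return node


strict = {"type": "object", "properties": {"a": {"type": "string"}}, "additionalProperties": False}
loose = {"type": "object", "properties": {"a": {"type": "string"}}, "additionalProperties": True}
print(drop_permissive_additional_properties(strict))
print(drop_permissive_additional_properties(loose))
```

Using an identity check (`is True`) rather than truthiness also leaves schema-valued `additionalProperties` (a dict) in place.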


@Chadha93
Collaborator Author

@cubic-dev-ai re-review

@cubic-dev-ai
Contributor

cubic-dev-ai bot commented Oct 10, 2025

@cubic-dev-ai re-review

@Chadha93 I've started the AI code review. It'll take a few minutes to complete.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 3 files

Prompt for AI agents (all 2 issues)

Understand the root cause of the following 2 issues and fix them.


<file name="apps/python-sdk/firecrawl/v1/client.py">

<violation number="1" location="apps/python-sdk/firecrawl/v1/client.py:2862">
`$defs` entries never go through `_normalize_schema_for_openai`, so recursive definitions retain disallowed flags like `additionalProperties: true`, and recursive schemas remain incompatible with OpenAI.</violation>
</file>

<file name="apps/python-sdk/firecrawl/v2/utils/validation.py">

<violation number="1" location="apps/python-sdk/firecrawl/v2/utils/validation.py:59">
Returning as soon as $defs is present skips the additionalProperties/required cleanup for the current object, so recursive schemas still get sent to OpenAI with the incompatible flags and will be rejected.</violation>
</file>
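Both findings in this round concern `$defs`: the definitions are never normalized, and the enclosing object is skipped once `$defs` is seen. One way to address both is sketched below. The function `normalize` is hypothetical, and it assumes "cleaning invalid required fields" means dropping `required` entries that are not declared in `properties`. It is not the SDK's `_normalize_schema_for_openai`.

```python
from typing import Any, Dict


def normalize(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a normalized shallow copy; $ref nodes pass through unchanged."""
    out = dict(schema)

    # Normalize every definition instead of returning early when $defs exists.
    defs = out.get("$defs")
    if isinstance(defs, dict):
        out["$defs"] = {
            name: normalize(definition) if isinstance(definition, dict) else definition
            for name, definition in defs.items()
        }

    # Then continue with the cleanup for the current object.
    props = out.get("properties")
    if isinstance(props, dict):
        out["properties"] = {
            key: normalize(value) if isinstance(value, dict) else value
            for key, value in props.items()
        }
        if out.get("additionalProperties") is True:
            del out["additionalProperties"]
        if isinstance(out.get("required"), list):
            # Assumed meaning of "invalid": names with no matching property.
            out["required"] = [name for name in out["required"] if name in props]
    return out


schema = {
    "type": "object",
    "properties": {"root": {"$ref": "#/$defs/Node"}},
    "required": ["root", "ghost"],
    "additionalProperties": True,
    "$defs": {
        "Node": {
            "type": "object",
            "properties": {"children": {"type": "array", "items": {"$ref": "#/$defs/Node"}}},
            "additionalProperties": True,
        }
    },
}
print(normalize(schema))
```

For brevity this sketch does not descend into `items` or `anyOf`/`oneOf`; a complete version would combine it with a list-aware traversal.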


@micahstairs micahstairs removed their request for review October 11, 2025 00:14
Chadha93 and others added 3 commits October 22, 2025 17:44
rebase
Signed-off-by: Gaurav Chadha <chadha93@192.168.1.13>
Signed-off-by: Gaurav Chadha <gauravchadha1676@gmail.com>
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
@Chadha93 Chadha93 force-pushed the chadha93/py-recursion-fix branch from 14e3802 to 825dcaf Compare October 22, 2025 12:16
Signed-off-by: Gaurav Chadha <gauravchadha1676@gmail.com>
@Chadha93
Collaborator Author

@cubic-dev-ai re-review

@cubic-dev-ai
Contributor

cubic-dev-ai bot commented Oct 22, 2025

@cubic-dev-ai re-review

@Chadha93 I've started the AI code review. It'll take a few minutes to complete.

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 2 files
