Add per-index default search parameters (default_probes, default_ef_search) #933
Open
lossyrob wants to merge 21 commits into pgvector:master
- Spec.md: Feature specification for per-index default search parameters
- SpecResearch.md: Research on GUC source tracking and reloptions
- CodeResearch.md: Code analysis of existing pgvector patterns
- ImplementationPlan.md: 5-phase implementation plan

Implements RFC from pgvector#235
…-option_plan [Search Defaults Index Option] Planning: Implementation plan for per-index search defaults
Phase 1 of implementing per-index default search parameters (Issue pgvector#235).

This commit extends the index options to support:
- default_probes for IVFFlat indexes
- default_ef_search for HNSW indexes

Changes:
- Added defaultProbes field to IvfflatOptions struct
- Added defaultEfSearch field to HnswOptions struct
- Registered new reloptions in IvfflatInit() and HnswInit()
- Added parsing entries in ivfflatoptions() and hnswoptions()
- Implemented getter functions IvfflatGetDefaultProbes() and HnswGetDefaultEfSearch()

The new options use 0 as a sentinel value meaning "unset", which allows the existing GUC defaults to take precedence when no index default is specified.

This phase establishes the foundation; subsequent phases will add:
- GUC source detection for precedence resolution
- Scan integration to use effective values
- Cost estimation updates
- Test coverage
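The sentinel convention described in this commit can be illustrated with a minimal, standalone sketch. It does not use PostgreSQL headers: `SketchIvfflatOptions`, `sketch_get_default_probes()` and `sketch_has_default_probes()` are hypothetical stand-ins invented for illustration, not the PR's actual `IvfflatOptions` layout or `IvfflatGetDefaultProbes()` implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel stored when the option was never specified on the index. */
#define SKETCH_DEFAULT_PROBES_UNSET 0

/* Hypothetical stand-in for the per-index options; not pgvector's real struct. */
typedef struct SketchIvfflatOptions
{
	int lists;          /* number of inverted lists */
	int default_probes; /* per-index default, 0 = unset */
} SketchIvfflatOptions;

/* Return the stored per-index default, or the sentinel when the index
 * carries no options at all (modeled here as a NULL pointer). */
int
sketch_get_default_probes(const SketchIvfflatOptions *opts)
{
	if (opts == NULL)
		return SKETCH_DEFAULT_PROBES_UNSET;
	return opts->default_probes;
}

/* Return 1 when a usable per-index default was supplied, else 0. */
int
sketch_has_default_probes(const SketchIvfflatOptions *opts)
{
	return sketch_get_default_probes(opts) > SKETCH_DEFAULT_PROBES_UNSET;
}
```

Treating "no options" and "option equals 0" identically keeps callers simple: they only need to check whether the returned value is greater than zero.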
…-option_phase1 [Search Defaults Index Option] Phase 1: Index Option Infrastructure
Add IvfflatGetEffectiveProbes() and HnswGetEffectiveEfSearch() functions that implement the precedence rules for resolving search parameters:
1. Explicit SET command takes precedence (source == PGC_S_SESSION)
2. Index default value if set (> 0)
3. GUC default value

These functions use PostgreSQL's find_option() API to detect whether a GUC was explicitly set in the current session, enabling the per-index default to take effect when users haven't explicitly overridden it.

Phase 2 of Search Defaults Index Option implementation.
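The three-step precedence rule can be sketched as a pure function. This is an illustration under stated assumptions, not the PR's code: `SketchGucSource` is a hypothetical two-value enum standing in for PostgreSQL's GUC source tracking, and the sketch follows the commit text in treating every source other than an explicit session `SET` like the default.

```c
#include <assert.h>

/* Hypothetical, simplified model of where the current GUC value came from. */
typedef enum SketchGucSource
{
	SKETCH_SOURCE_DEFAULT, /* nothing was SET in this session */
	SKETCH_SOURCE_SESSION  /* value came from an explicit SET in this session */
} SketchGucSource;

/* Resolve the value a scan should use:
 *   1. an explicit session SET wins,
 *   2. otherwise the index default if it is set (> 0),
 *   3. otherwise the current GUC value (its default). */
int
sketch_effective_value(SketchGucSource source, int guc_value, int index_default)
{
	if (source == SKETCH_SOURCE_SESSION)
		return guc_value;
	if (index_default > 0)
		return index_default;
	return guc_value;
}
```

For example, with an index default of 10, a session that never touched the GUC resolves to 10, while a session that ran `SET` with the value 5 resolves to 5.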
…-option_phase2 [Search Defaults Index Option] Phase 2: GUC Resolution Logic
Update scan functions to use the new effective value resolution functions instead of directly reading GUC variables:
- ivfflatbeginscan(): Use IvfflatGetEffectiveProbes(index) instead of ivfflat_probes for determining the number of probes to use
- GetScanItems(): Use HnswGetEffectiveEfSearch(index) instead of hnsw_ef_search for the HNSW search layer ef parameter
- ResumeScanItems(): Use HnswGetEffectiveEfSearch(index) instead of hnsw_ef_search for the batch size in iterative scans

This implements the precedence rules where:
1. Explicit SET command takes precedence
2. Index default value (if set via default_probes/default_ef_search option)
3. GUC default value (ivfflat.probes=1 or hnsw.ef_search=40)

Phase 3 of 235-search-defaults-index-option implementation.
…-option_phase3 [Search Defaults Index Option] Phase 3: Scan Integration
- ivfflatcostestimate(): Use IvfflatGetEffectiveProbes(index) instead of directly reading ivfflat_probes GUC, enabling index-specific default probes to influence cost estimates
- hnswcostestimate(): Use HnswGetEffectiveEfSearch(index) instead of directly reading hnsw_ef_search GUC, enabling index-specific default ef_search to influence cost estimates

This ensures the query planner uses the same effective search parameter values that will be used at scan time, improving plan quality when indexes have per-index search defaults configured.

Implementation notes:
- Moved index_close() after the effective value function calls to ensure the index relation remains valid when accessing rd_options
- All 14 existing regression tests pass
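The ordering note about index_close() can be illustrated with a small standalone model. Everything here is hypothetical: `SketchIndexRel` stands in for an open index relation, `sketch_index_close()` for releasing it, and the probes-to-lists ratio is only an assumed placeholder for whatever the real cost estimator computes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an open index relation and its stored options. */
typedef struct SketchIndexRel
{
	int is_open;        /* 1 while the handle may be dereferenced */
	int default_probes; /* models a value kept in rd_options; 0 = unset */
} SketchIndexRel;

/* Resolve probes; must be called while the index is still open. */
int
sketch_effective_probes(const SketchIndexRel *index, int guc_probes,
						int guc_set_in_session)
{
	assert(index != NULL && index->is_open);
	if (guc_set_in_session)
		return guc_probes;
	if (index->default_probes > 0)
		return index->default_probes;
	return guc_probes;
}

/* Model of releasing the relation; options must not be read afterwards. */
void
sketch_index_close(SketchIndexRel *index)
{
	index->is_open = 0;
}

/* Assumed cost input: the fraction of lists a scan would visit. */
double
sketch_scan_ratio(SketchIndexRel *index, int lists, int guc_probes,
				  int guc_set_in_session)
{
	/* Read the effective value first ... */
	int probes = sketch_effective_probes(index, guc_probes, guc_set_in_session);

	/* ... and only then release the relation. */
	sketch_index_close(index);

	if (probes > lists)
		probes = lists;
	return (double) probes / (double) lists;
}
```

Swapping the two statements in `sketch_scan_ratio()` would trip the assertion in `sketch_effective_probes()`, which mirrors why the close call was moved after the lookup.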
…-option_phase4 [Search Defaults Index Option] Implementation Phase 4: Cost Estimation
Phase 5: Comprehensive test coverage for index search parameter defaults

IVFFlat tests (test/sql/ivfflat_vector.sql):
- Test CREATE INDEX with default_probes option
- Test query using index default when GUC not explicitly SET
- Test explicit SET ivfflat.probes overrides index default
- Test RESET ivfflat.probes returns to using index default
- Test ALTER INDEX changes default_probes value
- Test ALTER INDEX RESET removes default_probes
- Test default_probes = 0 acts as unset (uses GUC default)
- Test invalid values rejected (default_probes = -1)

HNSW tests (test/sql/hnsw_vector.sql):
- Test CREATE INDEX with default_ef_search option
- Test query using index default when GUC not explicitly SET
- Test explicit SET hnsw.ef_search overrides index default
- Test RESET hnsw.ef_search returns to using index default
- Test ALTER INDEX changes default_ef_search value
- Test ALTER INDEX RESET removes default_ef_search
- Test default_ef_search = 0 acts as unset (uses GUC default)
- Test invalid values rejected (default_ef_search = -1)

All 14 regression tests pass.
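The HNSW scenarios listed above can be walked through with a small state model. This is a sketch, not the regression suite: the `SketchSession` and `SketchHnswIndex` types and every `sketch_*` function are hypothetical, the GUC default of 40 is taken from the Phase 3 commit text, and the only validation modeled is rejecting negative values (the option's real upper bound is not stated in this PR text, so none is assumed).

```c
#include <assert.h>

#define SKETCH_GUC_DEFAULT_EF_SEARCH 40 /* hnsw.ef_search default per the text */

/* Hypothetical session state: current GUC value and whether SET was used. */
typedef struct SketchSession
{
	int ef_search;
	int explicitly_set;
} SketchSession;

/* Hypothetical index state: per-index default, 0 = unset. */
typedef struct SketchHnswIndex
{
	int default_ef_search;
} SketchHnswIndex;

void
sketch_session_init(SketchSession *s)
{
	s->ef_search = SKETCH_GUC_DEFAULT_EF_SEARCH;
	s->explicitly_set = 0;
}

/* Models SET hnsw.ef_search = value */
void
sketch_set(SketchSession *s, int value)
{
	s->ef_search = value;
	s->explicitly_set = 1;
}

/* Models RESET hnsw.ef_search */
void
sketch_reset(SketchSession *s)
{
	sketch_session_init(s);
}

/* Models ALTER INDEX ... SET (default_ef_search = value); -1 means rejected. */
int
sketch_alter_default(SketchHnswIndex *idx, int value)
{
	if (value < 0)
		return -1;
	idx->default_ef_search = value;
	return 0;
}

/* Models ALTER INDEX ... RESET (default_ef_search) */
void
sketch_alter_reset(SketchHnswIndex *idx)
{
	idx->default_ef_search = 0;
}

/* Precedence: explicit SET > index default (> 0) > GUC default. */
int
sketch_effective_ef_search(const SketchSession *s, const SketchHnswIndex *idx)
{
	if (s->explicitly_set)
		return s->ef_search;
	if (idx->default_ef_search > 0)
		return idx->default_ef_search;
	return s->ef_search;
}
```

Each bullet in the HNSW list maps to one call sequence against this model, for example `sketch_set()` followed by `sketch_reset()` returning the effective value to the index default.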
…-option_phase5 [Search Defaults Index Option] Phase 5: Tests
- Create comprehensive Docs.md for the Search Defaults Index Option feature
- Update README.md with documentation for new index options
- Add CHANGELOG entry for the new feature
…-option_docs [Search Defaults Index Option] Documentation
Search Defaults Index Option
Summary
This PR adds per-index default search parameters (`default_probes` for IVFFlat and `default_ef_search` for HNSW) that automatically configure search behavior without requiring session-level `SET` commands. Index defaults take effect when no explicit session setting is active, while still respecting user overrides.
Problem Solved
Before this feature, configuring search parameters like the number of IVFFlat probes or HNSW ef_search required:
- `SET LOCAL` statements

This created complexity when managing multiple tables with different accuracy/performance tradeoffs, partitioned tables with per-partition indexes, or applications sharing database connections.
Solution
Per-index defaults allow administrators to specify optimal search parameters at index creation time. Queries automatically use the appropriate settings based on which index is selected.
Related Issues
Artifacts
Changes Summary
Key Changes
- `default_probes` - specifies default number of probes when no session SET is active
- `default_ef_search` - specifies default ef_search value when no session SET is active
- Precedence: `SET` > Index default > GUC default
Files Modified
- `src/ivfflat.h`, `src/ivfflat.c`, `src/ivfutils.c`, `src/ivfscan.c` - IVFFlat support
- `src/hnsw.h`, `src/hnsw.c`, `src/hnswutils.c`, `src/hnswscan.c` - HNSW support
- `test/sql/ivfflat_vector.sql`, `test/expected/ivfflat_vector.out` - IVFFlat tests
- `test/sql/hnsw_vector.sql`, `test/expected/hnsw_vector.out` - HNSW tests
- `README.md` - User documentation
- `CHANGELOG.md` - Release notes
Usage Examples
IVFFlat
HNSW
Modify After Creation
Testing
All 14 regression tests pass.
Test Coverage
Acceptance Criteria
- `default_probes = N` uses N probes during search
- `default_ef_search = N` uses ef_search of N during search
- `SET` in session overrides any index default
Deployment Considerations
Breaking Changes
None. This is a purely additive feature that preserves full backward compatibility.