feat(agent/filefinder): add plocate-lite file finder package #22453

Merged
kylecarbs merged 9 commits into main from filefinder-v1
Mar 1, 2026

@kylecarbs
Member

Adds an in-memory trigram-indexed file finder package at agent/filefinder, designed to power a future FindFiles HTTP handler on the WorkspaceAgent.

What it does

Fast fuzzy file search with VS Code-quality matching across millions of files. Search latency stays sub-millisecond at 100K files (~535µs for a fuzzy query).

Architecture

  • Index: append-only docs slice with trigram + prefix posting lists
  • Snapshot: lock-free reader view via frozen slice headers + shallow-copied deleted set
  • Search pipeline: trigram intersection → fuzzy fallback (prefix bucket + subsequence) → brute-force scan (capped at 5K docs)
  • Scoring: subsequence match, basename prefix, boundary hits, contiguous runs, depth/length penalties
  • Engine: multi-root with fsnotify watcher (50ms batch coalescing), atomic snapshot publishing
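The first stage of the search pipeline above (trigram intersection over posting lists) can be sketched roughly as follows. This is a minimal illustration, not the package's code: `trigramIndex`, `add`, `intersect`, and `candidates` are hypothetical names, trigrams are taken over lowercased bytes, and deletion, prefix buckets, and scoring are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// trigramIndex is an illustrative stand-in, not the package's real Index type.
type trigramIndex struct {
	docs     []string            // append-only; a doc ID is its position here
	postings map[string][]uint32 // trigram -> ascending doc IDs
}

func newTrigramIndex() *trigramIndex {
	return &trigramIndex{postings: make(map[string][]uint32)}
}

// add appends a path and records its ID once under each distinct trigram.
func (ix *trigramIndex) add(path string) {
	id := uint32(len(ix.docs))
	ix.docs = append(ix.docs, path)
	norm := strings.ToLower(path)
	seen := make(map[string]bool)
	for i := 0; i+3 <= len(norm); i++ {
		tri := norm[i : i+3]
		if !seen[tri] {
			seen[tri] = true
			ix.postings[tri] = append(ix.postings[tri], id)
		}
	}
}

// intersect merges two ascending posting lists and keeps the common IDs.
func intersect(a, b []uint32) []uint32 {
	var out []uint32
	for i, j := 0, 0; i < len(a) && j < len(b); {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

// candidates returns the paths that contain every trigram of the query.
// ok is false for queries shorter than three bytes, which a caller would
// hand to a fallback strategy instead.
func (ix *trigramIndex) candidates(query string) (paths []string, ok bool) {
	q := strings.ToLower(query)
	if len(q) < 3 {
		return nil, false
	}
	ids := ix.postings[q[:3]]
	for i := 1; i+3 <= len(q); i++ {
		ids = intersect(ids, ix.postings[q[i:i+3]])
	}
	for _, id := range ids {
		paths = append(paths, ix.docs[id])
	}
	return paths, true
}

func main() {
	ix := newTrigramIndex()
	for _, p := range []string{"cmd/server/handler.go", "internal/handler/route.go", "README.md"} {
		ix.add(p)
	}
	paths, ok := ix.candidates("handler")
	fmt.Println(paths, ok)
	_, ok = ix.candidates("ha")
	fmt.Println(ok)
}
```

Sharing every trigram is a necessary condition for a substring match, not a sufficient one, so the surviving IDs are only candidates that still need scoring.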

Benchmarks (10K files)

| Query Type | Latency |
|---|---|
| exact_basename (handler.go) | ~43µs |
| short_query (ha) | ~7µs |
| fuzzy_basename (hndlr) | ~50µs |
| path_structured (internal/handler) | ~29µs |
| multi_token (api handler) | ~15µs |

File inventory (11 files, 3273 lines)

| File | Lines | Purpose |
|---|---|---|
| text.go | 264 | Normalization, trigram extraction, scoring |
| delta.go | 128 | Index, Snapshot, CRUD operations |
| query.go | 272 | Query planning, search strategies, top-K merge |
| engine.go | 323 | Multi-root engine, watcher integration |
| watcher_fs.go | 201 | fsnotify wrapper with batch coalescing |
| *_test.go | 2085 | Unit tests, integration tests, benchmarks |

Coder added 3 commits March 1, 2026 03:16
In-memory trigram index with fuzzy matching for fast file search.

- Index: append-only docs with trigram/prefix posting lists
- Snapshot: lock-free reader view via frozen slice headers
- Search: trigram intersection → fuzzy fallback → brute-force scan
- Scoring: subsequence, basename prefix, boundary hits, contiguous runs
- Engine: multi-root with fsnotify watcher (50ms batch coalescing)
- Benchmarks: ~535µs fuzzy search at 100K files, ~2.4KB/file build cost
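To make the scoring signals listed above concrete, here is a rough sketch of a subsequence scorer. Every weight is an arbitrary assumption chosen for illustration, `score` and `isBoundary` are hypothetical names, and matching is greedy leftmost, which a production scorer would likely improve on.

```go
package main

import (
	"fmt"
	"strings"
)

// isBoundary reports whether c separates path or name segments.
func isBoundary(c byte) bool { return c == '/' || c == '_' || c == '-' || c == '.' }

// score reports whether query is a case-insensitive subsequence of path and,
// if so, a heuristic score. All weights are illustrative assumptions.
func score(query, path string) (int, bool) {
	q, p := strings.ToLower(query), strings.ToLower(path)
	if q == "" {
		return 0, false
	}
	total, qi, prev := 0, 0, -2
	for pi := 0; pi < len(p) && qi < len(q); pi++ {
		if p[pi] != q[qi] {
			continue
		}
		total++ // base credit for each matched character
		if pi == 0 || isBoundary(p[pi-1]) {
			total += 4 // boundary hit: match starts a segment
		}
		if pi == prev+1 {
			total += 2 // contiguous run: match directly follows the previous one
		}
		prev = pi
		qi++
	}
	if qi < len(q) {
		return 0, false // not a subsequence
	}
	base := p[strings.LastIndexByte(p, '/')+1:]
	if strings.HasPrefix(base, q) {
		total += 10 // basename prefix bonus
	}
	total -= strings.Count(p, "/") // depth penalty
	total -= len(p) / 16           // length penalty
	return total, true
}

func main() {
	for _, path := range []string{"src/handler.go", "a/b/c/my_handler_impl.go"} {
		s, ok := score("handler", path)
		fmt.Println(path, s, ok)
	}
	_, ok := score("xyz", "src/handler.go")
	fmt.Println(ok)
}
```

With these weights a shallow path whose basename starts with the query outranks a deeper path that merely contains it, which is the ordering the listed signals aim for.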
- Merge format.go into delta.go, delete format.go
- Extract shared walkRoot helper in engine.go
- Replace hand-rolled sorts with slices.SortFunc
- Merge scoredResult into Result (eliminate redundant struct)
- Replace depth counting loop with strings.Count
- Compact skipDirs map to 3 lines
- Trim all doc/inline comments to minimal form
- Remove unused hashBasename function

Source: 1528 -> 1333 lines (-12.8%)
Tests: unchanged quality (removed only hashBasename test)
Benchmarks: identical performance
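The two simplifications named in this commit (`slices.SortFunc` instead of hand-rolled sorts, `strings.Count` instead of a depth loop) might look roughly like this. The `Result` fields and the tie-breaking order are assumptions made for the sketch; the PR only names the type.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
	"strings"
)

// Result mirrors the merged result type in name only; the fields are assumed.
type Result struct {
	Path  string
	Score int
}

// depth counts path separators instead of looping over the bytes by hand.
func depth(path string) int { return strings.Count(path, "/") }

// rank sorts by score descending, then by depth ascending, then by path.
func rank(results []Result) {
	slices.SortFunc(results, func(a, b Result) int {
		if c := cmp.Compare(b.Score, a.Score); c != 0 {
			return c
		}
		if c := cmp.Compare(depth(a.Path), depth(b.Path)); c != 0 {
			return c
		}
		return cmp.Compare(a.Path, b.Path)
	})
}

func main() {
	rs := []Result{{"a/b/c.go", 5}, {"x.go", 9}, {"a/c.go", 5}}
	rank(rs)
	fmt.Println(rs)
}
```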
…145 lines)

- Merge normalize.go + score.go into text.go
- Remove unused IncludeHidden field from SearchOptions
- Add copyPostings[K] generic helper in delta.go
- Aggressive blank line removal across all source files
- Compact struct literals and single-line returns

Source: 1333 -> 1188 lines (-10.9%)
Total: 3418 -> 3273 lines
Benchmarks: identical performance, race-clean
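The PR names a `copyPostings[K]` generic helper but does not show it. One plausible shape, given the "frozen slice headers" snapshot design, is sketched below; the signature and body are guesses, not the merged code.

```go
package main

import "fmt"

// copyPostings is a hypothetical reconstruction: it copies the map but not the
// ID arrays. Each copied slice header has its length (and here also its
// capacity) frozen, so later appends to the live index are invisible to
// whoever holds the copy.
func copyPostings[K comparable](src map[K][]uint32) map[K][]uint32 {
	dst := make(map[K][]uint32, len(src))
	for k, ids := range src {
		dst[k] = ids[:len(ids):len(ids)]
	}
	return dst
}

func main() {
	live := map[string][]uint32{"han": {0, 1}}
	snap := copyPostings(live)

	// The writer keeps appending to the live index after the snapshot is taken.
	live["han"] = append(live["han"], 2)
	live["ler"] = []uint32{2}

	fmt.Println(len(live["han"]), len(snap["han"]), len(snap))
}
```

This only stays safe because the index is append-only: existing elements are never rewritten, so a reader that stops at the frozen length never observes a concurrent write.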
@github-actions
github-actions bot commented Mar 1, 2026


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by posting a Pull Request comment in the format below.


I have read the CLA Document and I hereby sign the CLA


Coder does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

Coder added 6 commits March 1, 2026 03:41
- Merge normalize_test.go + score_test.go into text_test.go
- Merge 9 TestNewQueryPlan_* into single table-driven test
- Extract newTestEngine/requireResultHasPath helpers in engine_test
- Replace buildIndex walk with walkRoot call in bench_test
- Remove unused buildSearchableSnapshot
- Replace uint32SlicesEqual with slices.Equal
- Compact table-driven test cases to single-line syntax
- Remove section dividers, obvious comments, blank lines

Test files: 2085 -> 1402 lines
Total: 3273 -> 2590 lines
All tests pass, race-clean, benchmarks identical

- Move test files to package filefinder_test with export_test.go bridge
- Export candidate/queryPlan struct fields for test access
- Fix gosec G115 (int->uint32 overflow) with nolint annotations
- Fix gosec G306 (file permissions 0o644 -> 0o600)
- Fix forcetypeassert (checked heap type assertions)
- Fix gocritic short log messages (>=16 chars)
- Fix gocritic magic numbers (use testutil.WaitShort/IntervalFast)
- Fix paralleltest (TestMemoryProfile)
- Fix revive unhandled-error (f.Close)
- Fix staticcheck SA6001 (inline string conversion in map lookup)

Critical fixes:
- Fix timer reset bug in watcher batch loop (timer not nil'd after flush)
- Move walkRoot() I/O outside mutex in AddRoot and Rebuild
- Use context.Background() for watcher context instead of caller's ctx

Improvements:
- Add package doc comment
- Add exported ErrClosed sentinel error
- Unexport scoreParams/defaultScoreParams (not part of public API yet)
- Add explanatory comment for DirTokenHit scoring location
- Remove redundant Snapshot.count field (use len(docs))
- Remove unnecessary sc := sc loop capture (Go 1.22+)
- Add doc comments on exported types and key internal functions
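The timer fix in the watcher batch loop can be illustrated with a generic coalescing loop. This sketch uses a plain string channel rather than fsnotify, and `coalesce` and `runDemo` are hypothetical names; it only shows why the timer has to be cleared after a flush so that the next event starts a fresh window.

```go
package main

import (
	"fmt"
	"time"
)

// coalesce groups events from in into batches. A batch is flushed one window
// after its first event arrives, or when in is closed.
func coalesce(in <-chan string, window time.Duration, flush func([]string)) {
	var (
		batch []string
		timer *time.Timer
		fire  <-chan time.Time // nil while idle; receiving from nil blocks forever
	)
	for {
		select {
		case ev, ok := <-in:
			if !ok {
				if timer != nil {
					timer.Stop()
				}
				if len(batch) > 0 {
					flush(batch)
				}
				return
			}
			batch = append(batch, ev)
			if timer == nil { // first event of a new batch arms the timer
				timer = time.NewTimer(window)
				fire = timer.C
			}
		case <-fire:
			flush(batch)
			batch = nil
			// Clearing the timer here is the important step: if it stayed
			// non-nil, the next event would not arm a new window and its
			// batch would never be flushed by the timer.
			timer, fire = nil, nil
		}
	}
}

// runDemo sends one burst, waits past the window, then sends one more event.
func runDemo() [][]string {
	in := make(chan string)
	var batches [][]string
	done := make(chan struct{})
	go func() {
		defer close(done)
		coalesce(in, 50*time.Millisecond, func(b []string) { batches = append(batches, b) })
	}()
	for _, ev := range []string{"a", "b", "c"} {
		in <- ev
	}
	time.Sleep(200 * time.Millisecond) // let the first window expire
	in <- "d"
	close(in)
	<-done
	return batches
}

func main() {
	fmt.Println(runDemo())
}
```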
…heck in AddRoot

- Capture idx.Len() while holding the lock to prevent race with
  concurrent applyEvents mutating the index.
- Re-check e.closed after re-acquiring the lock in AddRoot to
  prevent adding roots to a closed engine.
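The locking pattern described here and in the earlier "move walkRoot() I/O outside mutex" fix can be sketched as below. The `Engine` fields, the `AddRoot` signature, the injected `walk` function, and the `ErrClosed` message are all assumptions; only the names `AddRoot`, `walkRoot`, and `ErrClosed` come from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrClosed is named in the PR; the message text here is made up.
var ErrClosed = errors.New("filefinder: engine closed")

// Engine is a reduced stand-in for the real multi-root engine.
type Engine struct {
	mu     sync.Mutex
	closed bool
	roots  map[string][]string
}

func NewEngine() *Engine { return &Engine{roots: make(map[string][]string)} }

// AddRoot walks root without holding the lock, then re-checks closed before
// publishing. walk stands in for the package's walkRoot helper.
func (e *Engine) AddRoot(root string, walk func(string) ([]string, error)) error {
	e.mu.Lock()
	closed := e.closed
	e.mu.Unlock()
	if closed {
		return ErrClosed
	}

	// Slow filesystem I/O happens here, with the mutex released.
	paths, err := walk(root)
	if err != nil {
		return err
	}

	e.mu.Lock()
	defer e.mu.Unlock()
	if e.closed { // Close may have run while the walk was in progress
		return ErrClosed
	}
	e.roots[root] = paths
	return nil
}

// Close marks the engine closed; later AddRoot calls fail with ErrClosed.
func (e *Engine) Close() {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.closed = true
}

// NumRoots reads the root count under the lock.
func (e *Engine) NumRoots() int {
	e.mu.Lock()
	defer e.mu.Unlock()
	return len(e.roots)
}

func main() {
	e := NewEngine()
	fmt.Println(e.AddRoot("/a", func(string) ([]string, error) { return []string{"/a/x.go"}, nil }))

	// Simulate Close racing with a walk that is already running.
	err := e.AddRoot("/b", func(string) ([]string, error) {
		e.Close()
		return []string{"/b/y.go"}, nil
	})
	fmt.Println(err, e.NumRoots())
}
```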
The 100K-file memory profile test created real filesystem trees and
took ~5s, dominating the test suite runtime. Convert it to a
BenchmarkMemoryProfile that only runs with -bench, using
b.ReportMetric for bytes/file stats.

Test suite time: 8.6s -> 1.1s
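A benchmark that reports a custom bytes/file metric through `b.ReportMetric` could look roughly like this. It is a sketch under loud assumptions: `buildIndex` builds a synthetic in-memory trigram map instead of walking a real filesystem tree, the file count is reduced to 10,000, and `measureBytesPerFile` exists only so the example can run outside `go test`.

```go
package main

import (
	"fmt"
	"runtime"
	"testing"
)

// buildIndex creates a synthetic trigram -> IDs map for n made-up paths.
func buildIndex(n int) map[string][]uint32 {
	idx := make(map[string][]uint32)
	for i := 0; i < n; i++ {
		path := fmt.Sprintf("pkg%03d/file%05d.go", i%100, i)
		for j := 0; j+3 <= len(path); j++ {
			tri := path[j : j+3]
			idx[tri] = append(idx[tri], uint32(i))
		}
	}
	return idx
}

// BenchmarkMemoryProfile measures heap growth per indexed file and reports it
// as a custom metric. Under "go test" it runs only when -bench selects it.
func BenchmarkMemoryProfile(b *testing.B) {
	const files = 10_000
	var before, after runtime.MemStats
	for i := 0; i < b.N; i++ {
		runtime.GC()
		runtime.ReadMemStats(&before)
		idx := buildIndex(files)
		runtime.GC()
		runtime.ReadMemStats(&after)
		growth := int64(after.HeapAlloc) - int64(before.HeapAlloc)
		b.ReportMetric(float64(growth)/files, "bytes/file")
		runtime.KeepAlive(idx) // keep the index live until after the measurement
	}
}

// measureBytesPerFile runs the benchmark programmatically and returns the
// reported metric.
func measureBytesPerFile() float64 {
	testing.Init() // registers testing flags when not started by "go test"
	res := testing.Benchmark(BenchmarkMemoryProfile)
	return res.Extra["bytes/file"]
}

func main() {
	fmt.Printf("%.0f bytes/file\n", measureBytesPerFile())
}
```

In a real package the benchmark would live in a `_test.go` file, so it stays out of the regular test run and executes only when selected with `-bench`.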
@kylecarbs kylecarbs merged commit 68e4155 into main Mar 1, 2026
24 of 25 checks passed
@kylecarbs kylecarbs deleted the filefinder-v1 branch March 1, 2026 04:37
@github-actions github-actions bot locked and limited conversation to collaborators Mar 1, 2026