dedup order_by by value and id#959
Conversation
✅ Deploy Preview for poetic-froyo-8baba7 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
📝 WalkthroughWalkthroughThe update modifies the In the test suite under ✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
I need to check the errors in tests |
it's just float precision, isn't it? |
|
I ran a simple test and results seem to be the same from qdrant_client import QdrantClient, models
cl = QdrantClient()
if cl.collection_exists("test"):
cl.delete_collection("test")
cl.create_collection(
"test", vectors_config=models.VectorParams(size=2, distance=models.Distance.COSINE)
)
cl.upsert(
"test",
points=[
models.PointStruct(
id=1,
vector=[0.2, 0.3],
payload={"a": [1, 2, 3, 4]},
),
models.PointStruct(id=2, vector=[0.3, 0.2], payload={"a": [2, 3, 4, 5]}),
],
)
cl.create_payload_index("test", "a", models.PayloadSchemaType.INTEGER, wait=True)
import pprint
pprint.pprint(cl.query_points("test", query=models.OrderByQuery(order_by="a")).points)Output: [ScoredPoint(id=1, version=0, score=0.0, payload={'a': [1, 2, 3, 4]}, vector=None, shard_key=None, order_value=1),
ScoredPoint(id=2, version=0, score=0.0, payload={'a': [2, 3, 4, 5]}, vector=None, shard_key=None, order_value=2),
ScoredPoint(id=1, version=0, score=0.0, payload={'a': [1, 2, 3, 4]}, vector=None, shard_key=None, order_value=2),
ScoredPoint(id=2, version=0, score=0.0, payload={'a': [2, 3, 4, 5]}, vector=None, shard_key=None, order_value=3),
ScoredPoint(id=1, version=0, score=0.0, payload={'a': [1, 2, 3, 4]}, vector=None, shard_key=None, order_value=3),
ScoredPoint(id=2, version=0, score=0.0, payload={'a': [2, 3, 4, 5]}, vector=None, shard_key=None, order_value=4),
ScoredPoint(id=1, version=0, score=0.0, payload={'a': [1, 2, 3, 4]}, vector=None, shard_key=None, order_value=4),
ScoredPoint(id=2, version=0, score=0.0, payload={'a': [2, 3, 4, 5]}, vector=None, shard_key=None, order_value=5)]And `QdrantClient(":memory:") output with this PR is: [ScoredPoint(id=1, version=0, score=0.0, payload={'a': [1, 2, 3, 4]}, vector=None, shard_key=None, order_value=None),
ScoredPoint(id=1, version=0, score=0.0, payload={'a': [1, 2, 3, 4]}, vector=None, shard_key=None, order_value=None),
ScoredPoint(id=2, version=0, score=0.0, payload={'a': [2, 3, 4, 5]}, vector=None, shard_key=None, order_value=None),
ScoredPoint(id=1, version=0, score=0.0, payload={'a': [1, 2, 3, 4]}, vector=None, shard_key=None, order_value=None),
ScoredPoint(id=2, version=0, score=0.0, payload={'a': [2, 3, 4, 5]}, vector=None, shard_key=None, order_value=None),
ScoredPoint(id=1, version=0, score=0.0, payload={'a': [1, 2, 3, 4]}, vector=None, shard_key=None, order_value=None),
ScoredPoint(id=2, version=0, score=0.0, payload={'a': [2, 3, 4, 5]}, vector=None, shard_key=None, order_value=None),
ScoredPoint(id=2, version=0, score=0.0, payload={'a': [2, 3, 4, 5]}, vector=None, shard_key=None, order_value=None)]However, we're not filling |
|
I addressed it and added a test |
|
✅ can't approve my own PR |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/congruence_tests/test_scroll_order_by.py (1)
131-147: Good test case for deduplication with overlapping valuesThis test effectively validates the deduplication functionality by:
- Creating points with overlapping integer array values (2, 3, 4)
- Verifying consistent behavior across all client types (local, HTTP, gRPC)
Consider enhancing this test with additional edge cases:
fixture_points = [ models.PointStruct(id=1, vector=[], payload={"integer_array": [1, 2, 3, 4]}), models.PointStruct(id=2, vector=[], payload={"integer_array": [2, 3, 4, 5]}), + models.PointStruct(id=3, vector=[], payload={"integer_array": [3, 4, 5, 6]}), + models.PointStruct(id=4, vector=[], payload={"integer_array": []}), + models.PointStruct(id=5, vector=[], payload={"integer_array": [4, 4, 4]}), ]This would test empty arrays and repeated values within the same array.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
qdrant_client/local/local_collection.py(2 hunks)tests/congruence_tests/test_scroll_order_by.py(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- qdrant_client/local/local_collection.py
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: Python 3.13.x on ubuntu-latest test
- GitHub Check: Python 3.12.x on ubuntu-latest test
- GitHub Check: Python 3.11.x on ubuntu-latest test
- GitHub Check: Python 3.10.x on ubuntu-latest test
- GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (3)
tests/congruence_tests/test_scroll_order_by.py (3)
4-4: Good addition of the to_order_value importThis import is necessary for the new functionality that standardizes payload values for correct comparison, especially when handling array values.
41-55: Updated scroll logic correctly handles order_value and arraysThe changes appropriately refactor the scroll logic in two important ways:
- Using
record.order_valueinstead of direct payload access, which aligns with the deduplication changes- Adding proper handling for both scalar and list payload values, with consistent value conversion
The implementation correctly handles arrays by converting each item and checking if any matches the last seen value, which is essential for the deduplication feature to work properly.
96-98: Clean helper function additionThis helper function follows the same pattern as the existing ones, maintaining code consistency while supporting the new test case for integer arrays.
In congruence with qdrant/qdrant#6233, we deduplicate same value and id points