Fix == for Document #1177

@samsja

Context

The == operator does not work properly for BaseDocument, even though __eq__ is implemented at the Pydantic BaseModel level.

Example:

from docarray import BaseDocument
from docarray.typing import NdArray

import numpy as np

class MyDoc(BaseDocument):
    title: str
    tensor: NdArray

a = MyDoc(title='hello', tensor=np.zeros(5), id=1)
b = MyDoc(title='hello', tensor=np.zeros(5), id=1)

assert a == b
    assert a == b
  File "pydantic/main.py", line 909, in pydantic.main.BaseModel.__eq__
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The comparison fails even without a tensor field:

class MyDoc(BaseDocument):
    title: str

a = MyDoc(title='hello')
b = MyDoc(title='hello')

assert a == b
    assert a == b
  File "pydantic/main.py", line 909, in pydantic.main.BaseModel.__eq__
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The comparison only works if the ids match as well:

from docarray import BaseDocument
from docarray.typing import NdArray

import numpy as np

class MyDoc(BaseDocument):
    title: str

a = MyDoc(title='hello', id=1)
b = MyDoc(title='hello', id=1)

assert a == b

DocumentArray

We should support the same in DocumentArray.

da1 == da2 should work the same way Python does it for lists. This is one of the reasons id should not be checked.

da1 == [MyDoc() for _ in range(len(da1))] should also work.
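The list-like behavior described here could be sketched roughly as follows. This is only an illustration, not DocArray code: `sequences_equal` and its `item_eq` parameter are hypothetical names, and the helper assumes both operands support `len()` and iteration.

```python
import operator
from typing import Any, Callable, Sequence


def sequences_equal(
    left: Sequence[Any],
    right: Sequence[Any],
    item_eq: Callable[[Any, Any], bool] = operator.eq,
) -> bool:
    """Hypothetical helper: compare two sequences the way Python compares lists."""
    # Sequences of different lengths are never equal.
    if len(left) != len(right):
        return False
    # Compare element by element, in order; stops at the first mismatch.
    return all(item_eq(x, y) for x, y in zip(left, right))
```

Because only length and element order matter, the right-hand side can be a plain list of documents rather than another DocumentArray, provided the per-document comparison itself ignores id.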

Solution

  • We should not look at the value of id when doing ==.
  • We should call (tensor == tensor).all() for tensor fields.
  • We should implement == at the DocumentArray level as well.
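A minimal sketch of the first two points, written against plain field dictionaries rather than the real BaseDocument class. The names `values_equal` and `fields_equal` are hypothetical, and the shape check is an added assumption (it is not part of the proposal above) to avoid comparing arrays of different shapes.

```python
from typing import Any, Iterable, Mapping


def values_equal(x: Any, y: Any) -> bool:
    """Hypothetical helper: compare two field values, tensors included."""
    # Assumption: array-likes expose .shape; differing shapes mean "not equal".
    if getattr(x, 'shape', None) != getattr(y, 'shape', None):
        return False
    result = x == y
    # Array-like comparisons return element-wise results, so reduce with .all().
    if hasattr(result, 'all'):
        return bool(result.all())
    return bool(result)


def fields_equal(
    a: Mapping[str, Any],
    b: Mapping[str, Any],
    exclude: Iterable[str] = ('id',),
) -> bool:
    """Hypothetical helper: compare two field mappings, ignoring excluded keys."""
    skipped = set(exclude)
    keys_a = set(a) - skipped
    keys_b = set(b) - skipped
    # Documents with different field names are not equal.
    if keys_a != keys_b:
        return False
    return all(values_equal(a[key], b[key]) for key in keys_a)
```

With something like this, two documents holding the same title and an all-zeros tensor would compare equal even when their ids differ, which is the behavior the issue asks for.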

Metadata

Labels

good-first-issue: Suitable as your first contribution to DocArray!

Projects

Status
Done
