
Access documents by id will throw an error when setting the BaseDoc field type to "Dict" #1690

@KenyonY

Initial Checks

  • I have read and followed the docs and still think this is a bug

Description

When running this (reference: access-documents-by-id):

import numpy as np
from docarray import BaseDoc, DocList
from docarray.index import InMemoryExactNNIndex
from docarray.typing import NdArray
from typing import Dict, Optional


class TextDoc(BaseDoc):
    embedding: NdArray[128]
    text: str
    metadata: Optional[Dict]


db = InMemoryExactNNIndex[TextDoc]()

data = DocList[TextDoc](
    TextDoc(id=str(i),
            embedding=np.random.rand(128),
            text=f'query {i}',
            metadata={"data": f"metadata_{i}"},
            )
    for i in range(3)
)

db.index(data)

doc = db['0']  # get doc by id
print(doc)

I got this error:

Traceback (most recent call last):
  File "/mnt/d/github/vector-search/Examples/issue_access_by_id.py", line 27, in <module>
    doc = db['0']  # get doc by id
  File "/home/kunyuan/miniconda3/envs/py10/lib/python3.10/site-packages/docarray/index/abstract.py", line 365, in __getitem__
    if issubclass(type_, AnyDocArray) and isinstance(doc_sequence[0], Dict):
  File "/home/kunyuan/miniconda3/envs/py10/lib/python3.10/abc.py", line 123, in __subclasscheck__
    return _abc_subclasscheck(cls, subclass)
TypeError: issubclass() arg 1 must be a class

After a preliminary investigation, I found that everything works fine if the metadata field is not annotated as Dict.
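
The traceback suggests why: typing.Dict is a generic alias rather than a real class, so passing it as the first argument to issubclass() raises TypeError. The snippet below is a minimal sketch of that behavior using only the standard library. Container is a hypothetical stand-in for AnyDocArray, and safe_issubclass is an illustrative helper, not docarray's actual code or fix.

```python
from abc import ABC
from typing import Dict, Optional


class Container(ABC):
    """Hypothetical stand-in for docarray's AnyDocArray (assumption)."""


class MyContainer(Container):
    """A real subclass, used to show the normal case still works."""


def safe_issubclass(tp, base) -> bool:
    """Return False instead of raising when tp is not a real class.

    Illustrative helper only; not part of docarray.
    """
    try:
        return issubclass(tp, base)
    except TypeError:
        return False


# A bare issubclass() call on typing.Dict fails, as in the traceback above.
try:
    issubclass(Dict, Container)
    raised = False
except TypeError:
    raised = True

print(raised)                                     # bare call raised TypeError
print(safe_issubclass(Dict, Container))           # guarded call does not raise
print(safe_issubclass(Optional[Dict], Container)) # same for Optional[Dict]
print(safe_issubclass(MyContainer, Container))    # real classes behave as usual
```

A guard of this kind around the issubclass() check in __getitem__ might avoid the error, though I have not verified that against the docarray source.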

Python, Pydantic & OS Version

docarray: 0.35.0
python: python 3.10.6
