I would like to create a typed DataFrame from a Pydantic BaseModel class, let's call it MyModel, that has Optional fields. As I create multiple instances of MyModel, some will have None in their Optional fields, and a DataFrame initialized from such rows may end up with inconsistent column dtypes. I would therefore like to convert Optional[TypeX] to TypeX, e.g.:
import pydantic
import pandas as pd
import numpy as np
from typing import Optional
class MyModel(pydantic.BaseModel):
    thisfield: int
    thatfield: Optional[str]
    ...
col_types = {kk: ff.annotation for kk, ff in MyModel.model_fields.items()}
pd.DataFrame(np.empty(0, dtype=[tuple(tt) for tt in col_types.items()]))
This fails with TypeError: Cannot interpret 'typing.Optional[str]' as a data type.
I need a function or method that maps Optional[X] to X. Any suggestions other than using repr with a regex?
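For reference, here is a minimal sketch of the kind of helper I have in mind, built on typing.get_origin and typing.get_args from the standard library instead of string parsing. The name unwrap_optional is hypothetical, and the sketch assumes that only unions of the form "X or None" should be unwrapped; anything else is returned unchanged.

```python
import types
from typing import Optional, Union, get_args, get_origin


def unwrap_optional(annotation):
    """Return X for Optional[X]; return any other annotation unchanged.

    Hypothetical helper, not part of pydantic or pandas.
    """
    origin = get_origin(annotation)
    # typing.Union is the origin of Optional[X]; types.UnionType is the origin
    # of the `X | None` syntax on Python 3.10+ (fall back to Union if absent).
    union_origins = (Union, getattr(types, "UnionType", Union))
    if origin in union_origins:
        non_none = [arg for arg in get_args(annotation) if arg is not type(None)]
        # Only unwrap when exactly one non-None member remains.
        if len(non_none) == 1:
            return non_none[0]
    return annotation


print(unwrap_optional(Optional[str]))  # <class 'str'>
print(unwrap_optional(int))            # <class 'int'>
```

Assuming ff.annotation still holds Optional[str] for such fields, the dict comprehension above could then call unwrap_optional(ff.annotation) for each field before the result is handed to NumPy.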