
to_ray_dataset as a first-class method for retrieval_job #5568

@ntkathole

We can probably make it a first-class method for retrieval_job in the future.

Originally posted by @HaoXuAI in #5526 (comment)


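from feast import FeatureStore

store = FeatureStore(".")

# Get historical features as a Ray Dataset.
# entity_df is assumed to be an existing entity DataFrame.
ds = store.get_historical_features(
    entity_df=entity_df,
    features=["embeddings:embedding"],
).to_ray_dataset()  # ← Proposed new method

# Now use the Ray Data API directly.
# custom_processing and ModelInference are user-defined batch UDFs.
ds = ds.map_batches(custom_processing)
ds = ds.repartition(100)
# Depending on the Ray version, a class-based UDF may also need a concurrency setting.
predictions = ds.map_batches(ModelInference, num_gpus=1)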

# Proposed API (signatures only)
from datetime import datetime
from typing import Optional

import ray.data
from feast import FeatureView
from feast.data_source import DataSource


class FeatureStore:

    def to_ray_dataset(
        self,
        source: Optional[DataSource] = None,
        feature_view: Optional[FeatureView] = None,
        start_date: Optional[datetime] = None,
        end_date: Optional[datetime] = None,
    ) -> ray.data.Dataset:
        """Convert a Feast data source or feature view to a Ray Dataset."""
        ...


class RetrievalJob:

    def to_ray_dataset(self) -> ray.data.Dataset:
        """Convert the retrieval result to a Ray Dataset instead of Pandas."""
        ...
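
Until a native method exists, the RetrievalJob variant could be approximated by going through Arrow. The sketch below is only an illustration, not the proposed implementation: `retrieval_job_to_ray_dataset` is a hypothetical helper name, it assumes the job exposes `to_arrow()` (as Feast retrieval jobs do), and it defaults to `ray.data.from_arrow`. The `from_arrow` parameter is an assumption added so the helper can be tried without Ray installed.

```python
from typing import Any, Callable, Optional


def retrieval_job_to_ray_dataset(
    job: Any,
    from_arrow: Optional[Callable[[Any], Any]] = None,
) -> Any:
    """Hypothetical helper: convert a retrieval job's result to a Ray Dataset.

    Assumes `job` has a `to_arrow()` method returning a pyarrow.Table.
    `from_arrow` defaults to `ray.data.from_arrow`; it is injectable only
    so the helper can be exercised without Ray.
    """
    if from_arrow is None:
        # Imported lazily so Ray stays an optional dependency.
        import ray.data

        from_arrow = ray.data.from_arrow
    # Materialize the result as an Arrow table and hand it to the converter.
    return from_arrow(job.to_arrow())
```

Note that this route materializes the whole result as a single Arrow table before Ray sees it, so a native to_ray_dataset would presumably aim to avoid that step for large retrievals.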
