Incorrect path in file data source to s3 bucket #4873

@nboyarkin

Description


Hi there! I was trying to implement a feature store: I created a repo, added feature.parquet to the S3 bucket, ran 'feast apply', and then tried to fetch features in a notebook via the feature store. But it seems to be bugged.

Expected Behavior

A DataFrame with the requested features.

Current Behavior

Error:
An error occurred while calling the read_parquet method registered to the pandas backend.
Original Message: [WinError 123] Failed querying information for path 'c:/Users/nboyarkin/Downloads/scm_forecast-1/notebooks/s3:/analytics-ds-dev-spark-upload-files/features/year.parquet'.
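
The mangled path in the error suggests that the s3:// URI is being treated as a relative local path and joined onto the notebook's working directory. A minimal sketch of the underlying pathlib behavior (bucket and directory names taken from the error message above; this only illustrates the path handling, not Feast itself):

```python
from pathlib import PureWindowsPath

# An s3:// URI has no drive letter and no root, so pathlib
# considers it a *relative* path under Windows path rules.
uri = "s3://analytics-ds-dev-spark-upload-files/features/year.parquet"
print(PureWindowsPath(uri).is_absolute())  # False

# Joining it onto the repo directory reproduces the broken path
# from the error message (note the "s3://" collapsed to "s3:/").
repo = PureWindowsPath("c:/Users/nboyarkin/Downloads/scm_forecast-1/notebooks")
print(repo / uri)
```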

Steps to reproduce

from datetime import datetime

import pandas as pd
from feast import FeatureStore

fs = FeatureStore(fs_yaml_file='C:/Users/nboyarkin/Downloads/feast.yaml')

entity_df = pd.DataFrame.from_dict(
    {
        # entity's join key -> entity values
        "store_id": [12],
        "product_id": [27279],
        # "event_timestamp" (reserved key) -> timestamps
        "event_timestamp": [
            datetime(2024, 11, 1),
        ],
    }
)

training_df = fs.get_historical_features(
    entity_df=entity_df,
    features=[
        "calendar_stats:year",
    ],
).to_df()

Specifications

  • Version: 0.42.0
  • Platform: Windows
  • Subsystem:

Possible Solution

In feast\infra\offline_stores\dask.py, line 529, change:
if not Path(data_source.path).is_absolute():
to
if not Path(data_source.path).is_absolute() and Path(data_source.path).parts[0] != 's3:':
so that S3 URIs are no longer resolved against the repo directory.
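
A slightly more general variant of the same idea would skip any path that carries a known remote scheme, not just s3. This is only a sketch, untested against the Feast codebase; is_local_relative and the scheme allowlist are my own names, with path standing in for data_source.path:

```python
from pathlib import Path
from urllib.parse import urlparse

# Schemes that should never be resolved against the repo directory.
# An explicit allowlist also avoids misreading a Windows drive letter
# ("c:/...") as a URL scheme.
_REMOTE_SCHEMES = ("s3", "s3a", "gs", "file", "http", "https")

def is_local_relative(path: str) -> bool:
    """True only for genuinely relative local paths."""
    if urlparse(path).scheme in _REMOTE_SCHEMES:
        return False
    return not Path(path).is_absolute()
```

With this helper, the line 529 check would become `if is_local_relative(data_source.path):`.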
