Skip to content

bug:postgres source invalid pull_latest_from_table_or_query query output #3804

@david-dest01

Description

@david-dest01

I would submit a PR but I'm on an M1 and had some issues getting the env running. Submitting a proposed solution for the sake of available time.

Expected Behavior

When using PostgreSQLSource, you should be able to use a table successfully. When running feast masterialize [START_DATE] [END_DATE], as a user I would expect the table to be selected on successfully with no errors.

    source=PostgreSQLSource(
        name="source_user_events_v0",
        timestamp_field="created_at",
        table="offline_store.user_events"
    )

Current Behavior

pull_latest_from_table_or_query attempts to pull the latest data using the outputted string from get_table_query_string. However on line 97 of postgres.py the query includes open and close parens assuming an inner query. These params should be removed from the query and handled by get_table_query_string (which it is).

Steps to reproduce

Create a source as part of a feature view. Reference an existing table.

    source=PostgreSQLSource(
        name="source_user_events_v0",
        timestamp_field="created_at",
        table="offline_store.user_events"
    )

execute feast materialize [START_DATE] [END_DATE]

expected output.

root@3b9fc17aa598:/usr/app# feast materialize 2023-10-16T00:00:00  2023-10-18T00:00:00
/usr/local/lib/python3.9/site-packages/feast/repo_config.py:233: RuntimeWarning: `entity_key_serialization_version` is either not specified in the feature_store.yaml, or is specified to a value <= 1.This serialization version may cause errors when trying to write fields with the `Long` data type into the online store. Specifying `entity_key_serialization_version` to 2 is recommended for new projects. 
  warnings.warn(
Materializing 1 feature views from 2023-10-16 00:00:00+00:00 to 2023-10-18 00:00:00+00:00 into the postgres online store.

fv_user_events_v0:
Traceback (most recent call last):
  File "/usr/local/bin/feast", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/feast/cli.py", line 546, in materialize_command
    store.materialize(
  File "/usr/local/lib/python3.9/site-packages/feast/usage.py", line 299, in wrapper
    raise exc.with_traceback(traceback)
  File "/usr/local/lib/python3.9/site-packages/feast/usage.py", line 288, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/feast/feature_store.py", line 1395, in materialize
    provider.materialize_single_feature_view(
  File "/usr/local/lib/python3.9/site-packages/feast/infra/passthrough_provider.py", line 254, in materialize_single_feature_view
    raise e
  File "/usr/local/lib/python3.9/site-packages/feast/infra/materialization/local_engine.py", line 156, in _materialize_one
    table = offline_job.to_arrow()
  File "/usr/local/lib/python3.9/site-packages/feast/infra/offline_stores/offline_store.py", line 122, in to_arrow
    return self._to_arrow_internal(timeout=timeout)
  File "/usr/local/lib/python3.9/site-packages/feast/infra/offline_stores/contrib/postgres_offline_store/postgres.py", line 282, in _to_arrow_internal
    cur.execute(query)
psycopg2.errors.SyntaxError: syntax error at or near ")"
LINE 8:                 FROM (offline_store.user_events) a
                                                       ^

root@3b9fc17aa598:/usr/app# 

Specifications

  • Version: 0.34.1
  • Platform: Linux
  • Subsystem: Ubuntu 20.04

Possible Solution

Update this query from:

        query = f"""
            SELECT
                {b_field_string}
                {f", {repr(DUMMY_ENTITY_VAL)} AS {DUMMY_ENTITY_ID}" if not join_key_columns else ""}
            FROM (
                SELECT {a_field_string},
                ROW_NUMBER() OVER({partition_by_join_key_string} ORDER BY {timestamp_desc_string}) AS _feast_row
                FROM ({from_expression}) a
                WHERE a."{timestamp_field}" BETWEEN '{start_date}'::timestamptz AND '{end_date}'::timestamptz
            ) b
            WHERE _feast_row = 1
            """

to:

        query = f"""
            SELECT
                {b_field_string}
                {f", {repr(DUMMY_ENTITY_VAL)} AS {DUMMY_ENTITY_ID}" if not join_key_columns else ""}
            FROM (
                SELECT {a_field_string},
                ROW_NUMBER() OVER({partition_by_join_key_string} ORDER BY {timestamp_desc_string}) AS _feast_row
                FROM {from_expression} a
                WHERE a."{timestamp_field}" BETWEEN '{start_date}'::timestamptz AND '{end_date}'::timestamptz
            ) b
            WHERE _feast_row = 1
            """

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions