The feast server crashes when starting multiple workers with SQL registry #5784

@daseindev1

Expected Behavior

Running the Feast feature server with multiple Gunicorn workers (e.g. feast serve -w 4) should work reliably with a SQL Registry backend, including periodic registry refresh, without raising SQLAlchemy errors.

Current Behavior

When the feature server is started with multiple workers (e.g. feast serve -w 4), the background registry refresh thread intermittently crashes with:

feast-serve-1  | INFO:fastapi:Auth type: AuthManagerType.NONE
feast-serve-1  | [2025-11-24 03:58:45 +0000] [1] [INFO] Starting gunicorn 23.0.0
feast-serve-1  | [2025-11-24 03:58:45 +0000] [1] [INFO] Listening at: http://0.0.0.0:6556 (1)
feast-serve-1  | [2025-11-24 03:58:45 +0000] [1] [INFO] Using worker: uvicorn_worker.UvicornWorker
feast-serve-1  | /usr/local/lib/python3.12/site-packages/gunicorn/arbiter.py:592: DeprecationWarning: This process (pid=1) is multi-threaded, use of fork() may lead to deadlocks in the child.
feast-serve-1  |   pid = os.fork()
feast-serve-1  | [2025-11-24 03:58:45 +0000] [31] [INFO] Booting worker with pid: 31
feast-serve-1  | /usr/local/lib/python3.12/site-packages/websockets/legacy/__init__.py:6: DeprecationWarning: websockets.legacy is deprecated; see https://websockets.readthedocs.io/en/stable/howto/upgrade.html for upgrade instructions
feast-serve-1  |   warnings.warn(  # deprecated in 14.0 - 2024-11-09
feast-serve-1  | /usr/local/lib/python3.12/site-packages/uvicorn/protocols/websockets/websockets_impl.py:17: DeprecationWarning: websockets.server.WebSocketServerProtocol is deprecated
feast-serve-1  |   from websockets.server import WebSocketServerProtocol
feast-serve-1  | [2025-11-24 03:58:45 +0000] [31] [INFO] Started server process [31]
feast-serve-1  | [2025-11-24 03:58:45 +0000] [31] [INFO] Waiting for application startup.
feast-serve-1  | [2025-11-24 03:58:45 +0000] [31] [INFO] Application startup complete.
feast-serve-1  | [2025-11-24 03:58:45 +0000] [33] [INFO] Booting worker with pid: 33
feast-serve-1  | /usr/local/lib/python3.12/site-packages/websockets/legacy/__init__.py:6: DeprecationWarning: websockets.legacy is deprecated; see https://websockets.readthedocs.io/en/stable/howto/upgrade.html for upgrade instructions
feast-serve-1  |   warnings.warn(  # deprecated in 14.0 - 2024-11-09
feast-serve-1  | /usr/local/lib/python3.12/site-packages/uvicorn/protocols/websockets/websockets_impl.py:17: DeprecationWarning: websockets.server.WebSocketServerProtocol is deprecated
feast-serve-1  |   from websockets.server import WebSocketServerProtocol
feast-serve-1  | [2025-11-24 03:58:45 +0000] [33] [INFO] Started server process [33]
feast-serve-1  | [2025-11-24 03:58:45 +0000] [33] [INFO] Waiting for application startup.
feast-serve-1  | [2025-11-24 03:58:45 +0000] [33] [INFO] Application startup complete.
feast-serve-1  | [2025-11-24 03:58:45 +0000] [35] [INFO] Booting worker with pid: 35
feast-serve-1  | /usr/local/lib/python3.12/site-packages/websockets/legacy/__init__.py:6: DeprecationWarning: websockets.legacy is deprecated; see https://websockets.readthedocs.io/en/stable/howto/upgrade.html for upgrade instructions
feast-serve-1  |   warnings.warn(  # deprecated in 14.0 - 2024-11-09
feast-serve-1  | /usr/local/lib/python3.12/site-packages/uvicorn/protocols/websockets/websockets_impl.py:17: DeprecationWarning: websockets.server.WebSocketServerProtocol is deprecated
feast-serve-1  |   from websockets.server import WebSocketServerProtocol
feast-serve-1  | [2025-11-24 03:58:45 +0000] [35] [INFO] Started server process [35]
feast-serve-1  | [2025-11-24 03:58:45 +0000] [35] [INFO] Waiting for application startup.
feast-serve-1  | [2025-11-24 03:58:45 +0000] [35] [INFO] Application startup complete.
feast-serve-1  | [2025-11-24 03:58:45 +0000] [36] [INFO] Booting worker with pid: 36
feast-serve-1  | /usr/local/lib/python3.12/site-packages/websockets/legacy/__init__.py:6: DeprecationWarning: websockets.legacy is deprecated; see https://websockets.readthedocs.io/en/stable/howto/upgrade.html for upgrade instructions
feast-serve-1  |   warnings.warn(  # deprecated in 14.0 - 2024-11-09
feast-serve-1  | /usr/local/lib/python3.12/site-packages/uvicorn/protocols/websockets/websockets_impl.py:17: DeprecationWarning: websockets.server.WebSocketServerProtocol is deprecated
feast-serve-1  |   from websockets.server import WebSocketServerProtocol
feast-serve-1  | [2025-11-24 03:58:45 +0000] [36] [INFO] Started server process [36]
feast-serve-1  | [2025-11-24 03:58:45 +0000] [36] [INFO] Waiting for application startup.
feast-serve-1  | [2025-11-24 03:58:45 +0000] [36] [INFO] Application startup complete.
feast-serve-1  | Exception in thread Thread-18:
feast-serve-1  | Traceback (most recent call last):
feast-serve-1  |   File "/usr/local/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
feast-serve-1  |     self.run()
feast-serve-1  |   File "/usr/local/lib/python3.12/threading.py", line 1433, in run
feast-serve-1  |     self.function(*self.args, **self.kwargs)
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/feast/feature_server.py", line 209, in async_refresh
feast-serve-1  |     registry_proto = store.registry.proto()
feast-serve-1  |                      ^^^^^^^^^^^^^^^^^^^^^^
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/sql.py", line 878, in proto
feast-serve-1  |     process_project(project)
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/sql.py", line 861, in process_project
feast-serve-1  |     objs: List[Any] = lister(project_name, allow_cache=False)  # type: ignore
feast-serve-1  |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/caching_registry.py", line 207, in list_on_demand_feature_views
feast-serve-1  |     return self._list_on_demand_feature_views(project, tags)
feast-serve-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/sql.py", line 646, in _list_on_demand_feature_views
feast-serve-1  |     return self._list_objects(
feast-serve-1  |            ^^^^^^^^^^^^^^^^^^^
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/feast/infra/registry/sql.py", line 1094, in _list_objects
feast-serve-1  |     rows = conn.execute(stmt).all()
feast-serve-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/result.py", line 1384, in all
feast-serve-1  |     return self._allrows()
feast-serve-1  |            ^^^^^^^^^^^^^^^
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/result.py", line 546, in _allrows
feast-serve-1  |     make_row = self._row_getter
feast-serve-1  |                ^^^^^^^^^^^^^^^^
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/sqlalchemy/util/langhelpers.py", line 1338, in __get__
feast-serve-1  |     obj.__dict__[self.__name__] = result = self.fget(obj)
feast-serve-1  |                                            ^^^^^^^^^^^^^^
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/result.py", line 471, in _row_getter
feast-serve-1  |     key_to_index = metadata._key_to_index
feast-serve-1  |                    ^^^^^^^^^^^^^^^^^^^^^^
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/cursor.py", line 1366, in _key_to_index
feast-serve-1  |     self._we_dont_return_rows()
feast-serve-1  |   File "/usr/local/lib/python3.12/site-packages/sqlalchemy/engine/cursor.py", line 1346, in _we_dont_return_rows
feast-serve-1  |     raise exc.ResourceClosedError(
feast-serve-1  | sqlalchemy.exc.ResourceClosedError: This result object does not return rows. It has been closed automatically.

This happens during the periodic registry refresh (async_refresh) when it calls store.registry.proto(), which triggers SQL registry list operations (e.g. listing OnDemandFeatureViews).

Relevant code paths:

Specs:

  • OS: Red Hat 8.1
  • Python 3.12.11
  • Gunicorn 23.0.0
  • feast 0.55.1
  • SQLAlchemy 2.0.44
  • psycopg2-binary 2.9.10

Steps to reproduce

  1. Configure Feast to use the SQL Registry, e.g. (from docs):
    https://github.com/feast-dev/feast/blob/master/docs/reference/registries/sql.md
project: feature_repo
provider: local  # Provider is local because we are defining all stores manually

registry:
  registry_type: sql
  # Use a dedicated schema 'feast' to avoid naming conflicts with PostgreSQL system tables
  path: postgresql+psycopg2://postgres:postgres@postgres:5432/feast?options=-c%20search_path=feast

online_store:
  type: redis
  connection_string: "redis:6379"

offline_store:
  type: file

entity_key_serialization_version: 3
  2. Start the feature server with multiple workers:
    feast --feature-store-yaml feature_store.yaml serve --host 0.0.0.0 --port 18080 -w 4

  3. Wait until the workers have been spawned.

  4. Observe sqlalchemy.exc.ResourceClosedError thrown in the background refresh thread.

  5. (Control) Starting with a single worker (-w 1) does not reproduce the error.

Possible Solution

This looks like a fork-safety / lifecycle issue. The FeatureStore/SqlRegistry (and its SQLAlchemy Engine and connection pool state) is created before Gunicorn forks its workers. Each worker then runs a background thread (threading.Timer) that uses the inherited engine and pool, which leads to invalid or closed DBAPI cursor/result state.
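
To illustrate why an inherited connection is a problem, here is a minimal standard-library sketch. It does not use Feast or SQLAlchemy: a socket pair stands in for a database connection opened before the fork, and the function name messages_seen_by_server is made up for this example. It only shows that parent and child end up writing to the same underlying connection, which is the kind of sharing suspected above. It assumes a Unix-like OS where os.fork() is available.

```python
import os
import socket
from typing import List


def messages_seen_by_server() -> List[bytes]:
    """Fork once and return what the 'server' end of a pre-fork socket receives."""
    # Stand-in for a database connection opened before the fork. Datagram mode
    # keeps message boundaries, so each send() arrives as one separate message.
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

    pid = os.fork()
    if pid == 0:
        # Child process: it writes to the very same socket the parent holds.
        client.send(b"query from child")
        os._exit(0)

    # Parent process: wait for the child so the message order is deterministic.
    os.waitpid(pid, 0)
    client.send(b"query from parent")

    # The single "server" sees traffic from both processes on one connection.
    received = [server.recv(1024), server.recv(1024)]
    client.close()
    server.close()
    return received


if __name__ == "__main__":
    print(messages_seen_by_server())
```

With a real database driver, the workers would interleave protocol messages on one connection in the same way, so a worker can read a reply that belongs to another worker's statement.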

Potential fixes:

  • Create the FeatureStore/SqlRegistry inside each worker after fork, rather than constructing it in the Gunicorn master and passing it into the app (i.e., avoid pre-fork engine creation).
  • Alternatively, dispose/recreate SQLAlchemy engines after fork (e.g. via a Gunicorn post_fork hook) so each worker has a fresh pool.
  • Documentation note / guardrail: warn against multi-worker mode with SQL Registry unless post-fork initialization is used.
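
The first option (create the resource inside each worker) could be sketched roughly as follows. This is a standard-library illustration only, not Feast's code: PerProcess is a hypothetical helper, and in a real setup the factory passed to it would be whatever builds the FeatureStore or SQLAlchemy engine. The helper builds the resource lazily and rebuilds it when it notices it is running under a different process ID than the one that built it.

```python
import os
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class PerProcess(Generic[T]):
    """Hypothetical helper: builds a resource lazily, once per process ID."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._owner_pid: Optional[int] = None
        self._value: Optional[T] = None
        if hasattr(os, "register_at_fork"):
            # A lock copied while another thread held it would never be
            # released in the child, so give the child a fresh one.
            os.register_at_fork(after_in_child=self._replace_lock)

    def _replace_lock(self) -> None:
        self._lock = threading.Lock()

    def get(self) -> T:
        """Return this process's resource, building it on first use."""
        with self._lock:
            if self._owner_pid != os.getpid():
                # First call in this process (or first call after a fork):
                # build a fresh resource instead of reusing the inherited one.
                self._value = self._factory()
                self._owner_pid = os.getpid()
            return self._value  # type: ignore[return-value]


if __name__ == "__main__":
    # object() stands in for an engine or registry client.
    holder = PerProcess(lambda: object())
    print(holder.get() is holder.get())
```

The sketch replaces an inherited value but does not close it. For a real connection pool, the library's own guidance on discarding inherited connections in a child process should be followed. For the second option, SQLAlchemy documents Engine.dispose(close=False) for that purpose. If nothing calls get() before the fork, each worker simply builds its own resource on first use.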
