Description
Expected Behavior
When using SQLRegistry with cache enabled, and a query is run with allow_cache=True, the registry should refresh the cache (if necessary) and return the results with minimal latency.
Current Behavior
When the cache expires, any method call with allow_cache=True hangs indefinitely and has to be terminated with Ctrl+C. The stack trace shows that it always gets stuck on line 433, while acquiring the refresh lock:
File ~/mambaforge/envs/nudgerank/lib/python3.12/site-packages/feast/infra/registry/caching_registry.py:433, in CachingRegistry._refresh_cached_registry_if_necessary(self)
    431 def _refresh_cached_registry_if_necessary(self):
    432     if self.cache_mode == "sync":
--> 433         with self._refresh_lock:
    434             if self.cached_registry_proto == RegistryProto():
    435                 # Avoids the need to refresh the registry when cache is not populated yet
    436                 # Specially during the __init__ phase
    437                 # proto() will populate the cache with project metadata if no objects are registered
    438                 expired = False
This also happens when calling refresh(), which means that the registry can never be refreshed, and the cache is not usable at all. Only allow_cache=False works, which adds latency to every query.
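One pattern that would produce a hang at exactly this kind of `with self._refresh_lock:` line is a non-reentrant lock being acquired a second time by the thread that already holds it. Whether this is what happens inside Feast is an assumption and has not been verified against the library source. The sketch below uses a hypothetical `ToyRegistry` class (not Feast's `CachingRegistry`) and acquires the lock with a timeout so that the demonstration returns instead of blocking forever.

```python
import threading


class ToyRegistry:
    """Hypothetical stand-in used only to illustrate lock re-entrancy.

    This is NOT Feast's CachingRegistry; the method names are illustrative.
    """

    def __init__(self, lock):
        self._refresh_lock = lock
        self.refresh_count = 0

    def refresh(self):
        # Assumed for illustration: refresh() re-enters the guarded section.
        return self._refresh_if_necessary(nested=True)

    def _refresh_if_necessary(self, nested=False):
        # A timeout replaces the plain `with lock:` so the demo cannot hang.
        if not self._refresh_lock.acquire(timeout=0.2):
            return False  # Could not get the lock: this is the "hang" case.
        try:
            if not nested:
                # The outer call holds the lock and triggers a nested refresh.
                return self.refresh()
            self.refresh_count += 1
            return True
        finally:
            self._refresh_lock.release()


# threading.Lock is not reentrant: the nested acquire times out.
plain_result = ToyRegistry(threading.Lock())._refresh_if_necessary()

# threading.RLock lets the owning thread acquire again: the refresh completes.
reentrant_result = ToyRegistry(threading.RLock())._refresh_if_necessary()

print(plain_result, reentrant_result)
```

In this toy, the plain lock yields `False` (the nested acquire never succeeds) while the reentrant lock yields `True`. If the real code path has the same shape, switching to a reentrant lock or moving the nested call outside the locked region would be candidate fixes.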
Steps to reproduce
Set up an SQLRegistry and enable caching. Then, run refresh() or any other method with allow_cache=True.
import time

from feast import Entity, FeatureStore, RepoConfig

# Set up Feature Store with SQLRegistry
repo_config = RepoConfig(
    project="my_project",
    registry={
        "registry_type": "sql",
        "path": "<db_url>",
        # Set short TTL for testing.
        "cache_ttl_seconds": 5,
    },
    ...
)
feature_store = FeatureStore(config=repo_config)

# Create and register entity
driver = Entity(name="driver", join_keys=["driver_id"])
feature_store.apply(driver)

# Get entity, and use cache. This succeeds if it is run before the TTL.
feature_store.get_entity("driver", True)

# Let cache expire
time.sleep(6)

# This runs forever
feature_store.get_entity("driver", True)

# Any of these also run forever
feature_store.refresh_registry()
feature_store.get_*(..., True)

Specifications
- Version: 0.40.1
- Platform: macOS 14.6.1
- Subsystem:
Possible Solution
This only seems to happen when the cache is refreshed synchronously. Async cache refresh (cache_mode="thread") does not seem to run into this issue, and the cache is refreshed successfully (e.g. when entity definition is modified and get_entity("driver", True) is run again after the TTL). So the current workaround is to specify cache_mode="thread" in the registry config.
However, there are use cases where we want to guarantee that the cache is refreshed after some changes before executing more code, so synchronous refresh is still useful.
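The workaround described above can be sketched as a registry config that adds `cache_mode` to the keys from the reproduction steps. The helper name `build_registry_config` is hypothetical and exists only for illustration; the keys and the `"thread"` value are taken from this report, and `<db_url>` remains a placeholder.

```python
def build_registry_config(db_url, cache_ttl_seconds=5, cache_mode="thread"):
    """Return a registry config dict for RepoConfig(registry=...).

    Hypothetical helper: only the dict keys come from the report above.
    """
    return {
        "registry_type": "sql",
        "path": db_url,
        "cache_ttl_seconds": cache_ttl_seconds,
        # Workaround from this report: refresh the cache asynchronously
        # instead of using the synchronous mode that hangs.
        "cache_mode": cache_mode,
    }


registry_config = build_registry_config("<db_url>")
print(registry_config)
```

The resulting dict would be passed as `registry=registry_config` in the `RepoConfig` call from the reproduction steps. Note that this trades away the guarantee that the cache is fresh before the next call returns, which is the limitation described above.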