Here is a very naive approach to such a scenario. Our assumptions are:
- Your vector store initially contains N documents, and any query for which relevant context exists is cached.
- When the vector store is updated (i.e., the document count changes), you want the cached items to be updated as well.
```python
from typing import Any, Dict, Optional

from langchain_core.caches import RETURN_VAL_TYPE, BaseCache
from langchain_core.outputs import Generation


class MetadataAwareCache(BaseCache):
    def __init__(self, doc_count: int):
        super().__init__()
        self._cache: Dict[tuple, Dict[str, Any]] = {}
        # Initially set a document count for reference, so we can detect
        # when the vector store has changed.
        self._doc_count = doc_count

    def update_document_count(self, doc_count: int):
        if self._doc_count == doc_count:
            return
        self._doc_count = doc_count
        # Iterate over a snapshot of the entries so update() can safely
        # overwrite them while we loop.
        for (prompt, llm_string), _old_entry in list(self._cache.items()):
            response, metadata = self.regenerate_cache(prompt, llm_string)
            self.update(prompt, llm_string, response, metadata)

    def regenerate_cache(self, prompt: str, llm_string: str):
        # Stub: re-run your chain against the updated vector store here.
        # Cached values must be a sequence of Generation objects.
        response = [Generation(text="New LLM Response")]
        metadata: Dict[str, Any] = {}
        return response, metadata

    # Cache lookup
    def lookup(self, prompt: str, llm_string: str) -> Optional[RETURN_VAL_TYPE]:
        cache_entry = self._cache.get((prompt, llm_string))
        if cache_entry:
            return cache_entry["value"]
        return None

    # Update cache
    def update(
        self,
        prompt: str,
        llm_string: str,
        value: RETURN_VAL_TYPE,
        metadata: Optional[Dict[str, Any]] = None,
    ) -> None:
        self._cache[(prompt, llm_string)] = {
            "value": value,
            "metadata": metadata or {},
        }

    # BaseCache declares clear() as abstract, so it must be implemented.
    def clear(self, **kwargs: Any) -> None:
        self._cache.clear()
```
To use the cache:

```python
from langchain.globals import set_llm_cache

# doc_count is a required argument: seed it with the current size
# of your vector store.
cache = MetadataAwareCache(doc_count=initial_doc_count)
set_llm_cache(cache)
```
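Then, whenever the vector store changes, report the new count so stale entries get regenerated. A minimal sketch; `vector_store`, `new_docs`, and the `get_doc_count()` helper are placeholders for your own setup:

```python
# Hypothetical helper: replace the body with however your vector store
# reports its current size (APIs differ between implementations).
def get_doc_count(vector_store) -> int:
    return vector_store._collection.count()  # e.g. works for Chroma

vector_store.add_documents(new_docs)                       # the store grows...
cache.update_document_count(get_doc_count(vector_store))   # ...so refresh the cache
```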
Considerations:
- If you are worried about performance and the cost of regenerating the cache, consider how frequently new information is likely to be added.
- If updates are few and far between, it may be cheaper to simply expire the cache than to pay the cost of rebuilding entries that might never be reused.
- If updates are more frequent, you could run the refresh as a scheduled process (see the first sketch after this list).
- Alternatively, you can attach more metadata to each document and re-create only the cache entries that fall into the affected category (see the second sketch after this list).
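For the scheduled-process option, here is a minimal sketch using only the standard library. It reuses the hypothetical `get_doc_count()` helper from above, and the 300-second interval is an arbitrary choice:

```python
import threading


def refresh_cache_periodically(cache, vector_store, interval_sec: float = 300.0):
    """Poll the vector store's size and refresh the cache when it changes."""
    def _tick():
        # update_document_count() is a no-op if the count is unchanged.
        cache.update_document_count(get_doc_count(vector_store))
        # Re-arm the timer; daemon=True so it won't block interpreter exit.
        timer = threading.Timer(interval_sec, _tick)
        timer.daemon = True
        timer.start()

    _tick()
```

For the metadata-based option, one possible sketch (not part of the class above): record which documents each cached answer relied on in the entry's metadata, then regenerate only the entries that touched the updated documents. The `updated_doc_ids` argument and the `source_doc_ids` metadata key are assumptions; you would populate them when calling `update()`:

```python
def invalidate_for_docs(cache: MetadataAwareCache, updated_doc_ids: set) -> None:
    # Assumes update() was called with metadata={"source_doc_ids": {...}},
    # a set of the document IDs each cached response was built from.
    for (prompt, llm_string), entry in list(cache._cache.items()):
        sources = entry["metadata"].get("source_doc_ids", set())
        if sources & updated_doc_ids:
            response, metadata = cache.regenerate_cache(prompt, llm_string)
            cache.update(prompt, llm_string, response, metadata)
```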
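This selective approach trades a little bookkeeping at write time for much cheaper invalidation: only entries whose source documents actually changed pay the regeneration cost.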