Proposed
Analytics workloads exhibit predictable patterns that can be exploited for performance gains:
- Repeated Analysis: Users often run similar queries on same datasets
- Incremental Exploration: Drill-down queries share base filters
- Schema Introspection: Dataset profiles are requested multiple times
- Visualization Regeneration: Same charts requested with minor variations
| Issue | Impact | Frequency |
|---|---|---|
| Re-parsing large CSVs | 5-30s delay | Every query |
| Re-executing identical queries | Wasted compute | Common |
| Re-transferring unchanged results | Token waste | Every round-trip |
| Re-generating same visualizations | CPU + time | Chart refreshes |
| Cold starts on dataset access | Latency | First access |
| Metric | Current | Target | Improvement |
|---|---|---|---|
| Repeated query latency | Full execution | <100ms | 100x |
| Dataset re-parse time | 5-30s | 0s (cached) | ∞ |
| Cache hit ratio | 0% | >60% | New capability |
| Memory overhead | N/A | <1GB | Bounded |
Implement a multi-tier caching system with query optimization, result caching, and intelligent prefetching.
┌─────────────────────────────────────────────────────────────────────────────┐
│ Multi-Tier Caching System │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ TIER 1: IN-MEMORY CACHE (L1) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ • Dataset schemas │ │
│ │ • Parsed dataframes (polars LazyFrames) │ │
│ │ • Query execution plans │ │
│ │ TTL: 5-15 minutes Max Size: 512MB Eviction: LRU │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ TIER 2: LOCAL DISK CACHE (L2) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ • Query results (Parquet format) │ │
│ │ • Serialized dataframes │ │
│ │ • Visualization outputs │ │
│ │ TTL: 1-24 hours Max Size: 5GB Eviction: LFU │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ TIER 3: QUERY PLAN CACHE │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ • Optimized execution plans │ │
│ │ • Compiled SQL expressions │ │
│ │ • Predicate pushdown results │ │
│ │ TTL: Session-based Max Size: 100MB Eviction: FIFO │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
def generate_cache_key(query_spec: dict) -> str:
"""
Deterministic cache key generation
"""
components = [
# File checksums (mtime + size based)
hash_files(query_spec['files']),
# Normalized query
normalize_query(query_spec['query']),
# Engine and options
query_spec.get('engine', 'auto'),
query_spec.get('return_limit', 100),
# Result format (different formats = different cache entries)
query_spec.get('return_format', 'json'),
]
return hashlib.sha256(
'|'.join(str(c) for c in components).encode()
).hexdigest()[:32]
def hash_files(file_paths: list) -> str:
"""
Fast file fingerprint using mtime and size
"""
fingerprints = []
for path in sorted(file_paths):
stat = os.stat(path)
fingerprints.append(f"{path}:{stat.st_mtime}:{stat.st_size}")
return hashlib.sha256('|'.join(fingerprints).encode()).hexdigest()[:16]@dataclass
class DatasetCacheEntry:
file_path: str
fingerprint: str # mtime + size hash
schema: Schema
row_count: int
column_stats: Dict[str, ColumnStats]
sample_data: pd.DataFrame
parsed_lazyframe: Optional[pl.LazyFrame]
cached_at: datetime
access_count: int
last_accessed: datetime
@property
def is_stale(self) -> bool:
# Check if file has been modified
current_stat = os.stat(self.file_path)
current_fingerprint = f"{current_stat.st_mtime}:{current_stat.st_size}"
return current_fingerprint != self.fingerprint┌─────────────────────────────────────────────────────────────────┐
│ Query Optimization Flow │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Input Query │
│ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ 1. Parse & Normalize │ • Standardize SQL/expr syntax │
│ └──────────┬──────────┘ • Extract predicates │
│ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ 2. Check L1 Cache │ • Exact match? Return cached │
│ └──────────┬──────────┘ • Partial match? Use as base │
│ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ 3. Optimize Plan │ • Predicate pushdown │
│ └──────────┬──────────┘ • Column pruning │
│ │ • Partition elimination │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ 4. Execute │ • Use cached dataset if available │
│ └──────────┬──────────┘ • Stream if needed │
│ │ │
│ ▼ │
│ ┌─────────────────────┐ │
│ │ 5. Cache Result │ • Store in L1 (small results) │
│ └──────────┬──────────┘ • Store in L2 (large results) │
│ │ • Update metadata │
│ ▼ │
│ Output Result │
│ │
└─────────────────────────────────────────────────────────────────┘
-- Original Query
SELECT category, SUM(revenue)
FROM sales
WHERE date >= '2024-01-01'
GROUP BY category;
-- Optimized (Polars LazyFrame)
(df
.lazy()
.filter(pl.col('date') >= '2024-01-01') # Pushdown to scan
.group_by('category')
.agg(pl.col('revenue').sum())
.collect())# Original: Load all columns
df = pl.read_csv('large_file.csv')
result = df.select(['name', 'revenue'])
# Optimized: Scan only needed columns
df = pl.scan_csv('large_file.csv', columns=['name', 'revenue'])
result = df.collect()# User queries:
# Q1: Total revenue by category
# Q2: Total revenue by category (with date filter)
# Q1 executes full scan → caches result
# Q2 uses cached aggregated data + applies filter → fastinterface CacheManager {
// L1: In-memory cache
l1Cache: Map<string, L1CacheEntry>;
l1MaxSize: number;
l1CurrentSize: number;
// L2: Disk cache
l2CachePath: string;
l2MaxSize: number;
// Operations
get<T>(key: string, tier: 'L1' | 'L2' | 'any'): Promise<T | null>;
set<T>(key: string, value: T, tier: 'L1' | 'L2', ttl: number): Promise<void>;
invalidate(pattern: string): Promise<number>;
clear(): Promise<void>;
}
class AnalyticsCacheManager implements CacheManager {
private l1Cache: Map<string, CacheEntry> = new Map();
private l1MaxSize = 512 * 1024 * 1024; // 512MB
private l2CachePath: string;
async get<T>(key: string, tier: 'L1' | 'L2' | 'any'): Promise<T | null> {
// Try L1 first
if (tier !== 'L2') {
const l1Entry = this.l1Cache.get(key);
if (l1Entry && !this.isExpired(l1Entry)) {
l1Entry.lastAccessed = new Date();
l1Entry.accessCount++;
return l1Entry.value as T;
}
}
// Try L2
if (tier !== 'L1') {
const l2Path = path.join(this.l2CachePath, `${key}.parquet`);
if (await fs.access(l2Path).then(() => true).catch(() => false)) {
const result = await pl.read_parquet(l2Path);
// Promote to L1 if small enough
if (result.estimated_size() < 10 * 1024 * 1024) {
await this.set(key, result, 'L1', 300);
}
return result as T;
}
}
return null;
}
async set<T>(key: string, value: T, tier: 'L1' | 'L2', ttl: number): Promise<void> {
if (tier === 'L1') {
// Check memory limits
const size = this.estimateSize(value);
if (this.l1CurrentSize + size > this.l1MaxSize) {
await this.evictL1(size);
}
this.l1Cache.set(key, {
value,
size,
createdAt: new Date(),
expiresAt: new Date(Date.now() + ttl * 1000),
lastAccessed: new Date(),
accessCount: 0,
});
this.l1CurrentSize += size;
} else {
// L2: Write to parquet
const l2Path = path.join(this.l2CachePath, `${key}.parquet`);
await value.write_parquet(l2Path, { compression: 'snappy' });
}
}
private async evictL1(requiredSpace: number): Promise<void> {
// LRU eviction
const entries = Array.from(this.l1Cache.entries())
.sort((a, b) => a[1].lastAccessed.getTime() - b[1].lastAccessed.getTime());
let freed = 0;
for (const [key, entry] of entries) {
if (freed >= requiredSpace) break;
this.l1Cache.delete(key);
this.l1CurrentSize -= entry.size;
freed += entry.size;
// If recently accessed, move to L2
if (entry.accessCount > 5) {
await this.moveToL2(key, entry.value);
}
}
}
}class PrefetchManager:
"""
Predict and prefetch likely-needed data
"""
def __init__(self):
self.query_patterns = {}
self.access_graph = defaultdict(list)
def record_access(self, dataset: str, query: dict):
"""Track query patterns for prediction"""
pattern = self.extract_pattern(query)
self.query_patterns[dataset].append({
'pattern': pattern,
'timestamp': time.time(),
'columns': query.get('columns', []),
})
def predict_next_queries(self, dataset: str, current_query: dict) -> list:
"""Predict what user might query next"""
predictions = []
# Pattern 1: Time series → predict next time bucket
if 'date' in current_query.get('filters', {}):
predictions.append(self.extend_time_range(current_query))
# Pattern 2: Aggregation → predict drill-down
if current_query.get('aggregation'):
predictions.append(self.add_dimension(current_query))
# Pattern 3: Top-N → predict next N
if current_query.get('limit'):
predictions.append(self.increase_limit(current_query))
return predictions
async def prefetch(self, dataset: str, query: dict):
"""Execute predicted queries in background"""
predictions = self.predict_next_queries(dataset, query)
for prediction in predictions[:3]: # Max 3 prefetches
cache_key = generate_cache_key(prediction)
if not await cache.exists(cache_key):
# Execute in background
asyncio.create_task(
self.execute_and_cache(prediction, cache_key)
)@dataclass
class VisualizationCache:
chart_type: str
data_hash: str
options_hash: str
base64_image: str
width: int
height: int
created_at: datetime
@classmethod
def from_request(cls, data: pd.DataFrame, chart_type: str, options: dict):
data_hash = hash_dataframe(data)
options_hash = hash_dict(options)
return cls(
chart_type=chart_type,
data_hash=data_hash,
options_hash=options_hash,
base64_image=None, # Generated on demand
width=options.get('width', 800),
height=options.get('height', 600),
created_at=datetime.now(),
)
def get_or_create_chart(data: pd.DataFrame, chart_type: str, options: dict) -> str:
"""Get cached chart or generate new one"""
cache_key = VisualizationCache.from_request(data, chart_type, options)
# Check cache
cached = viz_cache.get(cache_key.data_hash + cache_key.options_hash)
if cached:
return cached.base64_image
# Generate new chart
fig = create_chart(data, chart_type, options)
buf = io.BytesIO()
fig.savefig(buf, format='png', dpi=100)
base64_image = base64.b64encode(buf.getvalue()).decode()
# Cache it
cache_key.base64_image = base64_image
viz_cache.set(cache_key.data_hash + cache_key.options_hash, cache_key)
return base64_imageclass CacheInvalidator:
"""
Intelligent cache invalidation
"""
def on_file_change(self, file_path: str):
"""Called when a dataset file is modified"""
# Invalidate all caches for this file
pattern = f"*{hashlib.md5(file_path.encode()).hexdigest()[:8]}*"
invalidated = cache.invalidate(pattern)
logger.info(f"Invalidated {invalidated} cache entries for {file_path}")
def on_query_error(self, query: dict, error: Exception):
"""Handle query errors - may invalidate related caches"""
if isinstance(error, SchemaMismatchError):
# Schema changed - invalidate all dataset caches
self.on_file_change(query['files'][0])
def scheduled_cleanup(self):
"""Periodic cleanup of stale cache entries"""
# Remove expired entries
for key, entry in list(l1_cache.items()):
if entry.is_expired():
l1_cache.delete(key)
# Compact L2 cache if over size limit
if l2_cache.size > l2_cache.max_size:
l2_cache.evict_lfu(target_size=l2_cache.max_size * 0.8)interface CacheMetrics {
// Hit rates
l1HitRate: number;
l2HitRate: number;
overallHitRate: number;
// Latency
l1LatencyMs: number; // Should be <10ms
l2LatencyMs: number; // Should be <100ms
missLatencyMs: number; // Full execution time
// Usage
l1SizeBytes: number;
l2SizeBytes: number;
l1EntryCount: number;
l2EntryCount: number;
// Savings
queriesCached: number;
executionTimeSavedMs: number;
computeCostSaved: number;
}
class CacheMetricsCollector {
private metrics: CacheMetrics = {
l1HitRate: 0,
l2HitRate: 0,
overallHitRate: 0,
l1LatencyMs: 0,
l2LatencyMs: 0,
missLatencyMs: 0,
l1SizeBytes: 0,
l2SizeBytes: 0,
l1EntryCount: 0,
l2EntryCount: 0,
queriesCached: 0,
executionTimeSavedMs: 0,
computeCostSaved: 0,
};
recordAccess(tier: 'L1' | 'L2' | 'miss', latencyMs: number): void {
// Update hit rates and latency metrics
// Emit to monitoring system
}
getReport(): CacheMetrics {
return { ...this.metrics };
}
}- Sub-100ms Repeated Queries: 100x+ speedup for cached results
- Reduced Compute Costs: Fewer full query executions
- Better User Experience: Instant responses for common queries
- Predictable Performance: Cache hits have consistent latency
- Offline Capability: Cached results available without re-execution
- Cost Savings: Reduced CPU/memory usage on code-server
- Memory Overhead: Up to 512MB L1 + 5GB L2 cache storage
- Cache Invalidation Complexity: Must handle file changes correctly
- Stale Data Risk: Users may see outdated results
- Warm-up Time: Cold cache requires initial full executions
- Monitoring Overhead: Need to track cache health and hit rates
- Storage Requirements: Additional disk space for L2 cache
- Configuration Complexity: More tuning parameters
- Debugging Complexity: Cached results may hide issues
- Implement L1 in-memory cache for schemas
- Cache parsed LazyFrames
- Simple LRU eviction
- Basic cache key generation
- L2 disk cache with Parquet storage
- Query result caching
- Visualization caching
- Cache metrics and monitoring
- Query plan optimization
- Smart prefetching
- Incremental query refinement
- Cross-query optimization
- Machine learning for prefetch prediction
- Automatic cache sizing
- Query result compression
- Distributed caching (if needed)
| Metric | Target | Measurement |
|---|---|---|
| Cache hit rate | >60% | Access logs |
| L1 hit latency | <10ms | Timing metrics |
| L2 hit latency | <100ms | Timing metrics |
| Memory overhead | <512MB | System metrics |
| Disk overhead | <5GB | Storage metrics |
| Invalidation accuracy | 100% | Stale data incidents |
| Query speedup (cached) | 50x | A/B testing |
{
"cache": {
"l1": {
"enabled": true,
"maxSizeMB": 512,
"defaultTTLSeconds": 900,
"evictionPolicy": "lru"
},
"l2": {
"enabled": true,
"path": "/tmp/mcp-cache",
"maxSizeGB": 5,
"defaultTTLHours": 24,
"compression": "snappy"
},
"prefetch": {
"enabled": true,
"maxPrefetchQueries": 3,
"confidenceThreshold": 0.7
},
"invalidation": {
"checkIntervalSeconds": 30,
"aggressiveCleanup": false
}
}
}- ADR-001: Analytics Architecture
- ADR-002: Context Window Management Strategy
- MCP-SPECIFICATION: Detailed tool specifications
| Date | Author | Decision | Rationale |
|---|---|---|---|
| 2026-03-17 | Sisyphus | Adopt multi-tier caching with query optimization | Essential for performance at scale |
Status Legend:
- Proposed: Under review
- Accepted: Approved for implementation
- Deprecated: Replaced by newer ADR
- Superseded: See referenced ADR