Query Result Caching

SQLiter provides optional query result caching to improve performance by reducing database queries for frequently accessed data. The cache uses an LRU (Least Recently Used) eviction strategy and automatically invalidates when data is modified.

Enabling Caching

Caching is opt-in and disabled by default. To enable caching, set cache_enabled=True when creating the database connection:

from sqliter import SqliterDB

db = SqliterDB("my_database.db", cache_enabled=True)

Cache Configuration

Basic Configuration

db = SqliterDB(
    "my_database.db",
    cache_enabled=True,      # Enable caching (required)
    cache_max_size=1000,     # Max cached queries per table (default: 1000)
    cache_ttl=60,            # Time-to-live in seconds (default: None)
    cache_max_memory_mb=100, # Memory limit per table in MB (default: None)
)

Configuration Parameters

  • cache_enabled (bool, default: False): Enable or disable caching.
  • cache_max_size (int, default: 1000): Maximum number of cached query results per table. When exceeded, the least recently used entries are evicted.
  • cache_ttl (int | None, default: None): Default time-to-live for cached results, in seconds. After this time, entries expire and are removed on next access.
  • cache_max_memory_mb (int | None, default: None): Optional memory limit per table, in megabytes. When exceeded, the least recently used entries are evicted until under the limit.

How Caching Works

Cache Key Generation

Each query generates a unique cache key based on:

  • Table name
  • Selected fields
  • Filter conditions
  • Order by clauses
  • Limit and offset values
  • Fetch mode (single vs list)

This means different queries produce different cache entries:

# These are cached separately
db.select(User).filter(name="Alice").fetch_all()
db.select(User).filter(name="Bob").fetch_all()

# These are also cached separately (different fields)
db.select(User).only("name").fetch_all()
db.select(User).fetch_all()
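The key-derivation idea can be sketched roughly as follows. Note that build_cache_key, its parameters, and the hashing scheme are illustrative assumptions for this sketch, not SQLiter's actual internals:

```python
import hashlib
import json

def build_cache_key(table, fields=None, filters=None, order_by=None,
                    limit=None, offset=None, fetch_mode="list"):
    """Combine everything that distinguishes a query into one stable key."""
    parts = {
        "table": table,
        "fields": sorted(fields) if fields else None,
        "filters": sorted((filters or {}).items()),
        "order_by": order_by,
        "limit": limit,
        "offset": offset,
        "fetch_mode": fetch_mode,
    }
    # A canonical JSON dump makes the key independent of dict ordering
    raw = json.dumps(parts, default=str, sort_keys=True)
    return hashlib.sha256(raw.encode()).hexdigest()

# Different filter values produce different keys;
# repeating an identical query reproduces the same key
key_alice = build_cache_key("users", filters={"name": "Alice"})
key_bob = build_cache_key("users", filters={"name": "Bob"})
assert key_alice != key_bob
```

Because every distinguishing part of the query feeds into the key, two queries collide only when they would return the same result set.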

Cache Invalidation

The cache automatically invalidates when data is modified. Any write operation (insert, update, delete) on a table clears all cached queries for that table:

# Query and cache the result
users = db.select(User).fetch_all()  # Cached

# Insert new data - INVALIDATES cache for User table
db.insert(User(name="Charlie", age=25))

# This query hits the database again (cache was invalidated)
users = db.select(User).fetch_all()
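The table-level invalidation described above can be sketched generically. TableCache here is a hypothetical illustration of the strategy, not SQLiter's implementation:

```python
class TableCache:
    """Per-table cache where any write to a table clears that table's entries."""

    def __init__(self):
        self.caches = {}  # table name -> {cache_key: result}

    def get(self, table, key):
        return self.caches.get(table, {}).get(key)

    def put(self, table, key, result):
        self.caches.setdefault(table, {})[key] = result

    def invalidate(self, table):
        # Called after every insert/update/delete on `table`
        self.caches.pop(table, None)

cache = TableCache()
cache.put("users", "all-users", ["alice", "bob"])
cache.invalidate("users")  # a write to the users table happened
assert cache.get("users", "all-users") is None  # next read misses
```

Scoping the caches per table keeps invalidation cheap: a write never has to inspect individual cache keys, it simply drops the whole table bucket.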

Per-Instance Scope

The cache is tied to the database connection instance and is automatically cleared when:

  • The connection is closed: db.close()
  • Exiting a context manager: with SqliterDB(...) as db:
  • The connection is deleted

with SqliterDB("mydb.db", cache_enabled=True) as db:
    db.select(User).fetch_all()  # Cached
    # Cache is cleared when exiting context
# Cache is empty here

Limitations

Cache Key Type Compatibility

Cache keys are generated by sorting filter conditions to ensure consistency. This requires that filter values be comparable with each other.

❌ This will fail:

# First query succeeds
db.select(User).filter(name="Alice", age=30).fetch_all()

# Second query fails at fetch time when generating cache key
db.select(User).filter(name=42, age=30).fetch_all()  # ValueError!

The second query fails when fetch_all() is called because Python cannot sort mixed types (strings vs numbers) when generating the cache key. The filter() call itself succeeds; the error occurs during cache key generation.

✓ Correct usage:

# Use consistent types for each field
db.select(User).filter(name="Alice", age=30).fetch_all()
db.select(User).filter(name="Bob", age=40).fetch_all()  # Works fine

This limitation only affects queries with incompatible value types for the same field. Normal filtering patterns (strings, numbers, dates used consistently) work without issues.

Cache Statistics

Track cache performance with the get_cache_stats() method:

stats = db.get_cache_stats()
print(stats)
# Output: {"hits": 150, "misses": 50, "total": 200, "hit_rate": 75.0}

  • hits: Number of times a cached result was returned
  • misses: Number of times the database was queried (cache miss)
  • total: Total cache lookups (hits + misses)
  • hit_rate: Percentage of cache hits (0-100)
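Assuming get_cache_stats() returns the dict shown above, these numbers can feed simple monitoring. The warn_on_low_hit_rate helper and its threshold are illustrative, not part of SQLiter:

```python
def warn_on_low_hit_rate(stats, threshold=50.0):
    """Summarize cache stats, flagging a hit rate below `threshold` percent."""
    if stats["total"] == 0:
        return "no cache lookups yet"
    summary = f"{stats['hits']}/{stats['total']} hits ({stats['hit_rate']:.1f}%)"
    if stats["hit_rate"] < threshold:
        return summary + " - low hit rate, review TTLs or whether caching helps"
    return summary

# e.g. with the stats dict from the example above:
print(warn_on_low_hit_rate({"hits": 150, "misses": 50, "total": 200, "hit_rate": 75.0}))
# -> 150/200 hits (75.0%)
```

A persistently low hit rate suggests the workload repeats few queries, in which case the cache may cost more memory than it saves in query time.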

Query-Level Controls

Bypassing the Cache

Force a specific query to skip the cache and always fetch fresh data from the database:

# Force fresh data, bypassing cache
user = db.select(User).filter(name="Alice").bypass_cache().fetch_one()

Use cases:

  • Fetching real-time data that may have changed externally
  • Debugging queries to verify database content
  • Admin operations that require absolute freshness

Per-Query TTL

Override the global cache_ttl setting for a specific query:

# Cache this result for 5 minutes (300 seconds)
users = db.select(User).cache_ttl(300).fetch_all()

Use cases:

  • Shorter TTL for frequently changing data
  • Longer TTL for rarely changing reference data
  • Different TTL requirements for different query types

Empty Result Caching

Both None and empty list [] results are cached:

# Query that returns no result (fetch_one)
result = db.select(User).filter(name="NonExistent").fetch_one()
# Returns: None (cached)

# Query that returns no results (fetch_all)
results = db.select(User).filter(age__gt=200).fetch_all()
# Returns: [] (cached)

Memory Limiting

Use cache_max_memory_mb to prevent the cache from consuming too much memory:

# Limit cache to 10MB per table
db = SqliterDB("mydb.db", cache_enabled=True, cache_max_memory_mb=10)

When the memory limit is exceeded:

  1. The least recently used entries are evicted first
  2. Eviction continues until there is room for the new entry
  3. Memory usage is tracked per table
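The eviction loop above can be sketched with a memory-bounded LRU. This is a generic illustration of the strategy (sizing by payload length for simplicity), not SQLiter's implementation, which would estimate actual object memory:

```python
from collections import OrderedDict

class MemoryBoundedLRU:
    """LRU cache that evicts oldest entries once a byte budget is exceeded."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.entries = OrderedDict()  # oldest (least recently used) first
        self.used = 0

    def _size(self, value):
        # Payload length stands in for a real memory estimate in this sketch
        return len(value)

    def get(self, key):
        if key not in self.entries:
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.used -= self._size(self.entries.pop(key))
        size = self._size(value)
        # Evict least recently used entries until the new value fits
        while self.entries and self.used + size > self.max_bytes:
            _, evicted = self.entries.popitem(last=False)
            self.used -= self._size(evicted)
        self.entries[key] = value
        self.used += size
```

An OrderedDict gives both operations the LRU needs in O(1): move_to_end on reads and popitem(last=False) to drop the oldest entry on eviction.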

Best Practices

When to Enable Caching

Caching is beneficial when:

  • You have read-heavy workloads
  • The same queries are executed repeatedly
  • Data doesn't change frequently
  • Query execution time is significant

When to Bypass Cache

Use bypass_cache() when:

  • You need guaranteed fresh data
  • Debugging data consistency issues
  • Performing administrative operations

TTL Guidelines

  • Short TTL (1-60 seconds): Frequently changing data, user sessions
  • Medium TTL (1-10 minutes): Moderately changing data, dashboards
  • Long TTL (10+ minutes): Rarely changing data, configuration/reference data
  • No TTL: Cache until invalidated by writes

Memory Limits

Set cache_max_memory_mb when:

  • Running in memory-constrained environments
  • Caching large result sets
  • Preventing cache from growing unbounded

Trade-offs

Advantages

  • Reduced database load: Fewer queries hit the database
  • Faster response times: Cached results are returned without a database round trip
  • Lower CPU usage: Avoid query parsing and execution overhead
  • Better scalability: Handle more read requests with same resources

Disadvantages

  • Memory consumption: Cached data occupies RAM
  • Stale data risk: Cache may serve outdated data until invalidated
  • Complexity: Additional configuration and monitoring
  • Memory overhead: Each cached entry uses memory for objects and metadata

Recommendations

  • Start with caching disabled, enable only when needed
  • Monitor cache statistics to measure effectiveness
  • Use memory limits to prevent unbounded growth
  • Set appropriate TTLs based on data change frequency
  • Profile before and after to measure performance gains
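For the last point, a minimal before/after timing harness might look like this; time_query is an illustrative helper, not part of SQLiter:

```python
import time

def time_query(run_query, repeats=100):
    """Average wall-clock time of a zero-argument query callable, in seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        run_query()
    return (time.perf_counter() - start) / repeats

# Usage sketch, assuming a db set up as in the earlier examples:
# uncached = time_query(lambda: db.select(User).bypass_cache().fetch_all())
# cached = time_query(lambda: db.select(User).fetch_all())
# print(f"cache speedup: {uncached / cached:.1f}x")
```

Comparing the bypass_cache() timing against the cached timing on your own workload shows whether the cache pays for its memory overhead.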