This repository was archived by the owner on Nov 14, 2025. It is now read-only.

perf: Introduce lightweight caching with TTL for hot paths #154

@sapientpants

Description

Feature: Lightweight Caching with TTL

Business Value

Significantly reduce API call volume and improve response times by caching frequently accessed data. This feature will decrease latency for end users, reduce load on the DeepSource API, and improve resilience during API rate limiting or temporary outages. For users with large project portfolios who repeatedly access the same data, it can cut API calls on hot paths by up to 80%.

User Story

As a DeepSource MCP server user
I want frequently accessed data to be cached with configurable TTL
So that I experience faster response times and avoid unnecessary API calls while maintaining data freshness

Gherkin Specification

Feature: Lightweight Caching with TTL
  Implement in-memory LRU caching for hot paths with configurable TTL
  and per-request memoization to optimize API usage and response times.

  Background:
    Given the DeepSource MCP server is configured
    And caching is enabled via environment variables
    And cache TTL is set to 300 seconds (5 minutes)

  Scenario: Cache hit for project list
    Given the project list was fetched 2 minutes ago
    When I request the project list again
    Then the cached data should be returned immediately
    And no API call should be made to DeepSource
    And the response time should be under 10ms
    And cache statistics should show a cache hit

  Scenario: Cache miss after TTL expiration
    Given the project list was cached 6 minutes ago
    When I request the project list
    Then the cache entry should be considered stale
    And a fresh API call should be made
    And the new data should be cached with updated timestamp
    And cache statistics should show a cache miss

  Scenario: Per-request memoization for concurrent identical calls
    Given 5 concurrent requests for the same project's issues
    When all requests are initiated simultaneously
    Then only 1 API call should be made to DeepSource
    And all 5 requests should receive the same response
    And the response should be cached for subsequent requests
    And request deduplication should be logged

  Scenario: LRU eviction when cache is full
    Given the cache size limit is 100 entries
    And the cache contains 100 entries
    When a new unique request is made
    Then the least recently used entry should be evicted
    And the new entry should be added to the cache
    And cache statistics should show the eviction

  Scenario: Cache invalidation on write operations
    Given project metrics are cached
    When I update a metric threshold
    Then the cached metrics for that project should be invalidated
    And the next read should fetch fresh data
    And related cache entries should also be invalidated

  Scenario Outline: Configurable cache behavior per endpoint
    Given endpoint <endpoint> with cache config <config>
    When I request data from <endpoint>
    Then caching behavior should be <behavior>

    Examples:
      | endpoint                    | config         | behavior                          |
      | projects                    | ttl=300s       | cache for 5 minutes               |
      | project_issues              | ttl=60s        | cache for 1 minute                |
      | quality_metrics             | ttl=120s       | cache for 2 minutes               |
      | dependency_vulnerabilities  | ttl=600s       | cache for 10 minutes              |
      | update_metric_threshold     | no-cache       | never cache (write operation)     |

  Scenario: Cache disabled via environment variable
    Given CACHE_ENABLED is set to "false"
    When I make any request
    Then no caching should occur
    And all requests should go directly to the API
    And cache statistics should show all operations bypassed

  Scenario: Cache key generation with parameters
    Given a request for issues with filter "analyzer=python"
    And another request for issues with filter "analyzer=javascript"
    When both requests are made
    Then they should have different cache keys
    And both responses should be cached separately
    And cache lookups should respect the parameters

  Scenario: Cache warmup on startup
    Given CACHE_WARMUP_ENABLED is set to "true"
    When the server starts
    Then it should prefetch the project list
    And prefetch recent runs for each project
    And log the warmup completion time
    And subsequent requests should hit the warm cache

  Scenario: Cache statistics and monitoring
    Given the cache is actively being used
    When I request cache statistics
    Then I should see:
      | Metric              | Description                        |
      | hit_rate            | Percentage of cache hits           |
      | miss_rate           | Percentage of cache misses         |
      | eviction_count      | Number of LRU evictions            |
      | avg_entry_size      | Average size of cached entries     |
      | total_memory_usage  | Total memory used by cache         |
      | entries_count       | Current number of cached entries   |
      | requests_saved      | Number of API calls avoided        |

  Scenario: Graceful degradation on cache failure
    Given the cache encounters a memory error
    When a request is made
    Then the request should proceed without caching
    And an error should be logged
    And the response should still be returned
    And monitoring should be alerted

  Scenario: Cache serialization for complex objects
    Given a complex GraphQL response with nested data
    When the response is cached
    Then all data types should be preserved correctly
    And dates should maintain their format
    And branded types should remain intact
    And the deserialized data should match the original

  Scenario: TTL refresh on access (sliding window)
    Given CACHE_TTL_STRATEGY is set to "sliding"
    And a cache entry was accessed 4 minutes ago (TTL=5min)
    When the same entry is accessed again
    Then the TTL should reset to 5 minutes from now
    And the entry should remain in cache for another 5 minutes

Acceptance Criteria

  • In-memory LRU cache implementation with configurable size
  • TTL support with both fixed and sliding window strategies
  • Per-request memoization to deduplicate concurrent identical calls
  • Cache key generation based on:
    • Endpoint/method name
    • Request parameters (filters, pagination, etc.)
    • User context (API key hash)
  • Configurable cache behavior via environment variables (a loading sketch follows this list):
    • CACHE_ENABLED (true/false, default: true)
    • CACHE_DEFAULT_TTL_SECONDS (default: 300)
    • CACHE_MAX_ENTRIES (default: 1000)
    • CACHE_MAX_MEMORY_MB (default: 100)
    • CACHE_TTL_STRATEGY (fixed/sliding, default: fixed)
    • CACHE_WARMUP_ENABLED (true/false, default: false)
  • Per-endpoint cache configuration:
    • Projects: 5 minutes
    • Issues: 1 minute
    • Metrics: 2 minutes
    • Vulnerabilities: 10 minutes
    • Runs: 30 seconds
  • Cache invalidation on write operations
  • Cache statistics and monitoring
  • Memory-efficient storage with size limits
  • Thread-safe cache operations
  • No external dependencies (pure TypeScript implementation)
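
  A possible loader for the environment variables and per-endpoint TTLs listed above (a sketch; CacheSettings, loadCacheSettings, and ENDPOINT_TTL_SECONDS are illustrative names, and the defaults mirror the values in this list):

    // Field names follow the CacheConfig interface in the Implementation Notes;
    // the enabled/warmupEnabled flags are additions for this sketch.
    interface CacheSettings {
      enabled: boolean;
      warmupEnabled: boolean;
      defaultTTL: number; // seconds
      maxEntries: number;
      maxMemoryMB: number;
      ttlStrategy: 'fixed' | 'sliding';
    }

    function loadCacheSettings(env: NodeJS.ProcessEnv = process.env): CacheSettings {
      return {
        enabled: env.CACHE_ENABLED !== 'false', // default: true
        warmupEnabled: env.CACHE_WARMUP_ENABLED === 'true', // default: false
        defaultTTL: Number(env.CACHE_DEFAULT_TTL_SECONDS ?? 300),
        maxEntries: Number(env.CACHE_MAX_ENTRIES ?? 1000),
        maxMemoryMB: Number(env.CACHE_MAX_MEMORY_MB ?? 100),
        ttlStrategy: env.CACHE_TTL_STRATEGY === 'sliding' ? 'sliding' : 'fixed',
      };
    }

    // Per-endpoint TTL overrides from the list above, in seconds
    const ENDPOINT_TTL_SECONDS: Record<string, number> = {
      projects: 300,
      project_issues: 60,
      quality_metrics: 120,
      dependency_vulnerabilities: 600,
      runs: 30,
    };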

Non-Goals

  • Will NOT implement distributed caching (Redis, Memcached)
  • Will NOT persist cache to disk
  • Will NOT cache sensitive authentication data
  • Will NOT implement cache preloading from external sources
  • Will NOT provide cache synchronization across multiple server instances
  • Will NOT implement response streaming with partial cache hits
  • Will NOT implement cache compression
  • Will NOT cache GraphQL mutations or write operations

Risks & Mitigations

  • Risk: Stale data being served to users
    Mitigation: Conservative TTL defaults and clear documentation about cache behavior

  • Risk: Memory exhaustion from unbounded cache growth
    Mitigation: LRU eviction policy and configurable memory limits

  • Risk: Cache key collisions causing incorrect data retrieval
    Mitigation: Comprehensive key generation including all relevant parameters

  • Risk: Complex cache invalidation logic causing bugs
    Mitigation: Simple invalidation patterns and extensive testing

  • Risk: Performance degradation from cache overhead
    Mitigation: Lightweight implementation and ability to disable caching

Technical Considerations

  • Architecture impact:

    • Add caching layer between handlers and clients
    • Implement cache interceptor pattern
    • Create cache key generation utilities
    • Add cache statistics collector
  • Performance considerations:

    • O(1) cache lookups via Map
    • O(1) LRU management via doubly linked list (see the sketch at the end of this section)
    • Minimal serialization overhead
    • Memory monitoring to prevent leaks
  • Implementation approach:

    • Create CacheManager class with LRU logic
    • Implement CacheInterceptor for transparent caching
    • Add @Cacheable decorator for methods
    • Use WeakMap for per-request memoization
    • Leverage existing logger for cache events
  • Memory management:

    • Use Map for O(1) lookups
    • Doubly linked list for LRU tracking
    • Lazy eviction when capacity is exceeded
    • Size estimation for complex objects
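
  A minimal sketch of the CacheManager core under these constraints, using Map insertion order in place of an explicit doubly linked list (equivalent O(1) behavior; the LruTtlCache and Entry names are illustrative):

    interface Entry<T> {
      data: T;
      expiresAt: number; // epoch ms; the entry is stale once Date.now() passes this
    }

    class LruTtlCache<T> {
      private entries = new Map<string, Entry<T>>();

      constructor(
        private maxEntries: number,
        private defaultTtlMs: number,
        private ttlStrategy: 'fixed' | 'sliding' = 'fixed'
      ) {}

      get(key: string): T | undefined {
        const entry = this.entries.get(key);
        if (!entry) return undefined;
        if (Date.now() > entry.expiresAt) {
          this.entries.delete(key); // lazy eviction of stale entries
          return undefined;
        }
        // Re-insert so the key moves to the back of the Map's insertion order (MRU)
        this.entries.delete(key);
        if (this.ttlStrategy === 'sliding') {
          entry.expiresAt = Date.now() + this.defaultTtlMs; // sliding-window refresh
        }
        this.entries.set(key, entry);
        return entry.data;
      }

      set(key: string, data: T, ttlMs = this.defaultTtlMs): void {
        this.entries.delete(key);
        if (this.entries.size >= this.maxEntries) {
          // Least recently used key = first key in insertion order
          const lru = this.entries.keys().next().value;
          if (lru !== undefined) this.entries.delete(lru);
        }
        this.entries.set(key, { data, expiresAt: Date.now() + ttlMs });
      }
    }

  Enforcing CACHE_MAX_MEMORY_MB on top of this would additionally require the per-entry size estimation noted above.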

Testing Requirements

  • Unit tests for cache operations (get, set, evict)
  • Integration tests with real API calls
  • Property-based tests for:
    • LRU eviction correctness
    • TTL expiration accuracy
    • Cache key uniqueness
  • Performance tests:
    • Cache hit/miss latency
    • Memory usage under load
    • Concurrent access patterns
  • Stress tests for memory limits
  • Test cache invalidation scenarios
  • Verify cache statistics accuracy
  • Test configuration variations

Definition of Done

  • LRU cache implementation complete
  • TTL mechanisms working correctly
  • Per-request memoization functional
  • All endpoints properly cached
  • Cache invalidation implemented
  • Configuration via environment variables
  • Cache statistics and monitoring available
  • All tests passing with >90% coverage
  • Documentation includes caching guide
  • Performance benchmarks show >50% reduction in API calls
  • Memory usage stays within configured limits
  • No memory leaks detected
  • Code reviewed and approved

Implementation Notes

  1. Cache Manager Interface:

    interface CacheEntry<T> {
      data: T;
      timestamp: number;
      ttl: number;
      accessCount: number;
      size: number;
    }
    
    interface CacheConfig {
      maxEntries: number;
      maxMemoryMB: number;
      defaultTTL: number;
      ttlStrategy: 'fixed' | 'sliding';
    }
  2. Cache Key Generation:

    // hash() is a placeholder for a stable digest of params (sketched below)
    function generateCacheKey(
      method: string,
      params: Record<string, unknown>,
      context: { apiKeyHash: string }
    ): string {
      return `${context.apiKeyHash}:${method}:${hash(params)}`;
    }
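
    One possible shape for the hash() helper above, assuming Node's built-in node:crypto module (canonicalize and hashParams are illustrative names); keys are sorted recursively so logically equal parameter objects always produce the same cache key:

    import { createHash } from 'node:crypto';

    // Recursively sort object keys so serialization is order-independent
    function canonicalize(value: unknown): unknown {
      if (Array.isArray(value)) return value.map(canonicalize);
      if (value !== null && typeof value === 'object') {
        return Object.fromEntries(
          Object.entries(value as Record<string, unknown>)
            .sort(([a], [b]) => a.localeCompare(b))
            .map(([k, v]) => [k, canonicalize(v)] as [string, unknown])
        );
      }
      return value;
    }

    function hashParams(params: Record<string, unknown>): string {
      return createHash('sha256')
        .update(JSON.stringify(canonicalize(params)))
        .digest('hex');
    }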
  3. Decorator Pattern:

    @Cacheable({ ttl: 300 })
    async listProjects(): Promise<Project[]> {
      // Method implementation
    }
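
    A sketch of what this decorator could look like with legacy TypeScript decorators ("experimentalDecorators": true); the cacheStore map stands in for the real CacheManager and is illustrative only:

    // Simplified @Cacheable: cache by method name + serialized arguments
    const cacheStore = new Map<string, { data: unknown; expiresAt: number }>();

    function Cacheable(options: { ttl: number }) {
      return function (
        _target: object,
        propertyKey: string,
        descriptor: PropertyDescriptor
      ): PropertyDescriptor {
        const original = descriptor.value as (...args: unknown[]) => Promise<unknown>;
        descriptor.value = async function (this: unknown, ...args: unknown[]) {
          const key = `${propertyKey}:${JSON.stringify(args)}`;
          const hit = cacheStore.get(key);
          if (hit && hit.expiresAt > Date.now()) return hit.data; // cache hit
          const data = await original.apply(this, args); // cache miss: call through
          cacheStore.set(key, { data, expiresAt: Date.now() + options.ttl * 1000 });
          return data;
        };
        return descriptor;
      };
    }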
  4. Cache Layers:

    • L1: Per-request memoization (request lifecycle)
    • L2: LRU cache with TTL (server lifecycle)
    • L3: API calls (fallback)
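
    A sketch of the L1 memoization layer as in-flight request deduplication (dedupe and inFlight are illustrative names; the Technical Considerations propose a WeakMap for the per-request scope, but the deduplication idea is the same):

    // Concurrent identical calls share one promise, so only a single
    // DeepSource API request is issued while it is in flight.
    const inFlight = new Map<string, Promise<unknown>>();

    async function dedupe<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
      const pending = inFlight.get(key);
      if (pending) return pending as Promise<T>;
      const promise = fetcher().finally(() => inFlight.delete(key));
      inFlight.set(key, promise);
      return promise;
    }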
  5. Invalidation Patterns:

    • Write operations invalidate related reads
    • Hierarchical invalidation (project → issues → runs)
    • Tag-based invalidation for grouped entries
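
    A sketch of tag-based invalidation (tagKey, invalidateTag, and keysByTag are illustrative names): each cached key is registered under tags such as "project:<id>", and a write operation invalidates by tag, which also covers the hierarchical case when child entries carry the parent's tag:

    const keysByTag = new Map<string, Set<string>>();

    // Register a cached key under one or more tags, e.g. ["project:123", "issues"]
    function tagKey(key: string, tags: string[]): void {
      for (const tag of tags) {
        const keys = keysByTag.get(tag) ?? new Set<string>();
        keys.add(key);
        keysByTag.set(tag, keys);
      }
    }

    // Drop every cached entry registered under the tag
    function invalidateTag(tag: string, cache: Map<string, unknown>): void {
      for (const key of keysByTag.get(tag) ?? []) cache.delete(key);
      keysByTag.delete(tag);
    }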
  6. Monitoring Integration:

    • Export Prometheus metrics
    • Log cache events at DEBUG level
    • Alert on low hit rates (<30%)
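
    A sketch of the counters behind the statistics scenario above (CacheStats and hitRate are illustrative; a Prometheus exporter would wrap these values):

    // Backs hit_rate, miss_rate, eviction_count, entries_count, requests_saved
    interface CacheStats {
      hits: number; // also equals requests_saved (API calls avoided)
      misses: number;
      evictions: number;
      entriesCount: number;
    }

    function hitRate(stats: CacheStats): number {
      const total = stats.hits + stats.misses;
      return total === 0 ? 0 : stats.hits / total;
    }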

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

Metadata

Labels: enhancement (New feature or request), performance (Performance improvements and optimizations)
