refactor: Normalize error taxonomy and messages #157

@sapientpants

Feature: Normalized Error Taxonomy and Messages

Business Value

Establish a consistent, predictable error handling system that speeds up debugging, reduces support burden, and enables better monitoring and alerting. A normalized error taxonomy, with a clear distinction between authentication and authorization, stable error codes, and consistent message formats, lets developers and operations teams identify and resolve issues quickly. This standardization matters most in enterprise deployments, where clear error reporting directly affects incident response times and system reliability.

User Story

As a developer or operator using the DeepSource MCP server
I want consistent, well-structured error messages with clear categories and codes
So that I can quickly diagnose issues, implement proper error handling, and maintain system reliability

Gherkin Specification

Feature: Normalized Error Taxonomy and Messages
  Implement a unified error system with consistent structure, clear categorization,
  distinct authentication vs authorization errors, and stable machine-readable codes.

  Background:
    Given the DeepSource MCP server with various error scenarios
    And the need for consistent error reporting across all components

  Scenario: Standardized error structure
    Given any error occurs in the system
    When the error is returned to the client
    Then it should contain all required fields:
      | Field    | Type     | Description                        | Example                    |
      | code     | string   | Stable machine-readable code      | "DS_AUTHN_INVALID_TOKEN"   |
      | category | enum     | Error category from taxonomy       | "AUTHENTICATION_ERROR"     |
      | message  | string   | Human-readable description        | "Invalid API token"        |
      | cause    | object   | Original error if applicable      | {originalError}            |
      | details  | object   | Additional context                | {endpoint: "projects"}     |
      | retryable| boolean  | Whether retry might succeed       | false                      |
      | timestamp| string   | ISO 8601 timestamp                | "2024-01-15T10:30:00Z"     |

  Scenario: Authentication vs Authorization distinction
    Given a request with invalid credentials
    When the authentication check fails
    Then the error should have:
      | Field    | Value                          |
      | code     | "DS_AUTHN_INVALID_CREDENTIALS"|
      | category | "AUTHENTICATION_ERROR"         |
      | message  | "Invalid or expired API key"   |
    Given a request with valid credentials but insufficient permissions
    When the authorization check fails
    Then the error should have:
      | Field    | Value                          |
      | code     | "DS_AUTHZ_INSUFFICIENT_PERMS" |
      | category | "AUTHORIZATION_ERROR"          |
      | message  | "Insufficient permissions"     |

  Scenario Outline: Error category mapping
    Given an error of type <error_type>
    When the error is classified
    Then it should map to category <category>
    And have code prefix <code_prefix>

    Examples:
      | error_type              | category                | code_prefix  |
      | Invalid API key         | AUTHENTICATION_ERROR    | DS_AUTHN_    |
      | Forbidden resource      | AUTHORIZATION_ERROR     | DS_AUTHZ_    |
      | Rate limit exceeded     | RATE_LIMIT_ERROR        | DS_RATE_     |
      | Network timeout         | NETWORK_ERROR           | DS_NET_      |
      | Invalid parameters      | VALIDATION_ERROR        | DS_VAL_      |
      | Server internal error   | SERVER_ERROR            | DS_SRV_      |
      | Resource not found      | NOT_FOUND_ERROR         | DS_404_      |
      | Bad request format      | CLIENT_ERROR            | DS_CLI_      |
      | GraphQL schema error    | SCHEMA_ERROR            | DS_SCHEMA_   |
      | Configuration missing   | CONFIGURATION_ERROR     | DS_CFG_      |

  Scenario: Stable error codes
    Given the error code "DS_AUTHN_INVALID_TOKEN"
    When this code is used in the system
    Then it should always mean "Invalid authentication token"
    And never be reused for different error conditions
    And be documented in the error catalog

  Scenario Outline: HTTP status code alignment
    Given an error with category <category>
    When returning an HTTP response
    Then the status code should be <status_code>

    Examples:
      | category                | status_code |
      | AUTHENTICATION_ERROR    | 401         |
      | AUTHORIZATION_ERROR     | 403         |
      | NOT_FOUND_ERROR         | 404         |
      | VALIDATION_ERROR        | 400         |
      | RATE_LIMIT_ERROR        | 429         |
      | SERVER_ERROR            | 500         |
      | NETWORK_ERROR           | 502         |

  Scenario: Error message consistency
    Given multiple occurrences of the same error type
    When the errors are generated
    Then all should have:
      | Aspect               | Requirement                           |
      | Code                 | Identical stable code                 |
      | Message template     | Same base message structure           |
      | Category             | Same category classification          |
      | Retryable flag       | Consistent retry recommendation       |

  Scenario: Error context preservation
    Given a chain of errors (A causes B causes C)
    When the final error is returned
    Then the error should contain:
      | Field         | Content                              |
      | message       | High-level user-friendly message    |
      | cause         | The immediate cause (error B)       |
      | cause.cause   | The root cause (error A)            |
      | stackTrace    | Complete stack trace if in dev mode |

  Scenario: Audience-specific error messages
    Given an error with code "DS_VAL_REQUIRED_FIELD"
    When requesting error details
    Then the response should include:
      | Field                | Value                                |
      | code                 | "DS_VAL_REQUIRED_FIELD"             |
      | message              | "Required field missing: {field}"    |
      | userMessage          | "Please provide the {field} field"  |
      | developerMessage     | "Field {field} is required by API"  |

  Scenario: Error catalog generation
    Given the complete error taxonomy
    When generating documentation
    Then it should produce:
      | Output              | Description                           |
      | error-codes.md      | All error codes with descriptions    |
      | error-categories.ts | TypeScript enum of categories        |
      | error-mappings.json | Code to message mappings             |
      | error-examples.md   | Example responses for each category  |

  Scenario: Backward compatibility
    Given existing error handling code
    When migrating to the new taxonomy
    Then old error formats should:
      | Behavior            | Implementation                        |
      | Still work          | Map to new categories automatically  |
      | Log deprecation     | Warn about old format usage          |
      | Provide migration   | Include migration guide in response  |

  Scenario: Error monitoring integration
    Given an error occurs in production
    When the error is logged
    Then it should include:
      | Field               | Purpose                              |
      | code                | Grouping in monitoring dashboard     |
      | category            | Alert routing rules                  |
      | fingerprint         | Deduplication key                   |
      | environment         | Environment context                  |
      | version             | Server version for tracking          |
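
The two mapping tables in the specification (code prefix to category, category to HTTP status) could be encoded as plain lookup tables. The sketch below is one possible shape, not the project's implementation: the prefixes, categories, and status codes come from this issue, while the names `PREFIX_TO_CATEGORY`, `CATEGORY_TO_STATUS`, `categoryForCode`, and `statusForCode`, and the fallback to 500 for unrecognized codes, are assumptions made for illustration.

```typescript
// Category names as listed in this issue.
type ErrorCategory =
  | "AUTHENTICATION_ERROR" | "AUTHORIZATION_ERROR" | "VALIDATION_ERROR"
  | "NOT_FOUND_ERROR" | "RATE_LIMIT_ERROR" | "CLIENT_ERROR" | "SERVER_ERROR"
  | "NETWORK_ERROR" | "TIMEOUT_ERROR" | "SCHEMA_ERROR" | "CONFIGURATION_ERROR";

// Code prefix -> category, from the "Error category mapping" examples.
const PREFIX_TO_CATEGORY: ReadonlyArray<readonly [string, ErrorCategory]> = [
  ["DS_AUTHN_", "AUTHENTICATION_ERROR"],
  ["DS_AUTHZ_", "AUTHORIZATION_ERROR"],
  ["DS_RATE_", "RATE_LIMIT_ERROR"],
  ["DS_NET_", "NETWORK_ERROR"],
  ["DS_VAL_", "VALIDATION_ERROR"],
  ["DS_SRV_", "SERVER_ERROR"],
  ["DS_404_", "NOT_FOUND_ERROR"],
  ["DS_CLI_", "CLIENT_ERROR"],
  ["DS_SCHEMA_", "SCHEMA_ERROR"],
  ["DS_CFG_", "CONFIGURATION_ERROR"],
];

// Category -> HTTP status, from the status table and the acceptance criteria.
const CATEGORY_TO_STATUS: Record<ErrorCategory, number> = {
  AUTHENTICATION_ERROR: 401,
  AUTHORIZATION_ERROR: 403,
  VALIDATION_ERROR: 400,
  NOT_FOUND_ERROR: 404,
  RATE_LIMIT_ERROR: 429,
  CLIENT_ERROR: 400,
  SERVER_ERROR: 500,
  NETWORK_ERROR: 502,
  TIMEOUT_ERROR: 504,
  SCHEMA_ERROR: 400,
  CONFIGURATION_ERROR: 500,
};

// Returns the category whose prefix matches the code, or undefined if none does.
function categoryForCode(code: string): ErrorCategory | undefined {
  const match = PREFIX_TO_CATEGORY.find(([prefix]) => code.startsWith(prefix));
  return match?.[1];
}

// Assumption: codes with an unrecognized prefix are reported as 500.
function statusForCode(code: string): number {
  const category = categoryForCode(code);
  if (category === undefined) return 500;
  return CATEGORY_TO_STATUS[category];
}
```

Typing `CATEGORY_TO_STATUS` as a full `Record` means the compiler flags any category added later without a status code.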

Acceptance Criteria

  • Unified error structure with required fields:
    • code: Stable, unique error identifier
    • category: Enumerated category from taxonomy
    • message: Human-readable description
    • cause: Original error if applicable
    • details: Additional context object
    • retryable: Boolean retry indicator
    • timestamp: ISO 8601 timestamp
  • Clear distinction between authentication and authorization:
    • AUTHENTICATION_ERROR for identity verification failures
    • AUTHORIZATION_ERROR for permission/access failures
    • Separate error codes for each type
  • Complete error taxonomy with categories:
    • AUTHENTICATION_ERROR (401)
    • AUTHORIZATION_ERROR (403)
    • VALIDATION_ERROR (400)
    • NOT_FOUND_ERROR (404)
    • RATE_LIMIT_ERROR (429)
    • CLIENT_ERROR (400)
    • SERVER_ERROR (500)
    • NETWORK_ERROR (502)
    • TIMEOUT_ERROR (504)
    • SCHEMA_ERROR (400)
    • CONFIGURATION_ERROR (500)
  • Stable error code format: DS_[CATEGORY]_[SPECIFIC]
  • Consistent HTTP status code mapping
  • Error message templates with parameter substitution
  • Error cause chain preservation
  • Backward compatibility with deprecation warnings
  • Error catalog documentation
  • TypeScript types and enums exported
  • Monitoring-friendly error attributes
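
The "message templates with parameter substitution" criterion can be met with a very small renderer. The following is a minimal sketch, assuming the `{name}` placeholder syntax used in the specification's `"Required field missing: {field}"` example; the function name `renderTemplate` and the choice to leave unknown placeholders untouched are assumptions, not something this issue prescribes.

```typescript
// Hypothetical helper: replaces {name} placeholders with values from `params`.
// Placeholders without a matching parameter are left intact so that a missing
// parameter stays visible in the rendered message.
function renderTemplate(
  template: string,
  params: Record<string, string | number>
): string {
  return template.replace(/\{(\w+)\}/g, (placeholder: string, name: string) =>
    Object.prototype.hasOwnProperty.call(params, name)
      ? String(params[name])
      : placeholder
  );
}

const rendered = renderTemplate("Required field missing: {field}", {
  field: "projectKey",
});
```

Here `rendered` is `"Required field missing: projectKey"`; the parameter name `projectKey` is only an example value.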

Non-Goals

  • Will NOT implement error message internationalization (i18n)
  • Will NOT add error recovery mechanisms
  • Will NOT implement custom error pages
  • Will NOT add error analytics or metrics collection
  • Out of scope: Client-side error handling libraries
  • Will NOT modify MCP protocol error codes
  • Will NOT implement error notification systems

Risks & Mitigations

  • Risk: Breaking changes for existing error handling code
    Mitigation: Provide compatibility layer and migration period

  • Risk: Over-categorization making the system complex
    Mitigation: Keep categories minimal and well-defined

  • Risk: Performance impact from error enrichment
    Mitigation: Lazy evaluation of expensive error details

  • Risk: Sensitive information in error messages
    Mitigation: Separate internal and external error messages
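
The last mitigation (separate internal and external messages) could be approached by deriving a client-safe view from the full error. This is a sketch under stated assumptions: the `NormalizedError` shape is simplified to plain strings, and `toExternalError`, `ExternalError`, and the `SENSITIVE_KEYS` deny-list are hypothetical names and values chosen for illustration.

```typescript
// Simplified copy of the error shape described in this issue.
interface NormalizedError {
  code: string;
  category: string;
  message: string;
  cause?: Error;
  details?: Record<string, unknown>;
  retryable: boolean;
  timestamp: string;
  stackTrace?: string;
}

// External view: no cause chain and no stack trace.
type ExternalError = Omit<NormalizedError, "cause" | "stackTrace">;

// Assumption: detail keys that commonly carry secrets (compared in lower case).
const SENSITIVE_KEYS = new Set(["token", "apikey", "authorization", "password"]);

function toExternalError(error: NormalizedError): ExternalError {
  const external: ExternalError = {
    code: error.code,
    category: error.category,
    message: error.message,
    retryable: error.retryable,
    timestamp: error.timestamp,
  };
  if (error.details !== undefined) {
    // Copy only the detail entries whose key is not on the deny-list.
    const safeDetails: Record<string, unknown> = {};
    for (const [key, value] of Object.entries(error.details)) {
      if (!SENSITIVE_KEYS.has(key.toLowerCase())) safeDetails[key] = value;
    }
    external.details = safeDetails;
  }
  return external;
}
```

A deny-list is the simplest option to show; an allow-list per error code would be stricter and may suit the "no sensitive data in error messages" goal better.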

Technical Considerations

  • Architecture impact:

    • Consolidate error handling into single module
    • Replace multiple error classes with unified system
    • Update all error generation points
    • Standardize error logging
  • Implementation approach:

    // Unified error structure
    interface NormalizedError {
      code: ErrorCode;
      category: ErrorCategory;
      message: string;
      cause?: Error;
      details?: Record<string, unknown>;
      retryable: boolean;
      timestamp: string;
      stackTrace?: string;
    }
    
    // Clear category enumeration
    enum ErrorCategory {
      AUTHENTICATION_ERROR = 'AUTHENTICATION_ERROR',
      AUTHORIZATION_ERROR = 'AUTHORIZATION_ERROR',
      VALIDATION_ERROR = 'VALIDATION_ERROR',
      NOT_FOUND_ERROR = 'NOT_FOUND_ERROR',
      RATE_LIMIT_ERROR = 'RATE_LIMIT_ERROR',
      CLIENT_ERROR = 'CLIENT_ERROR',
      SERVER_ERROR = 'SERVER_ERROR',
      NETWORK_ERROR = 'NETWORK_ERROR',
      TIMEOUT_ERROR = 'TIMEOUT_ERROR',
      SCHEMA_ERROR = 'SCHEMA_ERROR',
      CONFIGURATION_ERROR = 'CONFIGURATION_ERROR',
    }
    
    // Stable error codes
    enum ErrorCode {
      // Authentication
      DS_AUTHN_MISSING_KEY = 'DS_AUTHN_MISSING_KEY',
      DS_AUTHN_INVALID_KEY = 'DS_AUTHN_INVALID_KEY',
      DS_AUTHN_EXPIRED_KEY = 'DS_AUTHN_EXPIRED_KEY',
      
      // Authorization
      DS_AUTHZ_INSUFFICIENT_PERMS = 'DS_AUTHZ_INSUFFICIENT_PERMS',
      DS_AUTHZ_RESOURCE_FORBIDDEN = 'DS_AUTHZ_RESOURCE_FORBIDDEN',
      
      // ... more codes
    }
  • Migration strategy:

    1. Create new error system alongside existing
    2. Add compatibility adapters
    3. Migrate component by component
    4. Deprecate old system
    5. Remove after grace period
  • Error factory pattern:

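    class ErrorFactory {
      // Builds an authentication error. `code` is typed as ErrorCode (pass one
      // of the DS_AUTHN_* members) so the result satisfies NormalizedError.
      static authentication(
        code: ErrorCode,
        message: string,
        details?: Record<string, unknown>
      ): NormalizedError {
        return {
          code,
          category: ErrorCategory.AUTHENTICATION_ERROR,
          message,
          details,
          retryable: false,
          timestamp: new Date().toISOString(),
        };
      }
    }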
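
Step 2 of the migration strategy above (compatibility adapters) might look roughly like the following. The existing error classes are not shown in this issue, so the `LegacyError` shape, the `LEGACY_TYPE_MAP` entries, and the codes `DS_RATE_EXCEEDED` and `DS_SRV_UNKNOWN` are all hypothetical; only the categories and the "map automatically, log a deprecation warning" behavior come from the specification.

```typescript
// Hypothetical legacy shape; the real pre-migration errors may differ.
interface LegacyError {
  type: string;
  message: string;
}

interface NormalizedError {
  code: string;
  category: string;
  message: string;
  retryable: boolean;
  timestamp: string;
}

type Mapping = Pick<NormalizedError, "code" | "category" | "retryable">;

// Illustrative mapping from legacy type names to the new taxonomy.
const LEGACY_TYPE_MAP = new Map<string, Mapping>([
  ["AUTH", { code: "DS_AUTHN_INVALID_KEY", category: "AUTHENTICATION_ERROR", retryable: false }],
  ["RATE_LIMIT", { code: "DS_RATE_EXCEEDED", category: "RATE_LIMIT_ERROR", retryable: true }],
]);

// Assumption: unmapped legacy types fall back to a generic server error.
const FALLBACK: Mapping = {
  code: "DS_SRV_UNKNOWN",
  category: "SERVER_ERROR",
  retryable: false,
};

// Converts a legacy error and reports the deprecated usage through `warn`.
function fromLegacyError(
  legacy: LegacyError,
  warn: (message: string) => void = console.warn
): NormalizedError {
  warn(`Deprecated error format "${legacy.type}"; migrate to NormalizedError`);
  const mapped = LEGACY_TYPE_MAP.get(legacy.type) ?? FALLBACK;
  return {
    ...mapped,
    message: legacy.message,
    timestamp: new Date().toISOString(),
  };
}
```

Passing `warn` as a parameter keeps the adapter testable and lets the real logger be substituted for `console.warn`.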

Testing Requirements

  • Unit tests for error factory methods
  • Integration tests for error propagation
  • Tests for backward compatibility
  • Category classification tests
  • HTTP status code mapping tests
  • Error message template tests
  • Cause chain preservation tests
  • Monitoring attribute tests
  • Documentation generation tests
  • TypeScript type checking tests
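
As an illustration of what a cause chain preservation test could check, the sketch below walks a chain of nested `cause` fields and collects the messages from the outermost error down to the root cause. The `ChainedError` type, the `causeChain` helper, and the sample messages are assumptions for this example.

```typescript
// Minimal shape needed to walk a chain; NormalizedError would satisfy it
// if its `cause` were typed the same way.
interface ChainedError {
  message: string;
  cause?: ChainedError;
}

// Returns messages from the outermost error down to the root cause.
// The `seen` set guards against accidental cycles in the chain.
function causeChain(error: ChainedError): string[] {
  const messages: string[] = [];
  const seen = new Set<ChainedError>();
  let current: ChainedError | undefined = error;
  while (current !== undefined && !seen.has(current)) {
    seen.add(current);
    messages.push(current.message);
    current = current.cause;
  }
  return messages;
}

// Error A causes B causes C, as in the "Error context preservation" scenario.
const errorA: ChainedError = { message: "socket hang up" };
const errorB: ChainedError = { message: "GraphQL request failed", cause: errorA };
const errorC: ChainedError = { message: "Could not list projects", cause: errorB };
```

A test would then assert that `causeChain(errorC)` yields the three messages in order and that `errorC.cause?.cause` is `errorA`.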

Definition of Done

  • Unified error structure implemented
  • All error categories defined and documented
  • Authentication vs Authorization clearly separated
  • Stable error codes assigned and documented
  • Error factory methods created
  • All components using new error system
  • Backward compatibility layer working
  • Error catalog documentation generated
  • TypeScript types exported
  • All tests passing with >95% coverage
  • Migration guide published
  • Performance benchmarks show <1ms overhead
  • No sensitive data in error messages
  • Code reviewed and approved

Implementation Notes

  1. Phased Rollout:

    • Phase 1: Create new error system
    • Phase 2: Add compatibility layer
    • Phase 3: Migrate high-impact components
    • Phase 4: Complete migration
    • Phase 5: Deprecate old system
  2. Error Code Registry:

    const ERROR_REGISTRY = {
      DS_AUTHN_INVALID_KEY: {
        message: 'Invalid API key provided',
        category: ErrorCategory.AUTHENTICATION_ERROR,
        httpStatus: 401,
        retryable: false,
      },
      // ... all error codes
    };
  3. Monitoring Integration:

    function logError(error: NormalizedError) {
      logger.error({
        // Spread details first so they cannot overwrite the fields below
        ...error.details,
        code: error.code,
        category: error.category,
        message: error.message,
        fingerprint: generateFingerprint(error),
        environment: process.env.NODE_ENV,
        version: VERSION,
      });
    }
  4. Documentation Auto-generation:

    npm run generate:error-docs
    # Outputs: docs/errors/catalog.md
  5. Testing Strategy:

    • Test each error category
    • Verify HTTP status mappings
    • Check message consistency
    • Validate TypeScript types
    • Test monitoring attributes
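
The monitoring note calls `generateFingerprint` without defining it. One possible sketch, assuming a Node.js runtime and assuming the deduplication key should depend only on stable attributes (category and code), is shown below; which fields belong in the fingerprint and the 16-character truncation are choices made for illustration, not requirements from this issue.

```typescript
import { createHash } from "node:crypto";

// Hashes only stable attributes. Timestamps and request-specific details are
// deliberately excluded so repeated occurrences collapse to the same key.
function generateFingerprint(error: { code: string; category: string }): string {
  return createHash("sha256")
    .update(`${error.category}:${error.code}`)
    .digest("hex")
    .slice(0, 16); // shortened for readability in dashboards (assumption)
}

const first = generateFingerprint({
  code: "DS_AUTHN_INVALID_KEY",
  category: "AUTHENTICATION_ERROR",
});
const second = generateFingerprint({
  code: "DS_AUTHN_INVALID_KEY",
  category: "AUTHENTICATION_ERROR",
});
```

`first` and `second` are equal because the inputs are equal; if finer grouping is wanted, a stable detail such as the endpoint name could be added to the hashed string.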

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com

Labels

enhancement (New feature or request), refactoring (Code refactoring and cleanup)
