Hebrew Locale Corrupts Subsequent Locale Formatting #1234

@kodzi

Summary

Hebrew locale ('he') corrupts Babel's internal cache, causing subsequent format_date() calls with other locales to return Hebrew-formatted text instead of the requested locale.

Environment

  • Babel version: 2.16.0
  • Python version: 3.12.7
  • Platform: macOS 26.0.1 (arm64)
  • Architecture: 64bit

Description

After babel.dates.format_date() is called with the Hebrew locale ('he'), later calls with other locales return Hebrew text instead of text in the requested locale. This was observed with 'no' and 'fr', among others; 'de', which had already been used before Hebrew, was not affected. The corruption persists for the rest of the Python session, until the process is restarted.

Steps to Reproduce

Minimal Reproduction Case

from datetime import date

from babel.dates import format_date

# Any date in October reproduces the month names shown below
# (datetime.utcnow() is deprecated as of Python 3.12)
date_obj = date(2025, 10, 15)

# Test German locale (works correctly)
print("German before Hebrew:", format_date(date_obj, 'LLLL', 'de'))
# Output: "Oktober"

# Use Hebrew locale (works correctly)  
print("Hebrew:", format_date(date_obj, 'LLLL', 'he'))
# Output: "אוקטובר"

# Test Norwegian locale (CORRUPTED - returns Hebrew!)
print("Norwegian after Hebrew:", format_date(date_obj, 'LLLL', 'no'))
# Expected: "oktober"
# Actual: "אוקטובר" ❌

# Test French locale (CORRUPTED - returns Hebrew!)
print("French after Hebrew:", format_date(date_obj, 'LLLL', 'fr'))
# Expected: "octobre" 
# Actual: "אוקטובר" ❌

# Test German locale (still works correctly)
print("German after Hebrew:", format_date(date_obj, 'LLLL', 'de'))
# Output: "Oktober" ✅

Expected vs Actual Behavior

  • Expected: Each locale returns text in its own language
  • Actual: After Hebrew is used, some locales ('no' and 'fr' above) return Hebrew text, while 'de' still returns German
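
One way to check for this symptom programmatically is to compare each locale's output with the Hebrew output. The sketch below is not part of Babel: find_leaking_locales and fake_format_month are hypothetical names, and the stand-in formatter only replays the outputs reported above so that the block runs without Babel installed.

```python
from typing import Callable, Dict, Iterable


def find_leaking_locales(
    format_month: Callable[[str], str],
    suspect: str,
    others: Iterable[str],
) -> Dict[str, str]:
    """Return {locale: output} for each locale whose output equals the suspect's.

    format_month is any callable that maps a locale code to formatted text.
    """
    suspect_output = format_month(suspect)
    leaks = {}
    for locale in others:
        output = format_month(locale)
        if output == suspect_output:
            leaks[locale] = output
    return leaks


# Hypothetical stand-in that replays the outputs from this report.
# It does not call Babel.
def fake_format_month(locale: str) -> str:
    reported = {
        "de": "Oktober",
        "he": "אוקטובר",
        "no": "אוקטובר",
        "fr": "אוקטובר",
    }
    return reported[locale]


leaks = find_leaking_locales(fake_format_month, "he", ["de", "no", "fr"])
print(sorted(leaks))  # ['fr', 'no']
```

With Babel installed, the stand-in could presumably be replaced by lambda loc: format_date(date_obj, 'LLLL', loc). Equality with the Hebrew string is only a heuristic, but no Latin-script locale should legitimately produce it.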

Analysis

Cache Investigation

The issue appears to be related to Babel's internal locale data cache (babel.localedata._cache). The cache keys look normal at every step, including after the corrupted Norwegian call:

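from datetime import date

import babel.localedata
from babel.dates import format_date

date_obj = date(2025, 10, 15)  # any date in October matches the outputs in this report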

# Before any locale usage
print("Initial cache:", list(babel.localedata._cache.keys()))
# Output: []

# After using German
format_date(date_obj, 'LLLL', 'de')
print("After German:", list(babel.localedata._cache.keys()))
# Output: ['root', 'de']

# After using Hebrew  
format_date(date_obj, 'LLLL', 'he')
print("After Hebrew:", list(babel.localedata._cache.keys()))
# Output: ['root', 'de', 'he']

# After using Norwegian (corrupted)
format_date(date_obj, 'LLLL', 'no')  # Returns Hebrew text!
print("After Norwegian:", list(babel.localedata._cache.keys()))
# Output: ['root', 'de', 'he', 'no']

Workaround

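Clearing the cache after using Hebrew restores correct output for other locales. Note that _cache is a private attribute, and clearing it forces locale data to be reloaded on the next call: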

import babel.localedata
from babel.dates import format_date

# date_obj is the October date used in the reproduction case

# Use Hebrew locale
format_date(date_obj, 'LLLL', 'he')

# Clear cache to prevent corruption
babel.localedata._cache.clear()

# Now other locales work correctly
print(format_date(date_obj, 'LLLL', 'no'))  # Returns "oktober" ✅
print(format_date(date_obj, 'LLLL', 'fr'))  # Returns "octobre" ✅
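
To avoid forgetting the clear() call, the workaround could be wrapped in a small context manager. This is only a sketch: cache_cleared_afterwards is a hypothetical helper, it works on any mutable mapping, and the demo uses a plain dict so that it runs without Babel. It does not address thread safety.

```python
from contextlib import contextmanager
from typing import Iterator, MutableMapping


@contextmanager
def cache_cleared_afterwards(cache: MutableMapping) -> Iterator[None]:
    """Run the with-body, then empty the cache even if the body raises."""
    try:
        yield
    finally:
        cache.clear()


# Demo with a plain dict standing in for the locale data cache.
demo_cache = {"root": {}, "he": {}}
with cache_cleared_afterwards(demo_cache):
    demo_cache["he"]["touched"] = True
print(demo_cache)  # {}
```

With Babel, the Hebrew call would presumably go inside a with cache_cleared_afterwards(babel.localedata._cache): block. Because _cache is private, this may break in a future Babel release.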
