Configerator based PlanLoader implementation (pytorch#3356)
Summary:
Pull Request resolved: pytorch#3356
Add ConfigeratorPlanLoader, an implementation of the PlanLoader interface that loads stored sharding plans from Configerator.
**Key Features:**
1. Plan Retrieval: Loads compressed sharding plans from Configerator using plan_id
2. Database Integration: Queries PlannerStatsDB to get storage location and context hash
3. Decompression: Uses zstd to decompress stored plan data
4. Thrift Conversion: Deserializes Thrift structures and converts back to Python ShardingOption objects
5. Error Handling: Covers failure scenarios with configurable fallback behavior
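The retrieval steps above (lookup, fetch, decompress, deserialize, convert) can be sketched roughly as follows. This is only an illustration under loud assumptions, not the code in this change: `zlib` stands in for zstd, JSON stands in for the Thrift structures, two in-memory dicts stand in for PlannerStatsDB and Configerator, and every name (`StoredPlanRecord`, `ShardingOptionSketch`, `store_plan`, `load_plan`) is hypothetical.

```python
import json
import zlib
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple


@dataclass
class StoredPlanRecord:
    """Hypothetical stats-DB row: storage location and context hash of a plan."""
    config_path: str
    context_hash: str


@dataclass
class ShardingOptionSketch:
    """Heavily simplified stand-in for a ShardingOption object."""
    table_name: str
    sharding_type: str


# In-memory stand-ins (assumptions) for PlannerStatsDB and Configerator.
STATS_DB: Dict[str, StoredPlanRecord] = {}
CONFIG_STORE: Dict[str, bytes] = {}


def store_plan(
    plan_id: str,
    config_path: str,
    context_hash: str,
    options: List[ShardingOptionSketch],
) -> None:
    """Serialize and compress a plan so that load_plan has something to read."""
    payload = json.dumps([asdict(option) for option in options]).encode("utf-8")
    CONFIG_STORE[config_path] = zlib.compress(payload)  # zlib stands in for zstd
    STATS_DB[plan_id] = StoredPlanRecord(config_path, context_hash)


def load_plan(plan_id: str) -> Tuple[str, List[ShardingOptionSketch]]:
    """Mirror the described flow: lookup, fetch, decompress, deserialize, convert."""
    record = STATS_DB[plan_id]                    # storage location and context hash
    compressed = CONFIG_STORE[record.config_path]  # fetch the compressed plan
    raw = zlib.decompress(compressed)              # decompression step
    decoded = json.loads(raw.decode("utf-8"))      # JSON stands in for Thrift
    options = [ShardingOptionSketch(**entry) for entry in decoded]
    return record.context_hash, options
```

In this sketch a missing plan_id or config path surfaces as a plain `KeyError`; the change described here wraps such failures according to `enable_fallback`.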
**Error Handling & Fallback Scenarios:**
The implementation supports two distinct error handling modes controlled by `enable_fallback`:
**Normal Mode (enable_fallback=False - Default):**
- Raises `PlannerError` with `PLAN_LOADING_FAILED` type for any failure
- Error scenarios include:
  - Network connectivity issues (Configerator service unavailable)
  - Invalid plan id or config path
  - Data decompression failures
  - Thrift deserialization errors
  - Thrift-to-Python conversion failures
**Fallback Mode (enable_fallback=True):**
- Returns `None` instead of raising exceptions on loading failures
- Logs detailed warning messages with plan_id, config_path, and error details
- Enables graceful degradation where the system can fall back to alternative planning strategies
- Suitable for development, experimentation, or scenarios prioritizing availability over strict error handling
- Warning logs include full context for debugging: plan ID, Configerator path, and original error
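The two modes can be illustrated with a minimal sketch. It assumes a hypothetical `load_plan_or_fallback` helper and a caller-supplied `fetch` callable; the `PlannerError` and `PlannerErrorType` classes below are local stand-ins defined only so the example runs on its own, and the real loader's signature may differ.

```python
import logging
from enum import Enum
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


class PlannerErrorType(Enum):
    """Local stand-in; only the member named in this change is modeled."""
    PLAN_LOADING_FAILED = "plan_loading_failed"


class PlannerError(Exception):
    """Local stand-in for the planner's error class."""

    def __init__(self, message: str, error_type: PlannerErrorType) -> None:
        super().__init__(message)
        self.error_type = error_type


def load_plan_or_fallback(
    plan_id: str,
    config_path: str,
    fetch: Callable[[str], Any],  # hypothetical: performs the actual load
    enable_fallback: bool = False,
) -> Optional[Any]:
    """Return the loaded plan; on failure either raise or log and return None."""
    try:
        return fetch(config_path)
    except Exception as err:
        if enable_fallback:
            # Fallback mode: warn with full context and let the caller re-plan.
            logger.warning(
                "Failed to load plan_id=%s from config_path=%s: %s",
                plan_id,
                config_path,
                err,
            )
            return None
        # Normal mode: surface every failure as PLAN_LOADING_FAILED.
        raise PlannerError(
            f"Failed to load plan {plan_id} from {config_path}: {err}",
            PlannerErrorType.PLAN_LOADING_FAILED,
        ) from err
```

A caller running in fallback mode would check for `None` and switch to an alternative planning strategy, while a caller in normal mode would let the `PlannerError` propagate.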
Reviewed By: mserturk
Differential Revision: D81573577
fbshipit-source-id: 93e84c86fb0b9bccd443a93e2b5785e1bc06a349