[RoadMap][Call For Contributions] Mooncake Store V3 Roadmap #1035

### Milestone 1: Core Architecture Refactor & Decoupling
This milestone focuses on foundational architectural changes to improve modularity, flexibility, and prepare for future scaling.

- [ ] (TE/Store Separation): Decouple the TE (Transfer Engine) and Store components into separate, independent packages.
- [ ] (Client/Worker Decoupling): Decouple the dummy client from the worker to remove strong dependencies. 
- [ ] (Flexible Deployment): Update the Store to support various flexible deployment models, such as client-only, client + master, etc. 
- [ ] (Tensor-native APIs): Extend the Put/Get Tensor APIs to carry TP rank and model info.
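
To make the tensor-native API item concrete, here is a minimal sketch of how a Put/Get call could carry TP rank and model info alongside the payload. Every name in it (`TensorKey`, `InMemoryTensorStore`, `put_tensor`, `get_tensor`, the key layout) is hypothetical and chosen for illustration; it is not the Mooncake Store API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TensorKey:
    """Hypothetical identifier for one stored tensor shard (not a Mooncake type)."""

    model: str    # model name or identifier
    name: str     # tensor name within the model
    tp_rank: int  # tensor-parallel rank that owns this shard
    tp_size: int  # total number of tensor-parallel ranks

    def __post_init__(self) -> None:
        if not 0 <= self.tp_rank < self.tp_size:
            raise ValueError("tp_rank must be in [0, tp_size)")

    def object_key(self) -> str:
        """Flatten the fields into a single string key (assumed layout)."""
        return f"{self.model}/{self.name}/tp{self.tp_rank}of{self.tp_size}"


class InMemoryTensorStore:
    """Toy stand-in for a store client; keeps raw bytes in a dict."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put_tensor(self, key: TensorKey, payload: bytes) -> None:
        self._data[key.object_key()] = payload

    def get_tensor(self, key: TensorKey) -> Optional[bytes]:
        return self._data.get(key.object_key())
```

Carrying the rank and model in a typed key, rather than in a free-form string built by each caller, would let the store validate requests and keep shards of different ranks from colliding.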

### Milestone 2: Master Service Enhancements
This milestone enhances the Master component to support new storage architectures and routing logic.

- [ ] (Key-based Routing): Implement new key-based routing capabilities in the Master service. 
- [ ] (Metadata Adaptation - Storage): Adapt the Master's metadata management to support the new multi-level storage architecture.
- [ ] (Recovery): Persist KV metadata to support recovery.
- [ ] (KVCache Awareness Interface): Expose the hit ratio for different layers.
- [ ] (Metadata Adaptation - HA): Upgrade metadata schema and logic to meet new High Availability (HA) requirements. 
- [ ] (Multi-tenancy): Support multiple tenants with different models, users, and auth keys.
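
The roadmap does not say which scheme key-based routing will use. As one possible illustration, the sketch below uses rendezvous (highest-random-weight) hashing to map a key to a node. The function names and node labels are hypothetical, and the choice of technique is an assumption, not the planned design.

```python
import hashlib
from typing import Sequence


def _score(key: str, node: str) -> int:
    """Deterministic pseudo-random weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def route(key: str, nodes: Sequence[str]) -> str:
    """Return the node with the highest score for this key."""
    if not nodes:
        raise ValueError("no nodes available")
    return max(nodes, key=lambda node: _score(key, node))


# Usage: the same key always maps to the same node for a given node set.
owner = route("model-a/layer-0/block-42", ["worker-0", "worker-1", "worker-2"])
```

With this scheme, removing a node only remaps the keys that node owned, which is why it is often considered for routing tables that change as workers join or leave.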

### Milestone 3: Worker: Multi-Level Storage Architecture
This is a major epic to build the next-generation multi-level storage system within the Worker.

- 3.1: Abstraction & Caching
  - [ ] (Storage Abstraction Layer): Design and implement the core abstraction layer for multi-level storage.  
  - [ ] (Cache Scheduling Interface): Design the abstract interface for cache scheduling logic. 
  - [ ] (Eviction Logic): Implement basic data eviction logic within the new storage architecture. #1028 
  - [ ] (LRU Cache): Implement an LRU (Least Recently Used) policy as the default cache scheduling strategy. 
  - [ ] (Local Client Cache): Keep a local cache for better performance. #1062 

- 3.2: Storage Backend Implementation
  - [ ] (DRAM Adaptation): Adapt the storage layer for DRAM, including support for NUMA affinity. 
  - [ ] (SSD Adaptation): Adapt the storage layer for SSDs, enabling local external storage read/write capabilities. https://github.com/kvcache-ai/Mooncake/issues/1054
  - [ ] (VRAM Adaptation): Adapt the storage layer to utilize VRAM. 
  - [ ] (Huawei NPU Adaptation): Implement support for Huawei NPUs (H2D). 

- 3.3: Elastic KVCache Storage
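
As a rough illustration of the LRU and eviction items in 3.1, the sketch below shows a byte-bounded LRU tier that hands evicted entries to a callback, which could stand in for a write to a lower storage level. `LRUTier` and `on_evict` are hypothetical names; the actual abstraction layer and scheduling interface are still to be designed.

```python
from collections import OrderedDict
from typing import Callable, Optional


class LRUTier:
    """Byte-bounded LRU cache; evicted entries are passed to on_evict."""

    def __init__(
        self,
        capacity_bytes: int,
        on_evict: Optional[Callable[[str, bytes], None]] = None,
    ) -> None:
        self._capacity = capacity_bytes
        self._on_evict = on_evict
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self._used = 0

    def put(self, key: str, value: bytes) -> None:
        if len(value) > self._capacity:
            raise ValueError("value larger than tier capacity")
        if key in self._items:
            # Overwrite: release the old value's bytes first.
            self._used -= len(self._items.pop(key))
        self._items[key] = value  # newest entry goes to the end
        self._used += len(value)
        while self._used > self._capacity:
            # Evict from the front, which holds the least recently used entry.
            old_key, old_value = self._items.popitem(last=False)
            self._used -= len(old_value)
            if self._on_evict is not None:
                self._on_evict(old_key, old_value)

    def get(self, key: str) -> Optional[bytes]:
        value = self._items.get(key)
        if value is not None:
            self._items.move_to_end(key)  # mark as most recently used
        return value
```

For example, a DRAM tier could be constructed with `on_evict` pointing at an SSD tier's `put`, so that cold entries spill downward instead of being dropped.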

### Milestone 4: Worker: Networking & Elasticity
This milestone focuses on refactoring worker communication and enabling resource elasticity.

- [ ] (RPC Refactor): [Phase 1] Refactor the worker's read/write logic to replace RDMA with RPC-based communication. 
- [ ] (Barex Transport Support): Support Alibaba barex transport in TE for Mooncake Store. #1045 
- [ ] (Resource Elasticity): Implement single-worker resource elasticity. 
- [ ] (Event-driven completion): Provide an option to use an event-driven notification worker instead of busy-polling. #1033 #1053
- [ ] (IPv6 Support): Support IPv6 in client, master and metadata server. #1043 #1067
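
The difference between event-driven completion and busy-polling can be sketched generically as follows. This is a plain Python illustration using `threading.Event`, not the Transfer Engine implementation; `Completion` and `fake_transfer` are hypothetical names.

```python
import threading
from typing import Any, Optional


class Completion:
    """Lets a waiter block until a transfer signals that it is done."""

    def __init__(self) -> None:
        self._event = threading.Event()
        self._result: Any = None

    def complete(self, result: Any) -> None:
        self._result = result
        self._event.set()  # wake up any waiter

    def wait(self, timeout: Optional[float] = None) -> Any:
        # Blocks without spinning; a busy-polling version would instead
        # loop on a status flag and keep a CPU core occupied.
        if not self._event.wait(timeout):
            raise TimeoutError("transfer did not complete in time")
        return self._result


def fake_transfer(completion: Completion, payload: bytes) -> None:
    """Stands in for an asynchronous transfer that reports bytes moved."""
    completion.complete(len(payload))


completion = Completion()
worker = threading.Thread(target=fake_transfer, args=(completion, b"abcd"))
worker.start()
bytes_done = completion.wait(timeout=5.0)
worker.join()
```

The trade-off is the usual one: blocking on a notification frees the CPU while waiting, whereas busy-polling can offer lower wake-up latency at the cost of a dedicated core.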

### Milestone 5: Deployment & Operations
This milestone covers K8s integration (e.g., RBG, https://github.com/sgl-project/rbg) and build process improvements.

- [ ] (K8s Autoscaling): Implement support for Kubernetes-based autoscaling of worker and dummy client instances.
- [ ] (Scenario-based Builds): Implement a build system capable of producing different worker binaries optimized for different scenarios. 
- [ ] (Integration With AI Configurator): Use AI Configurator to better size worker resources and tune other configurations.
- [ ] (Deployment Documentation & Guides): Create comprehensive, up-to-date deployment documentation and step-by-step setup guides to simplify installation and configuration for all environments.

### Milestone 6: CI & CD Enhancement
- [ ] (End-to-end CI tests): For SGLang, support HiCache, PD, Elastic EP, and checkpoint engine tests.

### Milestone 7: Performance & Benchmarks
- [ ] (Store Master Benchmark): Design and integrate a dedicated benchmark for the Mooncake store master module to evaluate throughput, latency, and scalability.
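
As a starting point for discussion, a benchmark of this kind could report throughput and latency percentiles for a single operation, as in the sketch below. `run_benchmark` and the dict-backed stand-in for the master are hypothetical; a real benchmark would issue requests to the master service and vary concurrency to study scalability.

```python
import statistics
import time
from typing import Callable, Dict


def run_benchmark(op: Callable[[int], None], num_ops: int) -> Dict[str, float]:
    """Run op(i) num_ops times and report throughput and latency."""
    if num_ops <= 0:
        raise ValueError("num_ops must be positive")
    latencies = []
    start = time.perf_counter()
    for i in range(num_ops):
        t0 = time.perf_counter()
        op(i)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    p99_index = min(num_ops - 1, int(0.99 * num_ops))
    return {
        "ops_per_sec": num_ops / elapsed,
        "p50_s": statistics.median(latencies),
        "p99_s": latencies[p99_index],
    }


# Usage with a dict standing in for the master's metadata table.
table: Dict[str, int] = {}
report = run_benchmark(lambda i: table.__setitem__(f"key-{i}", i), 1000)
```

Reporting tail latency (p99) alongside the median matters here because metadata operations sit on the critical path of every Put and Get.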

-----
Thanks for being a part of the Mooncake community! You are welcome to **discuss** and contribute!

-----
**If you have any ideas, just leave a comment below and help shape the Roadmap.**
