[Feature] Support ep pd with external module #3128
Thanks for your contribution!
Also, CI monitoring is missing here. Please add a test similar to https://github.com/PaddlePaddle/FastDeploy/tree/develop/test/ci_use/EB_Lite that covers usage of the new interfaces.
Pull Request Overview
This PR adds support for EP (expert parallel) PD (prefill-decode) disaggregated deployment driven by an external module. It introduces infrastructure that lets external modules dispatch tasks directly to P (prefill) and D (decode) instances over TCP-based ZMQ communication, and adds a new "dp" scheduler alongside the existing splitwise scheduler.
- Adds ZmqTcpServer/ZmqIpcServer for TCP/IPC communication modes controlled by FD_ENABLE_INTERNAL_ADAPTER
- Introduces DPScheduler and DPLocalScheduler for external module task dispatch
- Implements InternalAdapter for external module control commands (get_payload, get_metrics, connect_rdma)
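
As a rough illustration of the TCP/IPC switch described above, the following sketch picks a ZMQ endpoint string from the `FD_ENABLE_INTERNAL_ADAPTER` environment variable. The function name `zmq_endpoint`, the convention that `"1"` means enabled, and the `/dev/shm` socket path are assumptions made for this example; the PR's actual `ZmqTcpServer`/`ZmqIpcServer` classes may differ.

```python
import os


def zmq_endpoint(name: str, port: int, env=None) -> str:
    """Return a ZMQ endpoint string for the selected transport.

    Hypothetical helper, not part of FastDeploy.
    """
    env = os.environ if env is None else env
    # Assumption: the value "1" enables the internal adapter (TCP mode).
    use_tcp = env.get("FD_ENABLE_INTERNAL_ADAPTER", "0") == "1"
    if use_tcp:
        # TCP is reachable from other hosts, so an external module can connect.
        return f"tcp://0.0.0.0:{port}"
    # IPC is limited to processes on the same machine.
    # Assumption: the socket file location is illustrative only.
    return f"ipc:///dev/shm/{name}.ipc"
```

With the variable unset, the helper falls back to the IPC endpoint, which mirrors the idea that TCP is only needed when a module outside the host has to reach the engine.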
Reviewed Changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 6 comments.
File | Description |
---|---|
fastdeploy/splitwise/internal_adapter_utils.py | New InternalAdapter class for handling external module control commands |
fastdeploy/scheduler/dp_scheduler.py | New DPScheduler and DPLocalScheduler for distributed processing |
fastdeploy/scheduler/config.py | Added DPLocalSchedulerConfig and dp scheduler support |
fastdeploy/inter_communicator/zmq_server.py | New ZMQ server implementations for TCP and IPC communication |
fastdeploy/inter_communicator/zmq_client.py | Refactored ZMQ client with base class and IPC implementation |
fastdeploy/inter_communicator/engine_worker_queue.py | Added RDMA connection task queues and management |
fastdeploy/inter_communicator/__init__.py | Updated imports for new ZMQ classes |
fastdeploy/envs.py | Added environment variables for internal adapter configuration |
fastdeploy/entrypoints/engine_client.py | Updated to use ZmqIpcClient instead of ZmqClient |
fastdeploy/engine/expert_service.py | Added support for dp scheduler and internal adapter |
fastdeploy/engine/engine.py | Enhanced with TCP/IPC server selection and dp scheduler support |
fastdeploy/engine/args_utils.py | Added splitwise_role to scheduler config fields |
fastdeploy/cache_manager/cache_transfer_manager.py | Added data_parallel_size parameter |
fastdeploy/cache_manager/cache_messager.py | Enhanced with RDMA connection handling and data parallel support |
…om/PaddlePaddle/FastDeploy into support_ep_pd_with_external_module
…om/rainyfly/FastDeploy into support_ep_pd_with_external_module
…into support_ep_pd_with_external_module
Description
To run PD disaggregated deployment in FD today, we must use the splitwise scheduler to distribute tasks and Redis to synchronize instance meta info, user requests, and generated results. Task dispatch happens inside LLMEngine: the splitwise scheduler dispatches each task after the request is received.
We also want to support task dispatch by an external module. The external module schedules tasks for the P and D instances, sends each request directly to the scheduled LLMEngine, and receives the response.
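
To make the external-dispatch flow concrete, here is a minimal sketch of a module that pairs each request with one P instance and one D instance. Everything in it is hypothetical: the `Instance` and `ExternalDispatcher` names, the round-robin policy, and the message fields are assumptions for illustration and are not taken from this PR.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Instance:
    role: str      # "prefill" or "decode"
    endpoint: str  # e.g. "tcp://10.0.0.1:5555" (illustrative address)


class ExternalDispatcher:
    """Hypothetical external module that assigns requests to P and D instances."""

    def __init__(self, instances):
        prefill = [i for i in instances if i.role == "prefill"]
        decode = [i for i in instances if i.role == "decode"]
        if not prefill or not decode:
            raise ValueError("need at least one prefill and one decode instance")
        # Assumption: plain round-robin; a real dispatcher could use load metrics.
        self._prefill = cycle(prefill)
        self._decode = cycle(decode)

    def dispatch(self, request_id: str, prompt: str) -> dict:
        """Pick one P and one D instance and build the task to send."""
        p = next(self._prefill)
        d = next(self._decode)
        return {
            "request_id": request_id,
            "prompt": prompt,
            "prefill": p.endpoint,
            "decode": d.endpoint,
        }
```

In this sketch the dispatcher only decides placement; actually sending the task to the chosen LLMEngine and reading back the response would go over the TCP channel and is left out.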