
Commit 83668ea (parent 1315a6b)

For the vLLM health check, use the Docker service name instead of host_ip

Signed-off-by: Tsai, Louie <louie.tsai@intel.com>
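Background for the change (a hedged sketch, not part of the commit): containers attached to the same Compose network resolve each other by service name through Docker's embedded DNS, so a health probe can target vllm-service:80 directly rather than an externally resolved $host_ip. A minimal illustration with a hypothetical client service and image tags:

```yaml
# Sketch only: hypothetical images/names showing service-name resolution
# on the default Compose network.
services:
  vllm-service:               # reachable as http://vllm-service:80 from peer containers
    image: opea/vllm:latest   # hypothetical tag
    ports:
      - "9009:80"             # host port 9009 maps to container port 80
  client:
    image: curlimages/curl
    # Inside the network, use the service name and the *container* port:
    command: ["curl", "-f", "http://vllm-service:80/health"]
```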

File tree

3 files changed: +19 additions, -2 deletions

ChatQnA/docker_compose/intel/cpu/xeon/compose.perf.yaml

Lines changed: 17 additions & 0 deletions

@@ -8,3 +8,20 @@ services:
       VLLM_CPU_SGL_KERNEL: 1
     entrypoint: ["python3", "-m", "vllm.entrypoints.openai.api_server"]
     command: --model $LLM_MODEL_ID --host 0.0.0.0 --port 80 --dtype bfloat16 --distributed-executor-backend mp --block-size 128 --enforce-eager --tensor-parallel-size $TP_NUM --pipeline-parallel-size $PP_NUM --max-num-batched-tokens $MAX_BATCHED_TOKENS --max-num-seqs $MAX_SEQS
+  vllm-ci-test:
+    image: public.ecr.aws/q9t5s3a7/vllm-ci-test-repo:a5dd03c1ebc5e4f56f3c9d3dc0436e9c582c978f-cpu
+    container_name: vllm-ci-test
+    volumes:
+      - "${MODEL_CACHE:-./data}:/root/.cache/huggingface/hub"
+    shm_size: 128g
+    environment:
+      no_proxy: ${no_proxy}
+      http_proxy: ${http_proxy}
+      https_proxy: ${https_proxy}
+      HF_TOKEN: ${HF_TOKEN}
+      LLM_MODEL_ID: ${LLM_MODEL_ID}
+      VLLM_CPU_KVCACHE_SPACE: 40
+      ON_CPU: 1
+      REMOTE_HOST: vllm-service
+      REMOTE_PORT: 80
+    entrypoint: tail -f /dev/null

ChatQnA/docker_compose/intel/cpu/xeon/compose.yaml

Lines changed: 1 addition & 1 deletion

@@ -106,7 +106,7 @@ services:
       HF_HUB_OFFLINE: ${HF_HUB_OFFLINE:-0}
       VLLM_CPU_KVCACHE_SPACE: 40
     healthcheck:
-      test: ["CMD-SHELL", "curl -f http://$host_ip:9009/health || exit 1"]
+      test: ["CMD-SHELL", "curl -f http://vllm-service:80/health || exit 1"]
       interval: 10s
       timeout: 10s
       retries: 100
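The healthcheck fields above amount to a simple retry loop: probe the URL, treat a non-2xx response or connection error as failure, wait `interval`, and give up after `retries` attempts. A rough illustration in plain Python against a local stub server (hypothetical stand-in, not the actual vLLM endpoint):

```python
import threading
import time
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stub standing in for vLLM's /health endpoint, so the
# probe below has something local to hit.
class _Health(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200 if self.path == "/health" else 404)
        self.end_headers()

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), _Health)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/health"

def probe(url, interval=0.1, timeout=10, retries=100):
    """Rough analogue of the Compose healthcheck: `curl -f <url>` with retries."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:  # curl -f fails on HTTP errors
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not up yet; wait and retry
        time.sleep(interval)
    return False

healthy = probe(url)
print(healthy)  # prints True once the stub answers /health
server.shutdown()
```

Inside the Compose network the real probe resolves `vllm-service` by name, which is exactly what this commit switches the `curl` target to.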

ChatQnA/docker_compose/intel/cpu/xeon/set_env.sh

Lines changed: 1 addition & 1 deletion

@@ -13,7 +13,7 @@ export HF_TOKEN=${HF_TOKEN}
 export host_ip=${ip_address}
 export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
 export RERANK_MODEL_ID="BAAI/bge-reranker-base"
-export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct"
+export LLM_MODEL_ID="meta-llama/Meta-Llama-3.1-8B-Instruct"
 export INDEX_NAME="rag-redis"
 # Set it as a non-null string, such as true, if you want to enable logging facility,
 # otherwise, keep it as "" to disable it.

0 commit comments
