diff --git a/FinanceAgent/docker_compose/amd/gpu/rocm/README.md b/FinanceAgent/docker_compose/amd/gpu/rocm/README.md new file mode 100644 index 0000000000..277d3f02ac --- /dev/null +++ b/FinanceAgent/docker_compose/amd/gpu/rocm/README.md @@ -0,0 +1,188 @@ +# Example Finance Agent deployments on AMD GPU (ROCm) + +This document outlines the deployment process for a Finance Agent application utilizing OPEA components on an AMD GPU server. + +This example includes the following sections: + +- [Finance Agent Quick Start Deployment](#finance-agent-quick-start-deployment): Demonstrates how to quickly deploy a Finance Agent application/pipeline on an AMD GPU platform. +- [Finance Agent Docker Compose Files](#finance-agent-docker-compose-files): Describes some example deployments and their Docker Compose files. +- [How to interact with the agent system with UI](#how-to-interact-with-the-agent-system-with-ui): Guidelines for UI usage + +## Finance Agent Quick Start Deployment + +This section describes how to quickly deploy and test the Finance Agent service manually on an AMD GPU platform. The basic steps are: + +1. [Access the Code](#access-the-code) +2. [Generate a HuggingFace Access Token](#generate-a-huggingface-access-token) +3. [Deploy the Services Using Docker Compose](#deploy-the-services-using-docker-compose) +4. [Check the Deployment Status](#check-the-deployment-status) +5. [Test the Pipeline](#test-the-pipeline) +6. 
[Cleanup the Deployment](#cleanup-the-deployment) + +### Access the Code + +Clone the GenAIExamples repository and access the FinanceAgent AMD GPU platform Docker Compose files and supporting scripts: + +``` +mkdir /path/to/your/workspace/ +export WORKDIR=/path/to/your/workspace/ +cd $WORKDIR +git clone https://github.com/opea-project/GenAIExamples.git +``` + +Check out a released version, such as v1.4: + +``` +git checkout v1.4 +``` + +### Generate a HuggingFace Access Token + +Some HuggingFace resources, such as certain models, are only accessible if you have an access token. If you do not already have a HuggingFace access token, you can create one by first creating an account by following the steps provided at [HuggingFace](https://huggingface.co/) and then generating a [user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token). + +### Deploy the Services Using Docker Compose + +#### 3.1 Launch the vLLM endpoint + +Below is the command to launch a vLLM endpoint that serves the `meta-llama/Llama-3.3-70B-Instruct` model on the AMD ROCm platform. + +```bash +cd $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/amd/gpu/rocm +bash launch_vllm.sh +``` + +#### 3.2 Prepare the knowledge base + +The commands below will upload some example files into the knowledge base. You can also upload files through the UI. + +First, launch the Redis databases and the dataprep microservice. + +```bash +# inside $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/amd/gpu/rocm +bash launch_dataprep.sh +``` + +Validate data ingestion and retrieval from the database: + +```bash +python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option ingest +python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option get +``` + +#### 3.3 Launch the multi-agent system + +The command below will launch 3 agent microservices, 1 DocSum microservice, and 1 UI microservice. 
+ +```bash +# inside $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/amd/gpu/rocm +bash launch_agents.sh +``` + +#### 3.4 Check the Deployment Status + +After running Docker Compose, check that all the containers it launched have started: + +``` +docker ps -a +``` + +For the default deployment, the following 6 containers should have started: + +``` +CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES +7e61978c3d75 opea/dataprep:latest "sh -c 'python $( [ …" 31 seconds ago Up 19 seconds 0.0.0.0:6007->5000/tcp, [::]:6007->5000/tcp dataprep-redis-server-finance +0fee87aca791 redis/redis-stack:7.2.0-v9 "/entrypoint.sh" 3 hours ago Up 3 hours (healthy) 0.0.0.0:6380->6379/tcp, [::]:6380->6379/tcp, 0.0.0.0:8002->8001/tcp, [::]:8002->8001/tcp redis-kv-store +debd549045f8 redis/redis-stack:7.2.0-v9 "/entrypoint.sh" 3 hours ago Up 3 hours (healthy) 0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp redis-vector-db +9cff469364d3 ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 "/bin/sh -c 'apt-get…" 3 hours ago Up 3 hours (healthy) 0.0.0.0:10221->80/tcp, [::]:10221->80/tcp tei-embedding-serving +13f71e678dbd opea/vllm-rocm:latest "python3 /workspace/…" 3 hours ago Up 3 hours (healthy) 0.0.0.0:8086->8011/tcp, [::]:8086->8011/tcp vllm-service +e5a219a77c95 opea/llm-docsum:latest "bash entrypoint.sh" 3 hours ago Up 2 seconds 0.0.0.0:33218->9000/tcp, [::]:33218->9000/tcp docsum-llm-server +``` + +#### 3.5 Validate agents + +FinQA Agent: + +```bash +export agent_port="9095" +prompt="What is Gap's revenue in 2024?" 
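+# Aside (not part of the original test): test.py POSTs this prompt to the worker agent's OpenAI-compatible endpoint, which launch_agents.sh exposes as WORKER_FINQA_AGENT_URL (http://${ip_address}:9095/v1/chat/completions). +# The request-body field names below are an assumption, so this equivalent manual call is left commented out: +# curl -s "http://localhost:${agent_port}/v1/chat/completions" -H "Content-Type: application/json" -d "{\"messages\": \"${prompt}\", \"stream\": false}"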
+python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port +``` + +Research Agent: + +```bash +export agent_port="9096" +prompt="generate NVDA financial research report" +python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port --tool_choice "get_current_date" --tool_choice "get_share_performance" +``` + +Supervisor Agent, single turn: + +```bash +export agent_port="9090" +python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream +``` + +Supervisor Agent, multi-turn: + +```bash +python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --multi-turn --stream + +``` + +### Cleanup the Deployment + +To stop the containers associated with the deployment, execute the following commands: + +``` +docker compose -f compose.yaml down +docker compose -f compose_vllm.yaml down +docker compose -f dataprep_compose.yaml down +``` + +All the Finance Agent containers will be stopped and then removed on completion of the "down" command. + +## Finance Agent Docker Compose Files + +When deploying a Finance Agent pipeline on an AMD GPU platform, you can pick and choose from different large language model serving frameworks. The table below outlines the configurations that are available as part of the application. + +| File | Description | +| ------------------------------------------------ | ------------------------------------------------------------------------------------- | +| [compose.yaml](./compose.yaml) | Default compose file to run the agent services | +| [compose_vllm.yaml](./compose_vllm.yaml) | The LLM serving framework is vLLM. 
| +| [dataprep_compose.yaml](./dataprep_compose.yaml) | Compose file to run the data preparation services, such as the Redis databases and the TEI embedder | + +## How to interact with the agent system with UI + +The UI microservice is launched in the previous step together with the other microservices. +Open a web browser to `http://${ip_address}:5175` to access the UI. Note: `ip_address` here is the host IP of the UI microservice. + +1. Create an admin account (any credentials will do) + +2. Enter the endpoints in the `Connections` settings + + First, click on the user icon in the upper right corner to open `Settings`. Click on `Admin Settings`. Click on `Connections`. + + Then, enter the supervisor agent endpoint in the `OpenAI API` section: `http://${ip_address}:9090/v1`. Enter the API key as "empty". Add an arbitrary model id in `Model IDs`, for example, "opea_agent". The `ip_address` here should be the host IP of the agent microservice. + + Then, enter the dataprep endpoint in the `Icloud File API` section. First enable `Icloud File API` by clicking the toggle on the right so that it turns green, then enter the endpoint URL, for example, `http://${ip_address}:6007/v1`. The `ip_address` here should be the host IP of the dataprep microservice. + + You should see a screen like the screenshot below when the settings are done. + +![opea-agent-setting](../../../../assets/ui_connections_settings.png) + +3. Upload documents with the UI + + Click on the `Workplace` icon in the top left corner. Click `Knowledge`. Click on the "+" sign to the right of `Icloud Knowledge`. You can paste a URL on the left-hand side of the pop-up window, or upload a local file by clicking the cloud icon on the right-hand side of the pop-up window. Then click on the `Upload Confirm` button. Wait until processing finishes; the pop-up window will close on its own when the data ingestion is done. See the screenshot below. 
+ + Note: the data ingestion may take a few minutes depending on the length of the document. Please wait patiently and do not close the pop-up window. + +![upload-doc-ui](../../../../assets/upload_doc_ui.png) + +4. Test the agent with the UI + + After the settings are done and documents are ingested, you can start asking the agent questions. Click on the `New Chat` icon in the top left corner, and type your questions in the text box in the middle of the UI. + + The UI will stream the agent's response tokens. Expand the `Thinking` tab to see the agent's reasoning process. After the agent makes tool calls, you will also see the tool output once the tool returns it to the agent. Note: it may take a while to get the tool output back if the tool execution takes time. + +![opea-agent-test](../../../../assets/opea-agent-test.png) diff --git a/FinanceAgent/docker_compose/amd/gpu/rocm/compose.yaml b/FinanceAgent/docker_compose/amd/gpu/rocm/compose.yaml new file mode 100644 index 0000000000..45803bf2b1 --- /dev/null +++ b/FinanceAgent/docker_compose/amd/gpu/rocm/compose.yaml @@ -0,0 +1,132 @@ +# Copyright (C) 2025 Advanced Micro Devices, Inc. 
+# SPDX-License-Identifier: Apache-2.0 + +services: + worker-finqa-agent: + image: opea/agent:latest + container_name: finqa-agent-endpoint + volumes: + - ${TOOLSET_PATH}:/home/user/tools/ + - ${PROMPT_PATH}:/home/user/prompts/ + ports: + - "9095:9095" + ipc: host + environment: + ip_address: ${ip_address} + strategy: react_llama + with_memory: false + recursion_limit: ${recursion_limit_worker} + llm_engine: vllm + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + llm_endpoint_url: ${LLM_ENDPOINT_URL} + model: ${LLM_MODEL_ID} + temperature: ${TEMPERATURE} + max_new_tokens: ${MAX_TOKENS} + stream: false + tools: /home/user/tools/finqa_agent_tools.yaml + custom_prompt: /home/user/prompts/finqa_prompt.py + require_human_feedback: false + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + REDIS_URL_VECTOR: $REDIS_URL_VECTOR + REDIS_URL_KV: $REDIS_URL_KV + TEI_EMBEDDING_ENDPOINT: $TEI_EMBEDDING_ENDPOINT + port: 9095 + + worker-research-agent: + image: opea/agent:latest + container_name: research-agent-endpoint + volumes: + - ${TOOLSET_PATH}:/home/user/tools/ + - ${PROMPT_PATH}:/home/user/prompts/ + ports: + - "9096:9096" + ipc: host + environment: + ip_address: ${ip_address} + strategy: react_llama + with_memory: false + recursion_limit: 25 + llm_engine: vllm + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + llm_endpoint_url: ${LLM_ENDPOINT_URL} + model: ${LLM_MODEL_ID} + stream: false + tools: /home/user/tools/research_agent_tools.yaml + custom_prompt: /home/user/prompts/research_prompt.py + require_human_feedback: false + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + FINNHUB_API_KEY: ${FINNHUB_API_KEY} + FINANCIAL_DATASETS_API_KEY: ${FINANCIAL_DATASETS_API_KEY} + port: 9096 + + supervisor-react-agent: + image: opea/agent:latest + container_name: supervisor-agent-endpoint + depends_on: + - worker-finqa-agent + - worker-research-agent + volumes: + - ${TOOLSET_PATH}:/home/user/tools/ + - 
${PROMPT_PATH}:/home/user/prompts/ + ports: + - "9090:9090" + ipc: host + environment: + ip_address: ${ip_address} + strategy: react_llama + with_memory: true + recursion_limit: ${recursion_limit_supervisor} + llm_engine: vllm + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + llm_endpoint_url: ${LLM_ENDPOINT_URL} + model: ${LLM_MODEL_ID} + temperature: ${TEMPERATURE} + max_new_tokens: ${MAX_TOKENS} + stream: true + tools: /home/user/tools/supervisor_agent_tools.yaml + custom_prompt: /home/user/prompts/supervisor_prompt.py + require_human_feedback: false + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + WORKER_FINQA_AGENT_URL: $WORKER_FINQA_AGENT_URL + WORKER_RESEARCH_AGENT_URL: $WORKER_RESEARCH_AGENT_URL + DOCSUM_ENDPOINT: $DOCSUM_ENDPOINT + REDIS_URL_VECTOR: $REDIS_URL_VECTOR + REDIS_URL_KV: $REDIS_URL_KV + TEI_EMBEDDING_ENDPOINT: $TEI_EMBEDDING_ENDPOINT + port: 9090 + docsum-llm-textgen: + image: ${REGISTRY:-opea}/llm-docsum:${TAG:-latest} + container_name: docsum-llm-server + ports: + - "${DOCSUM_LLM_SERVER_PORT}:9000" + ipc: host + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + LLM_ENDPOINT: ${LLM_ENDPOINT} + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + MAX_INPUT_TOKENS: ${MAX_INPUT_TOKENS} + MAX_TOTAL_TOKENS: ${MAX_TOTAL_TOKENS} + LLM_MODEL_ID: ${LLM_MODEL_ID} + DocSum_COMPONENT_NAME: ${DOCSUM_COMPONENT_NAME:-OpeaDocSumvLLM} + LOGFLAG: ${LOGFLAG:-False} + restart: unless-stopped + + agent-ui: + image: opea/agent-ui:latest + container_name: agent-ui + environment: + host_ip: ${host_ip} + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + ports: + - "5175:8080" + ipc: host diff --git a/FinanceAgent/docker_compose/amd/gpu/rocm/compose_vllm.yaml b/FinanceAgent/docker_compose/amd/gpu/rocm/compose_vllm.yaml new file mode 100644 index 0000000000..8fe2226d0b --- /dev/null +++ b/FinanceAgent/docker_compose/amd/gpu/rocm/compose_vllm.yaml 
@@ -0,0 +1,39 @@ +# Copyright (C) 2025 Advanced Micro Devices, Inc. +# SPDX-License-Identifier: Apache-2.0 + +services: + vllm-service: + image: ${REGISTRY:-opea}/vllm-rocm:${TAG:-latest} + container_name: vllm-service + ports: + - "${FINANCEAGENT_VLLM_SERVICE_PORT:-8081}:8011" + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + HF_HUB_DISABLE_PROGRESS_BARS: 1 + HF_HUB_ENABLE_HF_TRANSFER: 0 + VLLM_USE_TRITON_FLASH_ATTENTION: 0 + PYTORCH_JIT: 0 + healthcheck: + test: [ "CMD-SHELL", "curl -f http://${HOST_IP}:${FINANCEAGENT_VLLM_SERVICE_PORT:-8081}/health || exit 1" ] + interval: 10s + timeout: 10s + retries: 100 + volumes: + - "${MODEL_CACHE:-./data}:/data" + shm_size: 20G + devices: + - /dev/kfd:/dev/kfd + - /dev/dri/:/dev/dri/ + cap_add: + - SYS_PTRACE + group_add: + - video + security_opt: + - seccomp:unconfined + - apparmor=unconfined + command: "--model ${LLM_MODEL_ID} --swap-space 16 --disable-log-requests --dtype float16 --tensor-parallel-size 4 --host 0.0.0.0 --port 8011 --num-scheduler-steps 1 --distributed-executor-backend \"mp\"" + ipc: host diff --git a/FinanceAgent/docker_compose/amd/gpu/rocm/dataprep_compose.yaml b/FinanceAgent/docker_compose/amd/gpu/rocm/dataprep_compose.yaml new file mode 100644 index 0000000000..b5eaf1f77b --- /dev/null +++ b/FinanceAgent/docker_compose/amd/gpu/rocm/dataprep_compose.yaml @@ -0,0 +1,82 @@ +# Copyright (C) 2025 Advanced Micro Devices, Inc. 
+# SPDX-License-Identifier: Apache-2.0 + +services: + tei-embedding-serving: + image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 + container_name: tei-embedding-serving + entrypoint: /bin/sh -c "apt-get update && apt-get install -y curl && text-embeddings-router --json-output --model-id ${EMBEDDING_MODEL_ID} --auto-truncate" + ports: + - "${TEI_EMBEDDER_PORT:-10221}:80" + volumes: + - "./data:/data" + shm_size: 1g + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + host_ip: ${HOST_IP} + HF_TOKEN: ${HF_TOKEN} + healthcheck: + test: ["CMD", "curl", "-f", "http://${host_ip}:${TEI_EMBEDDER_PORT}/health"] + interval: 10s + timeout: 6s + retries: 48 + + redis-vector-db: + image: redis/redis-stack:7.2.0-v9 + container_name: redis-vector-db + ports: + - "${REDIS_PORT1:-6379}:6379" + - "${REDIS_PORT2:-8001}:8001" + environment: + - no_proxy=${no_proxy} + - http_proxy=${http_proxy} + - https_proxy=${https_proxy} + healthcheck: + test: ["CMD", "redis-cli", "ping"] + timeout: 10s + retries: 3 + start_period: 10s + + redis-kv-store: + image: redis/redis-stack:7.2.0-v9 + container_name: redis-kv-store + ports: + - "${REDIS_PORT3:-6380}:6379" + - "${REDIS_PORT4:-8002}:8001" + environment: + - no_proxy=${no_proxy} + - http_proxy=${http_proxy} + - https_proxy=${https_proxy} + healthcheck: + test: ["CMD", "redis-cli", "ping"] + timeout: 10s + retries: 3 + start_period: 10s + + dataprep-redis-finance: + image: ${REGISTRY:-opea}/dataprep:${TAG:-latest} + container_name: dataprep-redis-server-finance + depends_on: + redis-vector-db: + condition: service_healthy + redis-kv-store: + condition: service_healthy + tei-embedding-serving: + condition: service_healthy + ports: + - "${DATAPREP_PORT:-6007}:5000" + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + DATAPREP_COMPONENT_NAME: ${DATAPREP_COMPONENT_NAME} + REDIS_URL_VECTOR: ${REDIS_URL_VECTOR} + REDIS_URL_KV: ${REDIS_URL_KV} + 
TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} + LLM_ENDPOINT: ${LLM_ENDPOINT} + LLM_MODEL: ${LLM_MODEL} + HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN} + HF_TOKEN: ${HF_TOKEN} + LOGFLAG: true diff --git a/FinanceAgent/docker_compose/amd/gpu/rocm/launch_agents.sh b/FinanceAgent/docker_compose/amd/gpu/rocm/launch_agents.sh new file mode 100644 index 0000000000..db3ec09b99 --- /dev/null +++ b/FinanceAgent/docker_compose/amd/gpu/rocm/launch_agents.sh @@ -0,0 +1,40 @@ +# Copyright (C) 2025 Advanced Micro Devices, Inc. +# SPDX-License-Identifier: Apache-2.0 + +export ip_address=$(hostname -I | awk '{print $1}') +export HOST_IP=${ip_address} +export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} +export HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN} +export TOOLSET_PATH=$WORKDIR/GenAIExamples/FinanceAgent/tools/ +echo "TOOLSET_PATH=${TOOLSET_PATH}" +export PROMPT_PATH=$WORKDIR/GenAIExamples/FinanceAgent/prompts/ +echo "PROMPT_PATH=${PROMPT_PATH}" +export recursion_limit_worker=12 +export recursion_limit_supervisor=10 + +export vllm_port=8086 +export FINANCEAGENT_VLLM_SERVICE_PORT=${vllm_port} +export LLM_MODEL_ID="meta-llama/Llama-3.3-70B-Instruct" +export LLM_ENDPOINT_URL="http://${ip_address}:${vllm_port}" +export LLM_ENDPOINT="http://${ip_address}:${vllm_port}" # referenced by the docsum-llm-textgen service in compose.yaml +export TEMPERATURE=0.5 +export MAX_TOKENS=4096 + +export WORKER_FINQA_AGENT_URL="http://${ip_address}:9095/v1/chat/completions" +export WORKER_RESEARCH_AGENT_URL="http://${ip_address}:9096/v1/chat/completions" + +export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" +export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:10221" +export REDIS_URL_VECTOR="redis://${ip_address}:6379" +export REDIS_URL_KV="redis://${ip_address}:6380" + +export MAX_INPUT_TOKENS=2048 +export MAX_TOTAL_TOKENS=4096 +export DOCSUM_COMPONENT_NAME="OpeaDocSumvLLM" +export DOCSUM_LLM_SERVER_PORT=9000 # host port mapping used by compose.yaml; matches DOCSUM_ENDPOINT below +export DOCSUM_ENDPOINT="http://${ip_address}:9000/v1/docsum" + +export FINNHUB_API_KEY=${FINNHUB_API_KEY} +export FINANCIAL_DATASETS_API_KEY=${FINANCIAL_DATASETS_API_KEY} + +docker compose -f compose.yaml up -d diff --git 
a/FinanceAgent/docker_compose/amd/gpu/rocm/launch_dataprep.sh b/FinanceAgent/docker_compose/amd/gpu/rocm/launch_dataprep.sh new file mode 100644 index 0000000000..31762da9d3 --- /dev/null +++ b/FinanceAgent/docker_compose/amd/gpu/rocm/launch_dataprep.sh @@ -0,0 +1,16 @@ +# Copyright (C) 2025 Advanced Micro Devices, Inc. +# SPDX-License-Identifier: Apache-2.0 + +export ip_address=$(hostname -I | awk '{print $1}') +export host_ip=${ip_address} +export DATAPREP_PORT="6007" +export TEI_EMBEDDER_PORT="10221" +export REDIS_URL_VECTOR="redis://${ip_address}:6379" +export REDIS_URL_KV="redis://${ip_address}:6380" +export LLM_MODEL=${model:-"meta-llama/Llama-3.3-70B-Instruct"} # default matches launch_vllm.sh if $model is unset +export LLM_ENDPOINT="http://${ip_address}:${vllm_port:-8086}" +export DATAPREP_COMPONENT_NAME="OPEA_DATAPREP_REDIS_FINANCE" +export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" +export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:${TEI_EMBEDDER_PORT}" + +docker compose -f dataprep_compose.yaml up -d diff --git a/FinanceAgent/docker_compose/amd/gpu/rocm/launch_vllm.sh b/FinanceAgent/docker_compose/amd/gpu/rocm/launch_vllm.sh new file mode 100644 index 0000000000..638660d7fb --- /dev/null +++ b/FinanceAgent/docker_compose/amd/gpu/rocm/launch_vllm.sh @@ -0,0 +1,10 @@ +# Copyright (C) 2025 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 + +export LLM_MODEL_ID="meta-llama/Llama-3.3-70B-Instruct" +#export LLM_MODEL_ID="meta-llama/Meta-Llama-3-8B-Instruct" +export MAX_LEN=16384 + +export HOST_IP=$(hostname -I | awk '{print $1}') # used by the vllm-service healthcheck in compose_vllm.yaml +export FINANCEAGENT_VLLM_SERVICE_PORT=8086 # matches the port expected by launch_agents.sh and launch_dataprep.sh +docker compose -f compose_vllm.yaml up -d diff --git a/FinanceAgent/docker_image_build/build.yaml b/FinanceAgent/docker_image_build/build.yaml index 7d113148a3..23d1af7b76 100644 --- a/FinanceAgent/docker_image_build/build.yaml +++ b/FinanceAgent/docker_image_build/build.yaml @@ -20,3 +20,8 @@ services: https_proxy: ${https_proxy} no_proxy: ${no_proxy} image: ${REGISTRY:-opea}/agent:${TAG:-latest} + vllm-rocm: + build: + context: GenAIComps + dockerfile: comps/third_parties/vllm/src/Dockerfile.amd_gpu + image: ${REGISTRY:-opea}/vllm-rocm:${TAG:-latest} diff --git a/FinanceAgent/tests/test_compose_vllm_on_rocm.sh 
b/FinanceAgent/tests/test_compose_vllm_on_rocm.sh new file mode 100644 index 0000000000..01131449a9 --- /dev/null +++ b/FinanceAgent/tests/test_compose_vllm_on_rocm.sh @@ -0,0 +1,244 @@ +#!/bin/bash +# Copyright (C) 2024 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 +set -xe + +export WORKPATH=$(dirname "$PWD") +export WORKDIR=$WORKPATH/../../ +echo "WORKDIR=${WORKDIR}" +export ip_address=$(hostname -I | awk '{print $1}') +LOG_PATH=$WORKPATH + +#### env vars for LLM endpoint ############# +model=meta-llama/Llama-3.3-70B-Instruct +export LLM_MODEL_ID=$model +export MAX_LEN=16384 +vllm_image=opea/vllm-rocm:latest +vllm_port=8086 +export FINANCEAGENT_VLLM_SERVICE_PORT=$vllm_port +HF_CACHE_DIR=${model_cache:-"./data"} +vllm_volume=${HF_CACHE_DIR} +####################################### + +#### env vars for dataprep ############# +export host_ip=${ip_address} +export DATAPREP_PORT="6007" +export TEI_EMBEDDER_PORT="10221" +export REDIS_URL_VECTOR="redis://${ip_address}:6379" +export REDIS_URL_KV="redis://${ip_address}:6380" +export LLM_MODEL=$model +export LLM_ENDPOINT="http://${ip_address}:${vllm_port}" +export DATAPREP_COMPONENT_NAME="OPEA_DATAPREP_REDIS_FINANCE" +export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" +export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:${TEI_EMBEDDER_PORT}" +####################################### + + + +function get_genai_comps() { + if [ ! -d "GenAIComps" ] ; then + git clone --depth 1 --branch ${opea_branch:-"main"} https://github.com/opea-project/GenAIComps.git + fi +} + +function build_dataprep_agent_and_vllm_images() { + cd $WORKDIR/GenAIExamples/FinanceAgent/docker_image_build/ + get_genai_comps + echo "Build agent image with --no-cache..." + docker compose -f build.yaml build --no-cache +} + +function build_agent_image_local(){ + cd $WORKDIR/GenAIComps/ + docker build -t opea/agent:latest -f comps/agent/src/Dockerfile . 
--build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy +} + +function start_vllm_service { + echo "start vllm service" + docker compose -f $WORKPATH/docker_compose/amd/gpu/rocm/compose_vllm.yaml up -d + sleep 1m + echo "Waiting vllm rocm ready" + n=0 + until [[ "$n" -ge 500 ]]; do + docker logs vllm-service >& "${LOG_PATH}"/vllm-service_start.log + if grep -q "Application startup complete" "${LOG_PATH}"/vllm-service_start.log; then + break + fi + sleep 10s + n=$((n+1)) + done + sleep 10s + echo "Service started successfully" +} + + +function stop_llm(){ + cid=$(docker ps -aq --filter "name=vllm-service") + echo "Stopping container $cid" + if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi + +} + +function start_dataprep(){ + docker compose -f $WORKPATH/docker_compose/amd/gpu/rocm/dataprep_compose.yaml up -d + sleep 1m +} + +function validate() { + local CONTENT="$1" + local EXPECTED_RESULT="$2" + local SERVICE_NAME="$3" + echo "EXPECTED_RESULT: $EXPECTED_RESULT" + echo "Content: $CONTENT" + if echo "$CONTENT" | grep -q "$EXPECTED_RESULT"; then + echo "[ $SERVICE_NAME ] Content is as expected: $CONTENT" + echo 0 + else + echo "[ $SERVICE_NAME ] Content does not match the expected result: $CONTENT" + echo 1 + fi +} + +function ingest_validate_dataprep() { + # test /v1/dataprep/ingest + echo "=========== Test ingest ===========" + local CONTENT=$(python3 $WORKPATH/tests/test_redis_finance.py --port $DATAPREP_PORT --test_option ingest) + local EXIT_CODE=$(validate "$CONTENT" "200" "dataprep-redis-finance") + echo "$EXIT_CODE" + local EXIT_CODE="${EXIT_CODE:0-1}" + if [ "$EXIT_CODE" == "1" ]; then + docker logs dataprep-redis-server-finance + exit 1 + fi + + # test /v1/dataprep/get + echo "=========== Test get ===========" + local CONTENT=$(python3 $WORKPATH/tests/test_redis_finance.py --port $DATAPREP_PORT --test_option get) + local EXIT_CODE=$(validate "$CONTENT" "Request successful" "dataprep-redis-finance") + echo "$EXIT_CODE" + local 
EXIT_CODE="${EXIT_CODE:0-1}" + if [ "$EXIT_CODE" == "1" ]; then + docker logs dataprep-redis-server-finance + exit 1 + fi +} + +function stop_dataprep() { + echo "Stopping databases" + cid=$(docker ps -aq --filter "name=dataprep-redis-server*" --filter "name=redis-*" --filter "name=tei-embedding-*") + if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi + +} + +function start_agents() { + echo "Starting Agent services" + cd $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/amd/gpu/rocm/ + bash launch_agents.sh + sleep 2m +} + + +function validate_agent_service() { + # # test worker finqa agent + echo "======================Testing worker finqa agent======================" + export agent_port="9095" + prompt="What is Gap's revenue in 2024?" + local CONTENT=$(python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port) + echo $CONTENT + local EXIT_CODE=$(validate "$CONTENT" "15" "finqa-agent-endpoint") + echo $EXIT_CODE + local EXIT_CODE="${EXIT_CODE:0-1}" + if [ "$EXIT_CODE" == "1" ]; then + docker logs finqa-agent-endpoint + exit 1 + fi + + # # test worker research agent + echo "======================Testing worker research agent======================" + export agent_port="9096" + prompt="Johnson & Johnson" + local CONTENT=$(python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port --tool_choice "get_current_date" --tool_choice "get_share_performance") + local EXIT_CODE=$(validate "$CONTENT" "Johnson" "research-agent-endpoint") + echo $CONTENT + echo $EXIT_CODE + local EXIT_CODE="${EXIT_CODE:0-1}" + if [ "$EXIT_CODE" == "1" ]; then + docker logs research-agent-endpoint + exit 1 + fi + + # test supervisor react agent + echo "======================Testing supervisor agent: single turn ======================" + export agent_port="9090" + local CONTENT=$(python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py 
--agent_role "supervisor" --ext_port $agent_port --stream) + echo $CONTENT + local EXIT_CODE=$(validate "$CONTENT" "test completed with success" "supervisor-agent-endpoint") + echo $EXIT_CODE + local EXIT_CODE="${EXIT_CODE:0-1}" + if [ "$EXIT_CODE" == "1" ]; then + docker logs supervisor-agent-endpoint + exit 1 + fi + + echo "======================Testing supervisor agent: multi-turn ======================" + local CONTENT=$(python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --multi-turn --stream) + echo $CONTENT + local EXIT_CODE=$(validate "$CONTENT" "test completed with success" "supervisor-agent-endpoint") + echo $EXIT_CODE + local EXIT_CODE="${EXIT_CODE:0-1}" + if [ "$EXIT_CODE" == "1" ]; then + docker logs supervisor-agent-endpoint + exit 1 + fi + +} + +function stop_agent_docker() { + cd $WORKPATH/docker_compose/amd/gpu/rocm/ + container_list=$(cat compose.yaml | grep container_name | cut -d':' -f2) + for container_name in $container_list; do + cid=$(docker ps -aq --filter "name=$container_name") + echo "Stopping container $container_name" + if [[ ! 
-z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi + done +} + + +echo "workpath: $WORKPATH" +echo "=================== Stop containers ====================" +stop_llm +stop_agent_docker +stop_dataprep + +cd $WORKPATH/tests + +echo "=================== #1 Building docker images====================" +build_dataprep_agent_and_vllm_images + +#### for local test +# build_agent_image_local +# echo "=================== #1 Building docker images completed====================" + +echo "=================== #2 Start vllm endpoint====================" +start_vllm_service +echo "=================== #2 vllm endpoint started====================" + +echo "=================== #3 Start dataprep and ingest data ====================" +start_dataprep +ingest_validate_dataprep +echo "=================== #3 Data ingestion and validation completed====================" + +echo "=================== #4 Start agents ====================" +start_agents +validate_agent_service +echo "=================== #4 Agent test passed ====================" + +echo "=================== #5 Stop microservices ====================" +stop_agent_docker +stop_dataprep +stop_llm +echo "=================== #5 Microservices stopped====================" + +echo y | docker system prune + +echo "ALL DONE!!"