Commit 40e3682

AgentQnA: update instructions and env variable names for remote endpoints (opea-project#2113)
Signed-off-by: alexsin368 <alex.sin@intel.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent c07da93 commit 40e3682

File tree: 2 files changed (+19, -13 lines)

AgentQnA/docker_compose/intel/cpu/xeon/README.md

Lines changed: 16 additions & 10 deletions
@@ -52,11 +52,11 @@ export no_proxy=localhost,127.0.0.1,$host_ip # additional no proxies if needed
 export NGINX_PORT=${your_nginx_port} # your usable port for nginx, 80 for example
 ```

-#### [Optional] OPENAI_API_KEY to use OpenAI models or Intel® AI for Enterprise Inference
+#### [Optional] OPENAI_API_KEY to use OpenAI models or LLM models with remote endpoints

 To use OpenAI models, generate a key following these [instructions](https://platform.openai.com/api-keys).

-To use a remote server running Intel® AI for Enterprise Inference, contact the cloud service provider or owner of the on-prem machine for a key to access the desired model on the server.
+When models are deployed on a remote server, a base URL and an API key are required to access them. To set up a remote server and acquire the base URL and API key, refer to [Intel® AI for Enterprise Inference](https://www.intel.com/content/www/us/en/developer/topic-technology/artificial-intelligence/enterprise-inference.html) offerings.

 Then set the environment variable `OPENAI_API_KEY` with the key contents:

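For reference, the key is consumed as a plain shell export; a minimal sketch of the step this hunk leads into (the placeholder value is illustrative, not part of the commit):

```bash
# Export the key so docker compose can inject it into the agent services.
# Replace the placeholder with the key from OpenAI or the remote inference provider.
export OPENAI_API_KEY=<your-api-key>
```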
@@ -74,7 +74,7 @@ source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon/set_env.sh

 We make it convenient to launch the whole system with docker compose, which includes microservices for LLM, agents, UI, retrieval tool, vector database, dataprep, and telemetry. There are 3 docker compose files, which make it easy for users to pick and choose. Users can choose a different retrieval tool other than the `DocIndexRetriever` example provided in our GenAIExamples repo. Users can choose not to launch the telemetry containers.

-On Xeon, OpenAI models and models deployed on a remote server are supported. Both methods require an API key.
+On Xeon, OpenAI models and models deployed on a remote server are supported. Both methods require an API key, so `OPENAI_API_KEY` needs to be set in the [previous step](#optional-openai_api_key-to-use-openai-models-or-llm-models-with-remote-endpoints).

 ```bash
 cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
@@ -88,19 +88,25 @@ The command below will launch the multi-agent system with the `DocIndexRetriever`
 docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml up -d
 ```

-#### Models on Remote Server
+#### Models on Remote Servers

 When models are deployed on a remote server with Intel® AI for Enterprise Inference, a base URL and an API key are required to access them. To run the Agent microservice on Xeon while using models deployed on a remote server, add `compose_remote.yaml` to the `docker compose` command and set additional environment variables.

-#### Notes
+> **Note**: For AgentQnA, the minimum hardware requirement for the remote server is Intel® Gaudi® AI Accelerators.

-- `OPENAI_API_KEY` is already set in a previous step.
-- `model` is used to overwrite the value set for this environment variable in `set_env.sh`.
-- `LLM_ENDPOINT_URL` is the base URL given from the owner of the on-prem machine or cloud service provider. It will follow this format: "https://<DNS>". Here is an example: "https://api.inference.example.com".
+Set the following environment variables:
+
+- `REMOTE_ENDPOINT` is the HTTPS endpoint of the remote server hosting the model of choice (e.g., https://api.example.com). **Note:** If the API for the models does not use LiteLLM, the second part of the model card (after the `/`) needs to be appended to the URL. For example, set `REMOTE_ENDPOINT` to https://api.example.com/Llama-3.3-70B-Instruct if the model card is `meta-llama/Llama-3.3-70B-Instruct`.
+- `model` is the model card, which may need to be overwritten depending on what it is set to in `set_env.sh`.
+
+```bash
+export REMOTE_ENDPOINT=<https-endpoint-of-remote-server>
+export model=<model-card>
+```
+
+After setting these environment variables, run `docker compose` with `compose_remote.yaml` as an additional YAML file:

 ```bash
-export model=<name-of-model-card>
-export LLM_ENDPOINT_URL=<http-endpoint-of-remote-server>
 docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml -f compose_remote.yaml up -d
 ```
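Before launching, it can help to smoke-test the endpoint and key. A minimal sketch, assuming the remote server exposes an OpenAI-compatible chat completions route (the path and payload are assumptions, not part of this change):

```bash
# Send one trivial chat request using the variables exported above.
# Adjust the route if the deployment is not OpenAI-compatible.
curl -sS "${REMOTE_ENDPOINT}/v1/chat/completions" \
  -H "Authorization: Bearer ${OPENAI_API_KEY}" \
  -H "Content-Type: application/json" \
  -d "{\"model\": \"${model}\", \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]}"
```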

AgentQnA/docker_compose/intel/cpu/xeon/compose_remote.yaml

Lines changed: 3 additions & 3 deletions
@@ -4,15 +4,15 @@
 services:
   worker-rag-agent:
     environment:
-      llm_endpoint_url: ${LLM_ENDPOINT_URL}
+      llm_endpoint_url: ${REMOTE_ENDPOINT}
       api_key: ${OPENAI_API_KEY}

   worker-sql-agent:
     environment:
-      llm_endpoint_url: ${LLM_ENDPOINT_URL}
+      llm_endpoint_url: ${REMOTE_ENDPOINT}
       api_key: ${OPENAI_API_KEY}

   supervisor-react-agent:
     environment:
-      llm_endpoint_url: ${LLM_ENDPOINT_URL}
+      llm_endpoint_url: ${REMOTE_ENDPOINT}
       api_key: ${OPENAI_API_KEY}
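Since these values are substituted at compose time, one quick way to verify the rename is to render the merged configuration with `docker compose config` (a sanity check, not part of the commit):

```bash
# Print the merged, variable-substituted config and confirm llm_endpoint_url
# now resolves from REMOTE_ENDPOINT rather than the removed LLM_ENDPOINT_URL.
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml \
  -f compose_openai.yaml -f compose_remote.yaml config | grep llm_endpoint_url
```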
