AgentQnA/docker_compose/intel/cpu/xeon/README.md
export no_proxy=localhost,127.0.0.1,$host_ip # additional no proxies if needed
export NGINX_PORT=${your_nginx_port} # your usable port for nginx, 80 for example
```

#### [Optional] OPENAI_API_KEY to use OpenAI models or LLM models with remote endpoints

To use OpenAI models, generate a key following these [instructions](https://platform.openai.com/api-keys).

When models are deployed on a remote server, a base URL and an API key are required to access them. To set up a remote server and acquire the base URL and API key, refer to [Intel® AI for Enterprise Inference](https://www.intel.com/content/www/us/en/developer/topic-technology/artificial-intelligence/enterprise-inference.html) offerings.

Then set the environment variable `OPENAI_API_KEY` with the key contents:
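For example (the value below is a placeholder; substitute the actual key issued to you):

```shell
# Placeholder value -- replace with the actual key from OpenAI or your inference provider.
export OPENAI_API_KEY="your-api-key-contents"
```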
We make it convenient to launch the whole system with Docker Compose, including microservices for the LLM, agents, UI, retrieval tool, vector database, dataprep, and telemetry. Three compose files let users pick and choose: for example, users can swap in a retrieval tool other than the `DocIndexRetriever` example provided in the GenAIExamples repo, or skip launching the telemetry containers.

On Xeon, OpenAI models and models deployed on a remote server are supported. Both methods require an API key; `OPENAI_API_KEY` needs to be set in the [previous step](#optional-openai_api_key-to-use-openai-models-or-llm-models-with-remote-endpoints).

The command below will launch the multi-agent system with the `DocIndexRetriever`

```
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml up -d
```

#### Models on Remote Servers

When models are deployed on a remote server with Intel® AI for Enterprise Inference, a base URL and an API key are required to access them. To run the Agent microservice on Xeon while using models deployed on a remote server, add `compose_remote.yaml` to the `docker compose` command and set additional environment variables.

> **Note**: For AgentQnA, the minimum hardware requirement for the remote server is Intel® Gaudi® AI Accelerators.

Set the following environment variables:

- `REMOTE_ENDPOINT` is the HTTPS endpoint of the remote server with the model of choice (e.g. https://api.example.com). **Note:** If the API for the models does not use LiteLLM, the second part of the model card needs to be appended to the URL. For example, set `REMOTE_ENDPOINT` to https://api.example.com/Llama-3.3-70B-Instruct if the model card is `meta-llama/Llama-3.3-70B-Instruct`.
- `model` is the model card, which may need to be overwritten depending on what it is set to in `set_env.sh`.
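
As a sketch, the exports and launch command might look like the following. The endpoint URL and model card are illustrative placeholders, and the exact combination of compose files should match the OpenAI launch command earlier in this README with `compose_remote.yaml` added, per your deployment:

```shell
# Illustrative values -- replace with the base URL and model card for your deployment.
export REMOTE_ENDPOINT="https://api.example.com"
export model="meta-llama/Llama-3.3-70B-Instruct"

# Launch with the remote-endpoint overrides layered on top of the base compose files.
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml \
  -f compose_openai.yaml -f compose_remote.yaml up -d
```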