### Deployment

#### Cloud Deployment
We recommend using HuggingFace Inference Endpoints for fast deployment.
We provide two docs for users to refer to:

English version: [GUI Model Deployment Guide](https://juniper-switch-f10.notion.site/GUI-Model-Deployment-Guide-17b5350241e280058e98cea60317de71)

Chinese version: [GUI模型部署教程](https://bytedance.sg.larkoffice.com/docx/TCcudYwyIox5vyxiSDLlgIsTgWf#U94rdCxzBoJMLex38NPlHL21gNb)
#### Local Deployment [vLLM]
We recommend using vLLM for fast deployment and inference. You need to use `vllm>=0.6.1`.

```bash
pip install -U transformers
VLLM_VERSION=0.6.1  # any vLLM release >= 0.6.1 works
CUDA_VERSION=cu124
pip install vllm==${VLLM_VERSION} --extra-index-url https://download.pytorch.org/whl/${CUDA_VERSION}
```
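After installing, you can confirm that the installed vLLM satisfies the `vllm>=0.6.1` requirement. A minimal sketch (not part of the official setup) using `sort -V` to compare version strings:

```shell
# Verify the installed vLLM meets the minimum version (>= 0.6.1).
required="0.6.1"
installed="$(python -c 'import vllm; print(vllm.__version__)' 2>/dev/null || echo "0.0.0")"
# `sort -V` orders version strings; if the required version sorts first,
# the installed version is equal to or newer than it.
lowest="$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n 1)"
if [ "$installed" != "0.0.0" ] && [ "$lowest" = "$required" ]; then
  echo "vLLM $installed OK"
else
  echo "vLLM >= $required required (found: $installed)" >&2
fi
```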
##### Download the Model

We provide three model sizes on Hugging Face: **2B**, **7B**, and **72B**. To achieve the best performance, we recommend using the **7B-DPO** or **72B-DPO** model (based on your hardware configuration):

- [2B-SFT](https://huggingface.co/bytedance-research/UI-TARS-2B-SFT)
- [7B-SFT](https://huggingface.co/bytedance-research/UI-TARS-7B-SFT)
- [7B-DPO](https://huggingface.co/bytedance-research/UI-TARS-7B-DPO)
- [72B-SFT](https://huggingface.co/bytedance-research/UI-TARS-72B-SFT)
- [72B-DPO](https://huggingface.co/bytedance-research/UI-TARS-72B-DPO)
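One way to fetch a checkpoint is with the Hugging Face CLI. The sketch below picks a model and downloads it; the GPU-memory thresholds are illustrative assumptions, not official hardware requirements:

```shell
# Pick a checkpoint by available GPU memory (thresholds are illustrative only),
# then download it with the Hugging Face CLI.
gpu_mem_gb=24
if [ "$gpu_mem_gb" -ge 150 ]; then
  model="bytedance-research/UI-TARS-72B-DPO"
elif [ "$gpu_mem_gb" -ge 24 ]; then
  model="bytedance-research/UI-TARS-7B-DPO"
else
  model="bytedance-research/UI-TARS-2B-SFT"
fi
echo "Selected: $model"
# Requires `pip install -U "huggingface_hub[cli]"`; uncomment to download:
# huggingface-cli download "$model" --local-dir ./ui-tars-model
```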
##### Start an OpenAI API Service

Run the command below to start an OpenAI-compatible API service:

```bash
python -m vllm.entrypoints.openai.api_server --served-model-name ui-tars --model <path to your model>
```
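Once the service is up, you can send it a test request. This sketch assumes vLLM's default address (`http://localhost:8000`) and uses the model name passed to `--served-model-name` above:

```shell
# Query the OpenAI-compatible chat endpoint served by vLLM
# (default address http://localhost:8000; assumes the server is running).
payload='{"model": "ui-tars", "messages": [{"role": "user", "content": "Hello"}]}'
echo "$payload"
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$payload" || echo "request failed: is the vLLM server running?" >&2
```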
##### Input your API key

<img src="./images/settings_model.png" width="500px" />