
Commit c3dc79b

docs(ui-tars): tweak README.md
1 parent b61795c

File tree: 3 files changed, +99 -90 lines changed

README.md

Lines changed: 6 additions & 90 deletions
@@ -22,14 +22,6 @@ UI-TARS Desktop is a GUI Agent application based on [UI-TARS (Vision-Language Mo
 | &nbsp;&nbsp; 👓 <a href="https://github.com/web-infra-dev/midscene">Midscene (use in browser)</a>
 </p>
 
-### ⚠️ Important Announcement: GGUF Model Performance
-
-The **GGUF model** has undergone quantization, but unfortunately, its performance cannot be guaranteed. As a result, we have decided to **downgrade** it.
-
-💡 **Alternative Solution**:
-You can use **[Cloud Deployment](#cloud-deployment)** or **[Local Deployment [vLLM]](#local-deployment-vllm)** (if you have enough GPU resources) instead.
-
-We appreciate your understanding and patience as we work to ensure the best possible experience.
 
 ## Updates
 
@@ -53,95 +45,19 @@ We appreciate your understanding and patience as we work to ensure the best poss
 
 ## Quick Start
 
-### Download
-
-You can download the [latest release](https://github.com/bytedance/UI-TARS-desktop/releases/latest) of UI-TARS Desktop from our releases page.
-
-> **Note**: If you have [Homebrew](https://brew.sh/) installed, you can install UI-TARS Desktop by running the following command:
-> ```bash
-> brew install --cask ui-tars
-> ```
-
-### Install
-
-#### macOS
-
-1. Drag the **UI TARS** application into the **Applications** folder.
-<img src="./apps/ui-tars/images/mac_install.png" width="500px" />
-
-2. Enable the required permissions for **UI TARS** in macOS:
-- System Settings -> Privacy & Security -> **Accessibility**
-- System Settings -> Privacy & Security -> **Screen Recording**
-<img src="./apps/ui-tars/images/mac_permission.png" width="500px" />
-
-3. Then open the **UI TARS** application; you will see the following interface:
-<img src="./apps/ui-tars/images/mac_app.png" width="500px" />
-
-
-#### Windows
-
-Simply run the application, and you will see the following interface:
-
-<img src="./apps/ui-tars/images/windows_install.png" width="400px" />
-
-### Deployment
-
-#### Cloud Deployment
-We recommend using HuggingFace Inference Endpoints for fast deployment.
-We provide two docs for reference:
-
-English version: [GUI Model Deployment Guide](https://juniper-switch-f10.notion.site/GUI-Model-Deployment-Guide-17b5350241e280058e98cea60317de71)
-
-Chinese version: [GUI模型部署教程](https://bytedance.sg.larkoffice.com/docx/TCcudYwyIox5vyxiSDLlgIsTgWf#U94rdCxzBoJMLex38NPlHL21gNb)
-
-#### Local Deployment [vLLM]
-We recommend using vLLM for fast deployment and inference. You need to use `vllm>=0.6.1`.
-```bash
-pip install -U transformers
-VLLM_VERSION=0.6.6
-CUDA_VERSION=cu124
-pip install vllm==${VLLM_VERSION} --extra-index-url https://download.pytorch.org/whl/${CUDA_VERSION}
-
-```
-##### Download the Model
-We provide three model sizes on Hugging Face: **2B**, **7B**, and **72B**. To achieve the best performance, we recommend using the **7B-DPO** or **72B-DPO** model (depending on your hardware configuration):
-
-- [2B-SFT](https://huggingface.co/bytedance-research/UI-TARS-2B-SFT)
-- [7B-SFT](https://huggingface.co/bytedance-research/UI-TARS-7B-SFT)
-- [7B-DPO](https://huggingface.co/bytedance-research/UI-TARS-7B-DPO)
-- [72B-SFT](https://huggingface.co/bytedance-research/UI-TARS-72B-SFT)
-- [72B-DPO](https://huggingface.co/bytedance-research/UI-TARS-72B-DPO)
-
-
-##### Start an OpenAI API Service
-Run the command below to start an OpenAI-compatible API service:
-
-```bash
-python -m vllm.entrypoints.openai.api_server --served-model-name ui-tars --model <path to your model>
-```
-
-##### Input your API information
-
-<img src="./apps/ui-tars/images/settings_model.png" width="500px" />
-
-<!-- If you use Ollama, you can use the following settings to start the server:
+See [Quick Start](./docs/quick-start.md).
 
-```yaml
-VLM Provider: ollama
-VLM Base Url: http://localhost:11434/v1
-VLM API Key: api_key
-VLM Model Name: ui-tars
-``` -->
+## Deployment
 
-> **Note**: VLM Base Url must be an OpenAI-compatible API endpoint (see the [OpenAI API protocol document](https://platform.openai.com/docs/guides/vision/uploading-base-64-encoded-images) for more details).
+See [Deployment](./docs/deployment.md).
 
 ## Contributing
 
-[CONTRIBUTING.md](./CONTRIBUTING.md)
+See [CONTRIBUTING.md](./CONTRIBUTING.md).
 
-## SDK(Experimental)
+## SDK (Experimental)
 
-[SDK](./docs/sdk.md)
+See [UI TARS SDK](./docs/sdk.md).
 
 ## License
 
docs/deployment.md

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
# Deployment

### ⚠️ Important Announcement: GGUF Model Performance

The **GGUF model** has undergone quantization, but unfortunately, its performance cannot be guaranteed. As a result, we have decided to **downgrade** it.

💡 **Alternative Solution**:
You can use **[Cloud Deployment](#cloud-deployment)** or **[Local Deployment [vLLM]](#local-deployment-vllm)** (if you have enough GPU resources) instead.

We appreciate your understanding and patience as we work to ensure the best possible experience.

## Cloud Deployment

We recommend using HuggingFace Inference Endpoints for fast deployment.
We provide two docs for reference:

English version: [GUI Model Deployment Guide](https://juniper-switch-f10.notion.site/GUI-Model-Deployment-Guide-17b5350241e280058e98cea60317de71)

Chinese version: [GUI模型部署教程](https://bytedance.sg.larkoffice.com/docx/TCcudYwyIox5vyxiSDLlgIsTgWf#U94rdCxzBoJMLex38NPlHL21gNb)

## Local Deployment [vLLM]

We recommend using vLLM for fast deployment and inference. You need to use `vllm>=0.6.1`.

```bash
pip install -U transformers
VLLM_VERSION=0.6.6
CUDA_VERSION=cu124
pip install vllm==${VLLM_VERSION} --extra-index-url https://download.pytorch.org/whl/${CUDA_VERSION}
```

### Download the Model

We provide three model sizes on Hugging Face: **2B**, **7B**, and **72B**. To achieve the best performance, we recommend using the **7B-DPO** or **72B-DPO** model (depending on your hardware configuration):

- [2B-SFT](https://huggingface.co/bytedance-research/UI-TARS-2B-SFT)
- [7B-SFT](https://huggingface.co/bytedance-research/UI-TARS-7B-SFT)
- [7B-DPO](https://huggingface.co/bytedance-research/UI-TARS-7B-DPO)
- [72B-SFT](https://huggingface.co/bytedance-research/UI-TARS-72B-SFT)
- [72B-DPO](https://huggingface.co/bytedance-research/UI-TARS-72B-DPO)
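
One way to fetch a checkpoint locally is the Hugging Face CLI; this is a sketch rather than a required step, and the `--local-dir` path below is an arbitrary example:

```bash
# Sketch: download the recommended 7B-DPO checkpoint from Hugging Face.
# Requires the Hugging Face CLI: pip install -U "huggingface_hub[cli]"
# The --local-dir value is an arbitrary example path, not a required layout.
huggingface-cli download bytedance-research/UI-TARS-7B-DPO \
  --local-dir ./UI-TARS-7B-DPO
```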

### Start an OpenAI API Service

Run the command below to start an OpenAI-compatible API service:

```bash
python -m vllm.entrypoints.openai.api_server --served-model-name ui-tars --model <path to your model>
```
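
Once the server is running, you can sanity-check it before entering the API information in the app. A minimal sketch, assuming vLLM's default port 8000 and the `ui-tars` served model name from the command above:

```bash
# List the served models; the response should include "ui-tars"
curl http://localhost:8000/v1/models

# Send a test chat completion to the OpenAI-compatible endpoint
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ui-tars",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```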

### Input your API information

<img src="../apps/ui-tars/images/settings_model.png" width="500px" />

<!-- If you use Ollama, you can use the following settings to start the server:

```yaml
VLM Provider: ollama
VLM Base Url: http://localhost:11434/v1
VLM API Key: api_key
VLM Model Name: ui-tars
``` -->

> **Note**: VLM Base Url must be an OpenAI-compatible API endpoint (see the [OpenAI API protocol document](https://platform.openai.com/docs/guides/vision/uploading-base-64-encoded-images) for more details).
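
Because the endpoint follows this protocol, a screenshot can be sent as a base64 data URL. A minimal sketch, assuming a local vLLM server on port 8000 and an illustrative `screenshot.png` in the current directory:

```bash
# Encode a screenshot and send it as an OpenAI-style vision request.
# base64 -w 0 is GNU coreutils; on macOS use: base64 -i screenshot.png
IMG=$(base64 -w 0 screenshot.png)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"ui-tars\",
    \"messages\": [{
      \"role\": \"user\",
      \"content\": [
        {\"type\": \"text\", \"text\": \"Describe this screen\"},
        {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/png;base64,${IMG}\"}}
      ]
    }]
  }"
```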

docs/quick-start.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# Quick Start

## Download

You can download the [latest release](https://github.com/bytedance/UI-TARS-desktop/releases/latest) of UI-TARS Desktop from our releases page.

> **Note**: If you have [Homebrew](https://brew.sh/) installed, you can install UI-TARS Desktop by running the following command:
> ```bash
> brew install --cask ui-tars
> ```

## Install

### macOS

1. Drag the **UI TARS** application into the **Applications** folder.
<img src="../apps/ui-tars/images/mac_install.png" width="500px" />

2. Enable the required permissions for **UI TARS** in macOS:
- System Settings -> Privacy & Security -> **Accessibility**
- System Settings -> Privacy & Security -> **Screen Recording**
<img src="../apps/ui-tars/images/mac_permission.png" width="500px" />

3. Then open the **UI TARS** application; you will see the following interface:
<img src="../apps/ui-tars/images/mac_app.png" width="500px" />

### Windows

Simply run the application, and you will see the following interface:

<img src="../apps/ui-tars/images/windows_install.png" width="400px" />
