
Commit c3dc79b

docs(ui-tars): tweak README.md
1 parent b61795c

File tree: 3 files changed, +99 -90 lines changed

README.md

Lines changed: 6 additions & 90 deletions
@@ -22,14 +22,6 @@ UI-TARS Desktop is a GUI Agent application based on [UI-TARS (Vision-Language Mo
 | &nbsp;&nbsp; 👓 <a href="https://github.com/web-infra-dev/midscene">Midscene (use in browser)</a>
 </p>
 
-### ⚠️ Important Announcement: GGUF Model Performance
-
-The **GGUF model** has undergone quantization, but unfortunately, its performance cannot be guaranteed. As a result, we have decided to **downgrade** it.
-
-💡 **Alternative Solution**:
-You can use **[Cloud Deployment](#cloud-deployment)** or **[Local Deployment [vLLM]](#local-deployment-vllm)** (if you have enough GPU resources) instead.
-
-We appreciate your understanding and patience as we work to ensure the best possible experience.
 
 ## Updates
 
@@ -53,95 +45,19 @@ We appreciate your understanding and patience as we work to ensure the best poss
 
 ## Quick Start
 
-### Download
-
-You can download the [latest release](https://github.com/bytedance/UI-TARS-desktop/releases/latest) of UI-TARS Desktop from our releases page.
-
-> **Note**: If you have [Homebrew](https://brew.sh/) installed, you can install UI-TARS Desktop by running the following command:
-> ```bash
-> brew install --cask ui-tars
-> ```
-
-### Install
-
-#### macOS
-
-1. Drag the **UI TARS** application into the **Applications** folder.
-<img src="./apps/ui-tars/images/mac_install.png" width="500px" />
-
-2. Enable the required permissions for **UI TARS** in macOS:
-- System Settings -> Privacy & Security -> **Accessibility**
-- System Settings -> Privacy & Security -> **Screen Recording**
-<img src="./apps/ui-tars/images/mac_permission.png" width="500px" />
-
-3. Then open the **UI TARS** application; you will see the following interface:
-<img src="./apps/ui-tars/images/mac_app.png" width="500px" />
-
-
-#### Windows
-
-Simply run the application, and you will see the following interface:
-
-<img src="./apps/ui-tars/images/windows_install.png" width="400px" />
-
-### Deployment
-
-#### Cloud Deployment
-We recommend using HuggingFace Inference Endpoints for fast deployment.
-We provide two docs for reference:
-
-English version: [GUI Model Deployment Guide](https://juniper-switch-f10.notion.site/GUI-Model-Deployment-Guide-17b5350241e280058e98cea60317de71)
-
-Chinese version: [GUI模型部署教程](https://bytedance.sg.larkoffice.com/docx/TCcudYwyIox5vyxiSDLlgIsTgWf#U94rdCxzBoJMLex38NPlHL21gNb)
-
-#### Local Deployment [vLLM]
-We recommend using vLLM for fast deployment and inference. You need to use `vllm>=0.6.1`.
-```bash
-pip install -U transformers
-VLLM_VERSION=0.6.6
-CUDA_VERSION=cu124
-pip install vllm==${VLLM_VERSION} --extra-index-url https://download.pytorch.org/whl/${CUDA_VERSION}
-
-```
-##### Download the Model
-We provide three model sizes on Hugging Face: **2B**, **7B**, and **72B**. To achieve the best performance, we recommend using the **7B-DPO** or **72B-DPO** model (depending on your hardware configuration):
-
-- [2B-SFT](https://huggingface.co/bytedance-research/UI-TARS-2B-SFT)
-- [7B-SFT](https://huggingface.co/bytedance-research/UI-TARS-7B-SFT)
-- [7B-DPO](https://huggingface.co/bytedance-research/UI-TARS-7B-DPO)
-- [72B-SFT](https://huggingface.co/bytedance-research/UI-TARS-72B-SFT)
-- [72B-DPO](https://huggingface.co/bytedance-research/UI-TARS-72B-DPO)
-
-
-##### Start an OpenAI API Service
-Run the command below to start an OpenAI-compatible API service:
-
-```bash
-python -m vllm.entrypoints.openai.api_server --served-model-name ui-tars --model <path to your model>
-```
-
-##### Input your API information
-
-<img src="./apps/ui-tars/images/settings_model.png" width="500px" />
-
-<!-- If you use Ollama, you can use the following settings to start the server:
+See [Quick Start](./docs/quick-start.md).
 
-```yaml
-VLM Provider: ollama
-VLM Base Url: http://localhost:11434/v1
-VLM API Key: api_key
-VLM Model Name: ui-tars
-``` -->
+## Deployment
 
-> **Note**: VLM Base Url must be an OpenAI-compatible API endpoint (see the [OpenAI API protocol document](https://platform.openai.com/docs/guides/vision/uploading-base-64-encoded-images) for more details).
+See [Deployment](./docs/deployment.md).
 
 ## Contributing
 
-[CONTRIBUTING.md](./CONTRIBUTING.md)
+See [CONTRIBUTING.md](./CONTRIBUTING.md).
 
-## SDK(Experimental)
+## SDK (Experimental)
 
-[SDK](./docs/sdk.md)
+See [UI TARS SDK](./docs/sdk.md).
 
 ## License
 
docs/deployment.md

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
# Deployment

### ⚠️ Important Announcement: GGUF Model Performance

The **GGUF model** has undergone quantization, but unfortunately, its performance cannot be guaranteed. As a result, we have decided to **downgrade** it.

💡 **Alternative Solution**:
You can use **[Cloud Deployment](#cloud-deployment)** or **[Local Deployment [vLLM]](#local-deployment-vllm)** (if you have enough GPU resources) instead.

We appreciate your understanding and patience as we work to ensure the best possible experience.

## Cloud Deployment

We recommend using HuggingFace Inference Endpoints for fast deployment.
We provide two docs for reference:

English version: [GUI Model Deployment Guide](https://juniper-switch-f10.notion.site/GUI-Model-Deployment-Guide-17b5350241e280058e98cea60317de71)

Chinese version: [GUI模型部署教程](https://bytedance.sg.larkoffice.com/docx/TCcudYwyIox5vyxiSDLlgIsTgWf#U94rdCxzBoJMLex38NPlHL21gNb)

## Local Deployment [vLLM]

We recommend using vLLM for fast deployment and inference. You need to use `vllm>=0.6.1`.

```bash
pip install -U transformers
VLLM_VERSION=0.6.6
CUDA_VERSION=cu124
pip install vllm==${VLLM_VERSION} --extra-index-url https://download.pytorch.org/whl/${CUDA_VERSION}
```

### Download the Model

We provide three model sizes on Hugging Face: **2B**, **7B**, and **72B**. To achieve the best performance, we recommend using the **7B-DPO** or **72B-DPO** model (depending on your hardware configuration):

- [2B-SFT](https://huggingface.co/bytedance-research/UI-TARS-2B-SFT)
- [7B-SFT](https://huggingface.co/bytedance-research/UI-TARS-7B-SFT)
- [7B-DPO](https://huggingface.co/bytedance-research/UI-TARS-7B-DPO)
- [72B-SFT](https://huggingface.co/bytedance-research/UI-TARS-72B-SFT)
- [72B-DPO](https://huggingface.co/bytedance-research/UI-TARS-72B-DPO)
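
One way to fetch a checkpoint locally is the Hugging Face CLI; this is a sketch rather than a required step, and the `--local-dir` path below is an arbitrary example:

```bash
# Sketch: download the recommended 7B-DPO checkpoint from Hugging Face.
# Requires the Hugging Face CLI: pip install -U "huggingface_hub[cli]"
# The --local-dir value is an arbitrary example path, not a required layout.
huggingface-cli download bytedance-research/UI-TARS-7B-DPO \
  --local-dir ./UI-TARS-7B-DPO
```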

### Start an OpenAI API Service

Run the command below to start an OpenAI-compatible API service:

```bash
python -m vllm.entrypoints.openai.api_server --served-model-name ui-tars --model <path to your model>
```
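
Once the server is running, you can sanity-check it before entering the API information in the app. A minimal sketch, assuming vLLM's default port 8000 and the `ui-tars` served model name from the command above:

```bash
# List the served models; the response should include "ui-tars"
curl http://localhost:8000/v1/models

# Send a test chat completion to the OpenAI-compatible endpoint
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ui-tars",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```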

### Input your API information

<img src="../apps/ui-tars/images/settings_model.png" width="500px" />

<!-- If you use Ollama, you can use the following settings to start the server:

```yaml
VLM Provider: ollama
VLM Base Url: http://localhost:11434/v1
VLM API Key: api_key
VLM Model Name: ui-tars
``` -->

> **Note**: VLM Base Url must be an OpenAI-compatible API endpoint (see the [OpenAI API protocol document](https://platform.openai.com/docs/guides/vision/uploading-base-64-encoded-images) for more details).
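
Because the endpoint follows this protocol, a screenshot can be sent as a base64 data URL. A minimal sketch, assuming a local vLLM server on port 8000 and an illustrative `screenshot.png` in the current directory:

```bash
# Encode a screenshot and send it as an OpenAI-style vision request.
# base64 -w 0 is GNU coreutils; on macOS use: base64 -i screenshot.png
IMG=$(base64 -w 0 screenshot.png)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"ui-tars\",
    \"messages\": [{
      \"role\": \"user\",
      \"content\": [
        {\"type\": \"text\", \"text\": \"Describe this screen\"},
        {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/png;base64,${IMG}\"}}
      ]
    }]
  }"
```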

docs/quick-start.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# Quick Start

## Download

You can download the [latest release](https://github.com/bytedance/UI-TARS-desktop/releases/latest) of UI-TARS Desktop from our releases page.

> **Note**: If you have [Homebrew](https://brew.sh/) installed, you can install UI-TARS Desktop by running the following command:
> ```bash
> brew install --cask ui-tars
> ```

## Install

### macOS

1. Drag the **UI TARS** application into the **Applications** folder.
<img src="../apps/ui-tars/images/mac_install.png" width="500px" />

2. Enable the required permissions for **UI TARS** in macOS:
- System Settings -> Privacy & Security -> **Accessibility**
- System Settings -> Privacy & Security -> **Screen Recording**
<img src="../apps/ui-tars/images/mac_permission.png" width="500px" />

3. Then open the **UI TARS** application; you will see the following interface:
<img src="../apps/ui-tars/images/mac_app.png" width="500px" />

### Windows

Simply run the application, and you will see the following interface:

<img src="../apps/ui-tars/images/windows_install.png" width="400px" />
