Commit 8e610b7

xzyaoi, xzyao-agent, and Copilot authored
Bump versions to 0.1.5 (#42)
* Feature/shepherd (#33)
* wip: router
* improve router
* improve router
* raise error if model name is not supported
* support structured output (#35)
* function calling mode (#36)
* minor
* Add Basic CI/CD Workflow (#37)
* Add basic CI/CD workflow
* Update CI workflow to target dev branch
* update workflow
* limit matrix (since we use docker)
* fix ci/cd issues
* update ci to load HF_TOKEN
* load environment variable for testing
---------
Co-authored-by: Xiaozhe Yao <askxzyao@gmail.com>
* [feature/xzyao-agent] Add extended API test cases (#38)
* feat: Add extended API test cases
* set ci workflow
* update ci
* update ci
* update ci
---------
Co-authored-by: Xiaozhe Yao <askxzyao@gmail.com>
* Update issue templates
* Improving Shepherd Router (#39)
* minor
* refactor router creation
* minor change default values
* wip: request routing
* wip: refactor
* update routing policies
* update learned router with penalty
* update router
* improve router
* Update tools/shepherd/plot_router_results.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Update scratchpad/extensions/shepherd/policies/_base.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* add script to build xgrammar
* update build config on gh200
* Update README.md
* Update README.md
* minor
* bump version
* revise test cases
---------
Co-authored-by: xzyao-agent <agent@yao.sh>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent e2a62dc commit 8e610b7

79 files changed (+2648 -1359 lines changed)

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+---
+name: New Model Architecture
+about: Propose a new model architecture to support
+title: Add support for model architecture [MODEL]
+labels: New Models
+assignees: ''
+
+---
+
+HuggingFace Repo for the Model Weights: []
+Existing Inference Code:[If any]

.github/workflows/ci.yml

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+name: CI
+
+on:
+  push:
+    branches: [ dev, main ]
+  pull_request:
+    branches: [ dev, main ]
+
+jobs:
+  build:
+    runs-on: rtx3090x1
+
+    steps:
+      - uses: actions/checkout@v2
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Build Docker image
+        run: |
+          bash docker/build_image.sh dev docker test
+
+      - name: Run tests in Docker container
+        env:
+          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+        run: |
+          mkdir -p shared
+          docker run -v "$PWD/shared:/shared" -v "$PWD/shared/.local:/scratchpad/.local" --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -e HF_TOKEN=$HF_TOKEN -e PYTHONPATH=/scratchpad ghcr.io/xiaozheyao/scratchpad:devdev-x86_64 /bin/bash -c "cd /scratchpad; python -m pytest --cache-clear --cov=scratchpad tests/ > /shared/pytest-coverage.txt"
+
+      - name: Comment coverage
+        uses: coroo/pytest-coverage-commentator@v1.0.2
+        with:
+          pytest-coverage: shared/pytest-coverage.txt
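
The workflow above redirects the pytest output, including the pytest-cov terminal report, into shared/pytest-coverage.txt. As a rough illustration of what that file can be used for, the sketch below pulls the overall percentage out of the TOTAL row. It assumes the default pytest-cov terminal table; the sample report and the module names in it are made up for the example and do not come from this commit.

```python
import re
from typing import Optional


def total_coverage(report_text: str) -> Optional[float]:
    """Return the percentage on the TOTAL row of a coverage table, or None if absent."""
    for line in report_text.splitlines():
        # The TOTAL row ends with the overall percentage, e.g. "TOTAL  42  10  76%".
        match = re.match(r"^TOTAL\s+.*\s(\d+(?:\.\d+)?)%\s*$", line)
        if match:
            return float(match.group(1))
    return None


# Hypothetical report text; the file names and numbers are illustrative only.
SAMPLE_REPORT = """\
Name                      Stmts   Miss  Cover
---------------------------------------------
scratchpad/__init__.py        2      0   100%
scratchpad/example.py        40     10    75%
---------------------------------------------
TOTAL                        42     10    76%
"""

if __name__ == "__main__":
    print(total_coverage(SAMPLE_REPORT))
```

In CI the same function could be pointed at the contents of shared/pytest-coverage.txt, for example to fail a job below some threshold; the commit itself only posts the report as a comment.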

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -7,3 +7,4 @@ pyrightconfig.json
 .zed
 .data
 *.ipynb
+.coverage*

README.md

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 # Scratchpad
 
-> Adaptive LLM Inference
+This is an experimental LLM serving system, forked and built on top of [SGLang SRT](https://github.com/sgl-project/sglang/tree/main/python/sglang/srt), and is used to support [SwissAI Model Serving](https://fmapi.swissai.cscs.ch/).

docker/Dockerfile.aarch64-cuda

Lines changed: 6 additions & 0 deletions
@@ -17,6 +17,11 @@ RUN apt update && apt upgrade -y
 WORKDIR /scratchpad
 
 COPY . /scratchpad
+RUN pip install pybind11 pre-commit && \
+    git clone --recursive https://github.com/mlc-ai/xgrammar.git && cd xgrammar && \
+    pre-commit install && \
+    mkdir build && cd build && cmake .. -G Ninja && ninja && \
+    cd ../python && python3 -m pip install .
 
 RUN git clone -b v0.1.6 https://github.com/flashinfer-ai/flashinfer.git --recursive && \
     cd flashinfer/python && \
@@ -27,6 +32,7 @@ RUN git clone https://github.com/eth-easl/triteia.git && \
     git submodule update --init --recursive && \
     pip install -e .
 RUN pip install -r meta/requirements-extra.txt
+RUN pip install -r meta/requirements-dev.txt
 RUN pip install .
 
 # todo(xiaozhe): figure out why pynvml is installed in the first place. We should use nvidia-ml-py instead.

docker/Dockerfile.x86_64-cuda

Lines changed: 2 additions & 1 deletion
@@ -1,4 +1,4 @@
-FROM nvcr.io/nvidia/pytorch:24.05-py3 AS base
+FROM nvcr.io/nvidia/pytorch:24.09-py3 AS base
 
 LABEL org.opencontainers.image.source=https://github.com/xiaozheyao/Scratchpad
 LABEL org.opencontainers.image.description="Scratchpad: Adaptive Serving of LMs"
@@ -25,5 +25,6 @@ RUN git clone https://github.com/eth-easl/triteia.git && \
     git submodule update --init --recursive && \
     pip install -e .
 RUN pip install -r meta/requirements-extra.txt
+RUN pip install -r meta/requirements-dev.txt
 RUN pip install .
 RUN pip uninstall pynvml -y

docker/build_image.sh

Lines changed: 6 additions & 1 deletion
@@ -2,11 +2,16 @@
 arch=$(uname -m)
 version=$1
 buildtool=$2
+anno=$3
 # if version is not provided, raise error
 if [ -z "$version" ]; then
     echo "Please provide version number"
     exit 1
 fi
 echo "Building image for $arch, version $version"
 DOCKER_BUILDKIT=0 $buildtool build -f docker/Dockerfile.$arch-cuda . -t ghcr.io/xiaozheyao/scratchpad:${version}dev-$arch --build-arg ARCH=$arch
-$buildtool push ghcr.io/xiaozheyao/scratchpad:${version}dev-$arch
+
+if [ "$anno" = "upload" ]; then
+    echo "Uploading image to ghcr.io"
+    $buildtool push ghcr.io/xiaozheyao/scratchpad:${version}dev-$arch
+fi
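
The script change above makes the push conditional on a third positional argument. The following Python sketch mirrors that logic only to make the resulting tag and command list easy to inspect; the function names are hypothetical and nothing here is part of the repository. It also shows why the CI workflow, which calls the script with `dev docker test`, ends up with the tag `devdev-x86_64` and no push.

```python
from typing import List

# Repository prefix as it appears in docker/build_image.sh.
REGISTRY_REPO = "ghcr.io/xiaozheyao/scratchpad"


def image_tag(version: str, arch: str) -> str:
    """Compose the tag the same way the script does: ${version}dev-$arch."""
    return f"{REGISTRY_REPO}:{version}dev-{arch}"


def planned_commands(version: str, buildtool: str, anno: str, arch: str) -> List[str]:
    """List the commands the script would run for the given arguments."""
    if not version:
        raise ValueError("Please provide version number")
    tag = image_tag(version, arch)
    commands = [
        f"DOCKER_BUILDKIT=0 {buildtool} build -f docker/Dockerfile.{arch}-cuda . "
        f"-t {tag} --build-arg ARCH={arch}"
    ]
    # The push only happens when the third argument is exactly "upload".
    if anno == "upload":
        commands.append(f"{buildtool} push {tag}")
    return commands


if __name__ == "__main__":
    for command in planned_commands("dev", "docker", "test", "x86_64"):
        print(command)
```

Any value other than `upload` for the third argument, including an empty one, skips the push.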

docs/examples/router.py

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+from scratchpad.extensions.shepherd import Router, Route
+from scratchpad.utils.client import LLMEncoder
+
+encoder = LLMEncoder(
+    model="meta-llama/Llama-3.2-1B-Instruct",
+)
+routes = [
+    Route(
+        name="chat",
+        utterances=["Hi", "How are you?"],
+        model_preferences=["meta-llama/Llama-3.2-1B-Instruct"],
+    ),
+    Route(
+        name="math",
+        utterances=["the solution to x^2=16 is "],
+        model_preferences=["meta-llama/Llama-3.2-70B-Instruct"],
+    ),
+]
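
The example file stops after declaring the routes, and this diff does not show how `Router` consumes them. To give a feel for utterance-based routing in general, here is a self-contained toy that picks a route by word overlap between the query and each route's utterances. It is not the Shepherd implementation: `ToyRoute` and `pick_route` are hypothetical names, and Jaccard similarity over whitespace tokens stands in for whatever the `LLMEncoder` embeddings do.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class ToyRoute:
    """Local stand-in with the same three fields used in docs/examples/router.py."""
    name: str
    utterances: List[str]
    model_preferences: List[str]


def _tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def _jaccard(a: str, b: str) -> float:
    """Word-overlap similarity in [0, 1]; a crude substitute for embedding similarity."""
    ta, tb = _tokens(a), _tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def pick_route(query: str, routes: List[ToyRoute]) -> Optional[ToyRoute]:
    """Return the route whose best-matching utterance overlaps the query most, if any."""
    best, best_score = None, 0.0
    for route in routes:
        score = max((_jaccard(query, u) for u in route.utterances), default=0.0)
        if score > best_score:
            best, best_score = route, score
    return best


routes = [
    ToyRoute("chat", ["Hi", "How are you?"], ["meta-llama/Llama-3.2-1B-Instruct"]),
    ToyRoute("math", ["the solution to x^2=16 is "], ["meta-llama/Llama-3.2-70B-Instruct"]),
]

if __name__ == "__main__":
    chosen = pick_route("what is the solution to x^2=9", routes)
    print(chosen.name if chosen else None, chosen.model_preferences if chosen else None)
```

A query with no overlapping words returns `None` here; how the real router handles unmatched requests is not visible in this diff.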

docs/examples/tool_calls.py

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+import os
+import openai
+
+tools = [
+    {
+        "type": "function",
+        "function": {
+            "name": "get_current_weather",
+            "description": "Get the current weather in a given location",
+            "parameters": {
+                "type": "object",
+                "properties": {
+                    "location": {
+                        "type": "string",
+                        "description": "The city and state, e.g. San Francisco, CA",
+                    },
+                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
+                },
+                "required": ["location"],
+            },
+        },
+    }
+]
+messages = [{"role": "user", "content": "What's the weather in Boston today?"}]
+
+client = openai.Client(
+    base_url=os.environ.get(f"RC_API_BASE"), api_key=os.environ.get(f"RC_API_KEY")
+)
+
+response = client.chat.completions.create(
+    model="meta-llama/Llama-3.1-70B-Instruct",
+    messages=messages,
+    temperature=0.8,
+    top_p=0.8,
+    stream=False,
+    tools=tools,
+)
+print(response)
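
The example above only prints the response. A typical next step with an OpenAI-compatible endpoint is to read `message.tool_calls`, run the named function locally, and build `tool` messages for a follow-up request. The sketch below shows that step without a network call by using a hand-built stand-in for the response message. It assumes the server returns tool calls in the OpenAI chat completions shape, which this diff does not confirm, and the weather function is a stub.

```python
import json
from types import SimpleNamespace
from typing import Any, Dict, List


def get_current_weather(location: str, unit: str = "fahrenheit") -> Dict[str, Any]:
    """Stub implementation; a real one would query a weather service."""
    return {"location": location, "temperature": "unknown", "unit": unit}


# Maps tool names declared in `tools` to local callables.
LOCAL_TOOLS = {"get_current_weather": get_current_weather}


def run_tool_calls(message: Any) -> List[Dict[str, str]]:
    """Execute each requested tool call and return `tool` role messages."""
    results = []
    for call in message.tool_calls or []:
        fn = LOCAL_TOOLS.get(call.function.name)
        if fn is None:
            raise ValueError(f"unknown tool: {call.function.name}")
        # Arguments arrive as a JSON-encoded string in the OpenAI format.
        kwargs = json.loads(call.function.arguments)
        results.append(
            {
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(fn(**kwargs)),
            }
        )
    return results


# Hand-built stand-in for response.choices[0].message; values are illustrative.
fake_message = SimpleNamespace(
    tool_calls=[
        SimpleNamespace(
            id="call_0",
            function=SimpleNamespace(
                name="get_current_weather",
                arguments='{"location": "Boston, MA", "unit": "celsius"}',
            ),
        )
    ]
)

tool_messages = run_tool_calls(fake_message)

if __name__ == "__main__":
    print(tool_messages)
```

With a live server, `fake_message` would be replaced by `response.choices[0].message`, and the returned messages appended to `messages` before calling `client.chat.completions.create` again.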

meta/requirements-dev.txt

Lines changed: 4 additions & 0 deletions
@@ -7,3 +7,7 @@ sphinx-copybutton
 sphinx-rtd-theme
 pytest
 pre-commit
+pytest-cov
+coverage
+matplotlib
+seaborn
