Added Readme for gpt-oss MD #632
sargam-modak wants to merge 1 commit into oracle-samples:main from sargam-modak:main
@@ -0,0 +1,35 @@
# Model Deployment - GPT-OSS

OpenAI has announced the release of [two open-weight models](https://openai.com/index/introducing-gpt-oss/), gpt-oss-120b and gpt-oss-20b, its first open-weight language models since GPT‑2. According to the same announcement, their performance is on par with or exceeds that of OpenAI's internal models, and both models perform strongly on tool use, few-shot function calling, CoT reasoning, and HealthBench.

Here are the new OpenAI open-weight models:

* gpt-oss-120b: designed for production, general-purpose, and high-reasoning use cases. The model has 117B parameters, of which 5.1B are active.
* gpt-oss-20b: designed for lower latency and local or specialized use cases. The model has 21B parameters, of which 3.6B are active.

Both models are now available in OCI Data Science AI Quick Actions. The models are cached in our service and are ready to be deployed and fine-tuned, so customers do not need to bring in the model artifacts from external sites. With AI Quick Actions, customers can use our service-managed container, which ships the latest vLLM version that supports both models, so there is no need to build or bring your own container to work with them.

![Deploy Model](web_assets/model-deploy-gptoss.png)

![Deploy Model](web_assets/model-deploy-gptoss-deployment.png)

## Deploying an LLM

After picking a model from the model explorer, if the "Deploy Model" option is enabled, you can use this form to quickly deploy the model:

![Deploy Model](web_assets/deploy-model.png)

## Setting Environment Variable
Multiple shapes support deploying the model. If you use a shape other than H100 or H200, you must pass an extra environment variable when deploying the model. To do so:
* Go to the show advanced section at the bottom of the form
* Go to Custom Environment Variables
* Enter `VLLM_ATTENTION_BACKEND` as the key
* Enter `TRITON_ATTN_VLLM_V1` as the value

![Deploy Model](web_assets/model-deploy-gptoss-env-var.png)

Now you can deploy the model on shapes other than H100 or H200.
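
The shape rule above can be written down as a small check. This is only a sketch under assumptions: the helper `custom_env_vars` is hypothetical and is not part of AI Quick Actions or any OCI SDK, the shape names are illustrative, and it assumes an H100 or H200 shape can be recognized by that substring in its name. Only the key and value come from the steps above.

```python
# Hypothetical helper, not part of AI Quick Actions or any OCI SDK.
# Key and value are the ones entered under Custom Environment Variables.
ATTENTION_BACKEND_KEY = "VLLM_ATTENTION_BACKEND"
ATTENTION_BACKEND_VALUE = "TRITON_ATTN_VLLM_V1"


def custom_env_vars(shape_name: str) -> dict[str, str]:
    """Return the custom environment variables to enter for a given shape."""
    # Assumption: H100/H200 shapes carry that substring in their shape name.
    if any(gpu in shape_name.upper() for gpu in ("H100", "H200")):
        return {}
    # Every other shape needs the extra attention-backend variable.
    return {ATTENTION_BACKEND_KEY: ATTENTION_BACKEND_VALUE}


# Illustrative shape names; substitute the shape you actually selected.
for shape in ("BM.GPU.H100.8", "BM.GPU.A10.4"):
    print(shape, custom_env_vars(shape))
```

An empty result means no extra variable is needed; otherwise, enter the returned pair in the Custom Environment Variables section of the form.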

To learn more about model deployments, refer to the [Model Deployment Tips](model-deployment-tips.md) page.
Can we please link to the page which claims "their performance are on par or exceed OpenAI's internal models"?
The page that claims the performance is on par with or exceeds OpenAI's internal models is https://openai.com/index/introducing-gpt-oss/. It is the same link used for "two open weight models".