feature: warm-up cache #14924
base: main
2d9c075 to c930476
Can you add details on the performance improvement this PR showed in testing, @AlexsanderHamir?
litellm/proxy/utils.py
Outdated
    response = await self.db.litellm_endusertable.find_many(
-       where={"budget_id": {"in": budget_id_list}}
+       where={"budget_id": {"in": budget_id_list}},
+       order={"litellm_budget_table": {"created_at": "desc"}},
Why change this?
It was a mistake, thank you for catching it.
litellm/proxy/proxy_server.py
Outdated
    )

    ### PRELOAD USERS INTO CACHE ###
    ProxyStartupEvent._start_user_preload_background_task(
Can this be an asyncio.create_task, so it does not block startup?
reviewed
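For illustration, a minimal sketch of the non-blocking pattern requested above. `preload_users` and `startup` are hypothetical stand-ins, not the PR's actual functions, and the dictionary stands in for the proxy's in-memory cache. The point is only that `asyncio.create_task` schedules the warm-up and lets startup return before the warm-up has run.

```python
import asyncio


async def preload_users(cache: dict, limit: int) -> None:
    # Hypothetical warm-up routine; a real one would query the database.
    for i in range(limit):
        await asyncio.sleep(0)  # yield to the event loop, as a DB call would
        cache[f"user-{i}"] = {"user_id": f"user-{i}"}


async def startup(cache: dict, limit: int) -> asyncio.Task:
    # Schedule the warm-up and return immediately. The caller keeps the
    # returned task so it is not garbage-collected before it finishes.
    return asyncio.create_task(preload_users(cache, limit))


async def main() -> tuple:
    cache: dict = {}
    task = await startup(cache, 3)
    size_at_startup = len(cache)  # startup returned before the preload ran
    await task                    # only this demo waits for the warm-up
    return size_at_startup, len(cache)


result = asyncio.run(main())
```

In a real server the task would typically be stored on a long-lived object rather than awaited, since awaiting it during startup would reintroduce the blocking behavior.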
litellm/proxy/proxy_server.py
Outdated

    ### PRELOAD USERS INTO CACHE ###
    if prisma_client is not None and general_settings is not None:
        preload_limit = general_settings.get("preload_users_limit", 0)
@AlexsanderHamir we want this to run by default. The current code requires the user to opt in by setting it in general_settings.
reviewed
litellm/proxy/_types.py
Outdated
        description="[DEPRECATED] Use 'user_header_mappings' instead. When set, the header value is treated as the end user id unless overridden by user_header_mappings.",
    )
    user_header_mappings: Optional[List[UserHeaderMapping]] = None
    preload_users_limit: Optional[int] = Field(
we don't need this
litellm/constants.py
Outdated
    MAX_SIZE_PER_ITEM_IN_MEMORY_CACHE_IN_KB = int(
        os.getenv("MAX_SIZE_PER_ITEM_IN_MEMORY_CACHE_IN_KB", 1024)
    )  # 1MB = 1024KB
    _DEFAULT_CACHE_WARMUP_USERS = 100
Call it DEFAULT_CACHE_WARMUP_USERS and allow it to be overridden using env vars.
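A minimal sketch of what the suggestion could look like, following the `os.getenv` pattern already used for `MAX_SIZE_PER_ITEM_IN_MEMORY_CACHE_IN_KB`. The environment variable name is an assumption (it simply reuses the constant's name), and `read_warmup_limit` is a hypothetical helper added here only to make the behavior easy to check.

```python
import os


def read_warmup_limit(default: int = 100) -> int:
    # Assumed env var name: same as the constant. Falls back to the
    # default of 100 from the diff when the variable is not set.
    return int(os.getenv("DEFAULT_CACHE_WARMUP_USERS", default))


DEFAULT_CACHE_WARMUP_USERS = read_warmup_limit()
```

With this shape, exporting `DEFAULT_CACHE_WARMUP_USERS=250` before starting the process would change the limit without a code change.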
litellm/proxy/proxy_server.py
Outdated
    if prisma_client is not None:
        default_preload_limit = _DEFAULT_CACHE_WARMUP_USERS
        preload_limit = (
            general_settings.get("preload_users_limit", default_preload_limit)
No need for general settings; just use the constant.
Title
Pre-load Users
Relevant issues
Pre-Submission checklist
Please complete all items before asking a LiteLLM maintainer to review your PR
tests/litellm/ directory, Adding at least 1 test is a hard requirement - see details
make test-unit
Type
🆕 New Feature
Changes
If a user already has a database and starts or restarts the proxy server, they can now choose to load the most recent users into memory.
Observations
This feature has room to be expanded, for example by loading the most active users instead of the most recent.
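To make the two selection strategies concrete, here is a small in-memory sketch. The `UserRow` shape, the `request_count` field, and `pick_users_to_warm` are all hypothetical and are not part of this PR; a real implementation would push the ordering and limit into the database query.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class UserRow:
    # Hypothetical row shape, used only for this illustration.
    user_id: str
    created_at: datetime
    request_count: int


def pick_users_to_warm(rows, limit, by="recent"):
    # "recent" orders by creation time, anything else by request volume.
    if by == "recent":
        key = lambda r: r.created_at
    else:
        key = lambda r: r.request_count
    ranked = sorted(rows, key=key, reverse=True)
    return [r.user_id for r in ranked[:limit]]


rows = [
    UserRow("a", datetime(2024, 1, 1), 50),
    UserRow("b", datetime(2024, 3, 1), 5),
    UserRow("c", datetime(2024, 2, 1), 20),
]
recent = pick_users_to_warm(rows, 2, by="recent")  # ["b", "c"]
active = pick_users_to_warm(rows, 2, by="active")  # ["a", "c"]
```

The two strategies can warm different users for the same limit, which is why the choice matters when the limit is small relative to the user table.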
Performance Improvements
With Cache Warmup
Without Cache Warmup