Description
Is your feature request related to a problem? Please describe.
Thanks for providing such an amazing piece of work. Quickwit provides (almost) everything we need for our platform.
Our workload pattern is very similar to what is described in #5445, at a much smaller scale. We currently have 11 indexes. By total split size, 2 indexes range from 100 TB to 150 TB, 1 index is about 20 TB, and the others are well below 1 TB.
The LRU cache strategy is very brittle against "big scans" that run every now and then (fewer than 10 times a day). Some work (#5469) has been done to support an LFU strategy, which might help, but it still lacks flexibility.
In our case, caching is not for performance. Quickwit with no disk cache is blazing fast, which is where Quickwit's engineering truly shines. Long-range queries with term conditions (trace_id = xxx) cannot be cached effectively anyway, so downloading all splits to local disk won't help.
The actual value of the cache for us is that S3 requests are greatly reduced for queries that repeatedly hit the same data, which saves money and makes some patterns economically viable (100+ TPS of reads on the last x days of data).
Describe the solution you'd like
To mitigate the cache churn issue, I would like Quickwit to support the following features:
- Support a customized cache strategy for each index instead of for the whole cluster (also mentioned in Tunable Cache Eviction Policies #5445). For most write-heavy, read-never workloads (yep, logs), users are not very sensitive to latency, so we can simply disable the cache, which saves tons of disk space.
- Support a time range condition for cache fetching. A new configuration option, "cache_within", can be specified for each index, and only splits within that time range are downloaded. For example, if someone queries an index with cache_within set to 7d (7 days), only splits that are less than 7 days old relative to now are downloaded and cached. Other splits simply stay in object storage.
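To make the two requests concrete, here is a minimal sketch of the decision we have in mind. Everything in it is hypothetical: `CachePolicy`, `should_cache_split`, and the choice to compare against a split's end timestamp are our assumptions for illustration, not existing Quickwit types or behavior.

```rust
use std::time::Duration;

/// Hypothetical per-index cache policy (not an existing Quickwit type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CachePolicy {
    /// Never download splits of this index into the local disk cache.
    Disabled,
    /// Cache every split that a query touches.
    All,
    /// Cache only splits younger than the given window ("cache_within").
    Within(Duration),
}

/// Decide whether a split should be downloaded into the disk cache.
///
/// `split_end_ts` and `now_ts` are Unix timestamps in seconds.
/// Assumption: a split's age is measured from the end of its time range.
pub fn should_cache_split(policy: CachePolicy, split_end_ts: u64, now_ts: u64) -> bool {
    match policy {
        CachePolicy::Disabled => false,
        CachePolicy::All => true,
        CachePolicy::Within(window) => {
            // saturating_sub treats a split timestamp in the future as age 0.
            let age_secs = now_ts.saturating_sub(split_end_ts);
            age_secs < window.as_secs()
        }
    }
}
```

With `CachePolicy::Within(Duration::from_secs(7 * 24 * 60 * 60))`, a split that ended 3 days ago would be cached, while one that ended 30 days ago would be read from object storage on every query. A big scan over old data would then no longer evict the recent splits.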
Describe alternatives you've considered
For feature request 1, we evaluated a potential solution that uses 2 searcher clusters to handle 2 groups of indexes (with cache and without cache). It would not be hard to add some routing logic to the HTTP proxy in front of Quickwit, since we have to do authentication there anyway.
For feature request 2, we believe there is no alternative.
Additional context
Out of curiosity, what's the plan for this project? Are you willing to take contributions? If so, we would like to try working on this issue.