-
Notifications
You must be signed in to change notification settings - Fork 481
Update tantivy version #5863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Update tantivy version #5863
Conversation
In quickwit-oss/tantivy#2679 we modified the prototype of the function that expose fields information. The new method expose the size used per field for each tantivy datastructure. This PR updates quickwit to: - compile - limit the number of fields exported in the packager to the 1k most prominent fields. This is also a preliminary solution to address #5832
let num_fields = fields_metadata.len(); | ||
|
||
let max_packaged_field_caps = | ||
quickwit_common::get_from_env(QW_MAX_PACKAGED_FIELDS_CAPS_ENV_KEY, 1_000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quickwit_common::get_from_env(QW_MAX_PACKAGED_FIELDS_CAPS_ENV_KEY, 1_000); | |
quickwit_common::get_from_env(QW_MAX_PACKAGED_FIELDS_CAPS_ENV_KEY, QW_MAX_PACKAGED_FIELDS_CAPS_ENV_DEFAULT); |
use tokio::runtime::Handle; | ||
use tracing::{debug, info, instrument, warn}; | ||
|
||
const LIMIT_PACKAGED_FIELDS: usize = 1_000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const LIMIT_PACKAGED_FIELDS: usize = 1_000; | |
const QW_MAX_PACKAGED_FIELDS_CAPS_ENV_DEFAULT: usize = 1_000; |
); | ||
fields_metadata | ||
.into_iter() | ||
.k_largest_by_key(LIMIT_PACKAGED_FIELDS, total_num_bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.k_largest_by_key(LIMIT_PACKAGED_FIELDS, total_num_bytes) | |
.k_largest_by_key(max_packaged_field_caps, total_num_bytes) |
@@ -268,6 +272,41 @@ fn try_extract_terms( | |||
Ok(terms) | |||
} | |||
|
|||
fn total_num_bytes(field_metadata: &FieldMetadata) -> u64 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
list fields also returns dynamic fields in a JSON field. How does total_num_bytes
work with that? Does it return 0 or the size of the parent field?
In quickwit-oss/tantivy#2679 we modified the prototype of the function that expose fields information.
The new method expose the size used per field for each tantivy datastructure.
This PR updates quickwit to:
This is also a preliminary solution to address
#5832