Skip to content

Update tantivy version #5863

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

fulmicoton-dd
Copy link
Collaborator

In quickwit-oss/tantivy#2679 we modified the prototype of the function that expose fields information.

The new method expose the size used per field for each tantivy datastructure.

This PR updates quickwit to:

  • compile
  • limit the number of fields exported in the packager to the 1k most prominent fields.

This is also a preliminary solution to address
#5832

In quickwit-oss/tantivy#2679 we modified the prototype of the function that expose fields information.

The new method expose the size used per field for each tantivy datastructure.

This PR updates quickwit to:
- compile
- limit the number of fields exported in the packager to the 1k most prominent fields.

This is also a preliminary solution to address
#5832
@fulmicoton-dd fulmicoton-dd requested a review from PSeitz August 13, 2025 12:49
let num_fields = fields_metadata.len();

let max_packaged_field_caps =
quickwit_common::get_from_env(QW_MAX_PACKAGED_FIELDS_CAPS_ENV_KEY, 1_000);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
quickwit_common::get_from_env(QW_MAX_PACKAGED_FIELDS_CAPS_ENV_KEY, 1_000);
quickwit_common::get_from_env(QW_MAX_PACKAGED_FIELDS_CAPS_ENV_KEY, QW_MAX_PACKAGED_FIELDS_CAPS_ENV_DEFAULT);

use tokio::runtime::Handle;
use tracing::{debug, info, instrument, warn};

const LIMIT_PACKAGED_FIELDS: usize = 1_000;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const LIMIT_PACKAGED_FIELDS: usize = 1_000;
const QW_MAX_PACKAGED_FIELDS_CAPS_ENV_DEFAULT: usize = 1_000;

);
fields_metadata
.into_iter()
.k_largest_by_key(LIMIT_PACKAGED_FIELDS, total_num_bytes)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.k_largest_by_key(LIMIT_PACKAGED_FIELDS, total_num_bytes)
.k_largest_by_key(max_packaged_field_caps, total_num_bytes)

@@ -268,6 +272,41 @@ fn try_extract_terms(
Ok(terms)
}

fn total_num_bytes(field_metadata: &FieldMetadata) -> u64 {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list fields also returns dynamic fields in a JSON field. How does total_num_bytes work with that? Does it return 0 or the size of the parent field?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants