Many locations are listed repeatedly in the search index #2413

@LilithHafner

While examining the search index at https://docs.julialang.org/en/v1.11-dev/#, I noticed that many items appear multiple times under the same category, location, page, and title (though with different text).

d = documenterSearchIndex.docs; d.length
10083
all_but_text = d.map(function f(dd) {return dd.category + dd.location + dd.page + dd.title;}); all_but_text.length
10083
new Set(all_but_text).size
3854
all_incl_text = d.map(function f(dd) {return dd.category + dd.location + dd.page + dd.title + dd.text;}); all_incl_text.length
10083
new Set(all_incl_text).size
10046

I imagine that aggregation at index-creation time will improve runtime performance slightly without much alteration to result ordering.

One way to aggregate these semi-duplicates is to concatenate their texts.
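
As a rough sketch of that concatenation approach, the snippet below groups entries by category, location, page, and title, then joins their texts. The helper name `aggregateDocs`, the newline separator, and the toy data are assumptions for illustration only. This is not how Documenter builds its index, where the aggregation would happen at index-creation time.

```javascript
// Hypothetical helper (not part of Documenter): merge entries that share
// category, location, page, and title by concatenating their `text` fields.
function aggregateDocs(docs) {
  const merged = new Map(); // Map keeps first-seen order of keys
  for (const doc of docs) {
    // JSON.stringify of the field array gives an unambiguous key,
    // unlike plain string concatenation of the four fields.
    const key = JSON.stringify([doc.category, doc.location, doc.page, doc.title]);
    const existing = merged.get(key);
    if (existing === undefined) {
      merged.set(key, { ...doc }); // shallow copy so the input is not mutated
    } else {
      existing.text += "\n" + doc.text; // separator choice is an assumption
    }
  }
  return Array.from(merged.values());
}

// Toy data standing in for documenterSearchIndex.docs.
const sample = [
  { category: "section", location: "a/#x", page: "A", title: "X", text: "first" },
  { category: "section", location: "a/#x", page: "A", title: "X", text: "second" },
  { category: "page", location: "b/", page: "B", title: "B", text: "only" },
];
const aggregated = aggregateDocs(sample);
console.log(aggregated.length); // 2
```

In the browser console, `aggregateDocs(documenterSearchIndex.docs).length` should match the `new Set(...)` count above, provided no two distinct field combinations happen to concatenate to the same string.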
