batched cosine / euclidean procedure for more efficient computation of vector similarities #4447

Open
@jexp

Description

Proposed procedure name: apoc.ml/algo.cosine(Similarity) or something like that.

Parameters:

  • list of nodes
  • property name
  • target embedding
  • top-k
  • threshold
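To make the parameter list concrete, here is a minimal sketch of how the five inputs could be grouped and validated in Java. Everything in it is an assumption for illustration: the class name `SimilarityRequest`, the field names, and the use of node ids as a stand-in for the list of nodes are hypothetical and not part of APOC.

```java
import java.util.List;
import java.util.Objects;

/** Hypothetical holder for the proposed parameters; all names are illustrative only. */
public final class SimilarityRequest {
    public final List<Long> nodeIds;      // stand-in for "list of nodes"
    public final String propertyName;     // property that stores each node's embedding
    public final float[] targetEmbedding; // embedding to compare against
    public final int topK;                // maximum number of results to return
    public final double threshold;        // minimum similarity a result must reach

    public SimilarityRequest(List<Long> nodeIds, String propertyName,
                             float[] targetEmbedding, int topK, double threshold) {
        this.nodeIds = Objects.requireNonNull(nodeIds, "nodeIds");
        this.propertyName = Objects.requireNonNull(propertyName, "propertyName");
        this.targetEmbedding = Objects.requireNonNull(targetEmbedding, "targetEmbedding");
        if (targetEmbedding.length == 0) {
            throw new IllegalArgumentException("targetEmbedding must not be empty");
        }
        if (topK <= 0) {
            throw new IllegalArgumentException("topK must be positive");
        }
        this.topK = topK;
        this.threshold = threshold;
    }
}
```

Validating once up front keeps the per-node loop free of argument checks.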

Efficient implementation without multiple conversions of data

  • Kernel API for property access?
  • SIMD / Java Vector API?
  • Early filtering and stop when top-k is reached or threshold not passed
  • Stream results out
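The filtering idea above can be sketched in plain Java, independent of Neo4j. This is only an illustration under stated assumptions: the class `BatchedCosine`, the method `topK`, and the result type `Scored` are hypothetical names, and the embeddings are assumed to be already available as `float[]` arrays (how they are read from node properties is left open). The sketch computes the target norm once, drops candidates below the threshold immediately, and keeps at most k results in a bounded min-heap. It does not stop scanning once k results are found, because with unsorted input a later candidate may still score higher.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of batched cosine similarity with threshold and top-k filtering. */
public final class BatchedCosine {

    /** One result row: position of the candidate in the input list and its similarity. */
    public static final class Scored {
        public final int index;
        public final double score;

        Scored(int index, double score) {
            this.index = index;
            this.score = score;
        }
    }

    public static List<Scored> topK(List<float[]> candidates, float[] target,
                                    int k, double threshold) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        // The target norm is computed once for the whole batch.
        double targetNorm = Math.sqrt(dot(target, target));
        Comparator<Scored> byScore = Comparator.comparingDouble((Scored s) -> s.score);
        // Min-heap: the weakest of the current best k sits at the head.
        PriorityQueue<Scored> heap = new PriorityQueue<>(k, byScore);

        for (int i = 0; i < candidates.size(); i++) {
            float[] v = candidates.get(i);
            if (v == null || v.length != target.length) {
                continue; // skip missing or mismatched embeddings
            }
            double norm = Math.sqrt(dot(v, v));
            if (norm == 0.0 || targetNorm == 0.0) {
                continue; // cosine is undefined for zero vectors
            }
            double score = dot(v, target) / (norm * targetNorm);
            if (score < threshold) {
                continue; // early filtering: below threshold, never enters the heap
            }
            if (heap.size() < k) {
                heap.add(new Scored(i, score));
            } else if (score > heap.peek().score) {
                heap.poll();
                heap.add(new Scored(i, score));
            }
        }

        List<Scored> out = new ArrayList<>(heap);
        out.sort(byScore.reversed()); // best match first
        return out;
    }

    // Scalar dot product; a SIMD variant could replace this loop.
    static double dot(float[] a, float[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }
}
```

The heap bounds memory to k entries regardless of how many nodes are scanned. The scalar `dot` loop is the natural place to try the Java Vector API (`jdk.incubator.vector`), which is still an incubator module and needs to be enabled explicitly.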

Benchmark large-scale performance, with say 1000 / 10k nodes, against the genai.vector.cosine function.
