@aunjgr aunjgr commented Nov 12, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #22832

What this PR does / why we need it:

If a table has a vector index with distance type l2, queries that use the l2sq distance function should be able to use that index.


PR Type

Enhancement


Description

  • Enable L2 index usage for L2sq distance queries

  • Store original distance function name in index configuration

  • Transform distances based on original function type

  • Support distance metric conversion between L2 and L2sq
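The reason an L2 index can serve L2sq queries is that squaring is monotonic on non-negative values, so both metrics rank neighbours in the same order and only the reported distance value differs. A minimal sketch of that property follows; the helper names (`l2`, `l2sq`, `rank`) are illustrative and not taken from the PR.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// l2sq returns the squared Euclidean distance between a and b.
// It assumes both slices have the same length.
func l2sq(a, b []float64) float64 {
	var sum float64
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// l2 returns the Euclidean distance, i.e. the square root of l2sq.
func l2(a, b []float64) float64 {
	return math.Sqrt(l2sq(a, b))
}

// rank returns the indices of vecs ordered by increasing distance to query.
func rank(query []float64, vecs [][]float64, dist func(a, b []float64) float64) []int {
	idx := make([]int, len(vecs))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(i, j int) bool {
		return dist(query, vecs[idx[i]]) < dist(query, vecs[idx[j]])
	})
	return idx
}

func main() {
	query := []float64{0, 0}
	vecs := [][]float64{{3, 4}, {1, 1}, {0, 2}}
	// Both calls produce the same neighbour order.
	fmt.Println(rank(query, vecs, l2))
	fmt.Println(rank(query, vecs, l2sq))
}
```

Because the order is identical, the index search itself does not need to change; only the distance returned to the query has to be converted.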


Diagram Walkthrough

flowchart LR
  Query["Query with L2sq distance"] -->|Store orig_func_name| Config["Index Table Config"]
  Config -->|Pass to search| Search["HNSW/IVFFlat Search"]
  Search -->|Transform distance| Transform["Distance Transform"]
  Transform -->|Convert L2sq to L2| Result["Final Results"]

File Walkthrough

Relevant files
Enhancement
apply_indices_hnsw.go
Add original function name to HNSW config                               

pkg/sql/plan/apply_indices_hnsw.go

  • Store original distance function name in origFuncName variable
  • Pass orig_func_name parameter to HNSW table function configuration
  • Enable distance function name propagation to search execution
+5/-3     
apply_indices_ivfflat.go
Add original function name to IVFFlat config                         

pkg/sql/plan/apply_indices_ivfflat.go

  • Store original distance function name in origFuncName variable
  • Pass orig_func_name parameter to IVFFlat table function configuration
  • Add debug logging for unsupported distance functions
  • Enable distance function name propagation to search execution
+6/-3     
search.go
Use original function name for distance transformation     

pkg/vectorindex/hnsw/search.go

  • Use DistFuncNameToMetricType mapping with original function name
  • Transform distances based on original query distance function type
  • Update UpdateConfig to preserve original function name
+2/-2     
search.go
Use original function name for distance transformation     

pkg/vectorindex/ivfflat/search.go

  • Use DistFuncNameToMetricType mapping with original function name
  • Transform distances based on original query distance function type
  • Update UpdateConfig to preserve original function name
+2/-2     
types.go
Support L2sq to L2 distance transformation                             

pkg/vectorindex/metric/types.go

  • Map L2sq distance function to L2 index operation type
  • Update DistanceTransformHnsw to accept MetricType instead of string
  • Update DistanceTransformIvfflat to accept MetricType parameters
  • Convert distances returned by an index that uses the L2sq metric to L2
    (square root) when the query uses the L2 distance function
+7/-7     
types.go
Add original function name to config struct                           

pkg/vectorindex/types.go

  • Add OrigFuncName field to IndexTableConfig struct
  • Store original distance function name from query in configuration
  • Enable passing original function name through index search pipeline
+1/-0     
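As a rough illustration of how the new field could travel through the pipeline, the sketch below decodes a JSON config string into a struct. Only `OrigFuncName` and its `orig_func_name` tag come from the PR; the struct name, the other Go field names, and the sample values are assumptions inferred from the JSON keys.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// indexTableConfig is a hypothetical stand-in for vectorindex.IndexTableConfig.
// Only OrigFuncName and its tag are taken from the PR; the remaining field
// names are guesses based on the JSON keys the planner emits.
type indexTableConfig struct {
	DbName        string `json:"db"`
	SrcTable      string `json:"src"`
	MetadataTable string `json:"metadata"`
	IndexTable    string `json:"index"`
	ThreadsSearch int64  `json:"threads_search"`
	OrigFuncName  string `json:"orig_func_name"`
}

// parseConfig decodes the JSON config string produced at plan time.
func parseConfig(s string) (indexTableConfig, error) {
	var cfg indexTableConfig
	err := json.Unmarshal([]byte(s), &cfg)
	return cfg, err
}

func main() {
	raw := `{"db": "d", "src": "t", "metadata": "m", "index": "i", "threads_search": 4, "orig_func_name": "l2_distance_sq"}`
	cfg, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	// The search code can now look up the metric for the query's function.
	fmt.Println(cfg.OrigFuncName)
}
```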


qodo-merge-pro bot commented Nov 12, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified: no security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Stdout logging: Writing "IVFFlat: Unsupported distance function" to stdout with fmt.Println is
not robust error handling and lacks actionable context and proper logging.

Referred Code
origFuncName := distFnExpr.Func.ObjName
if opType != metric.DistFuncOpTypes[origFuncName] {
	fmt.Println("IVFFlat: Unsupported distance function")
	return nodeID, nil

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status:
Insecure logging: Using fmt.Println for operational messages bypasses structured logging and log levels,
reducing observability and control required for secure logging practices.

Referred Code
origFuncName := distFnExpr.Func.ObjName
if opType != metric.DistFuncOpTypes[origFuncName] {
	fmt.Println("IVFFlat: Unsupported distance function")
	return nodeID, nil

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Missing auditing: The new indexing selection path and configuration emission add decision logic and
parameters without any accompanying audit logs of the action, user, or outcome.

Referred Code
if err != nil {
	return nodeID, nil
}
opType, err := opTypeAst.StrictString()
if err != nil {
	return nodeID, nil
}

origFuncName := distFnExpr.Func.ObjName
if opType != metric.DistFuncOpTypes[origFuncName] {
	fmt.Println("IVFFlat: Unsupported distance function")
	return nodeID, nil

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Config propagation: Newly added "orig_func_name" is propagated from parsed expressions into JSON
config and used downstream without visible validation in the diff, which may require
confirmation that upstream validation/sanitization exists.

Referred Code
tblCfgStr := fmt.Sprintf(`{"db": "%s", "src": "%s", "metadata":"%s", "index":"%s", "threads_search": %d, "orig_func_name": "%s"}`,
	scanNode.ObjRef.SchemaName,
	scanNode.TableDef.Name,
	metaDef.IndexTableName,
	idxDef.IndexTableName,
	nThread.(int64),
	origFuncName)

Compliance status legend
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label


qodo-merge-pro bot commented Nov 12, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

Category | Suggestion | Impact
High-level
Refactor how query context is passed

Instead of passing the original query function name via a JSON string and
IndexTableConfig, pass it through a dedicated runtime parameter structure like
RuntimeConfig. This separates query-specific context from static index
configuration, leading to a more robust and extensible design.

Examples:

pkg/vectorindex/types.go [57]
	OrigFuncName  string `json:"orig_func_name"`
pkg/sql/plan/apply_indices_hnsw.go [97-103]
	tblCfgStr := fmt.Sprintf(`{"db": "%s", "src": "%s", "metadata":"%s", "index":"%s", "threads_search": %d, "orig_func_name": "%s"}`,
		scanNode.ObjRef.SchemaName,
		scanNode.TableDef.Name,
		metaDef.IndexTableName,
		idxDef.IndexTableName,
		nThread.(int64),
		origFuncName)

Solution Walkthrough:

Before:

// In pkg/sql/plan/apply_indices_*.go
origFuncName := distFnExpr.Func.ObjName
tblCfgStr := fmt.Sprintf(`{..., "orig_func_name": "%s"}`, ..., origFuncName)
// ... pass tblCfgStr to table function

// In pkg/vectorindex/types.go
type IndexTableConfig struct {
    // ... other fields
    OrigFuncName  string `json:"orig_func_name"`
}

// In pkg/vectorindex/hnsw/search.go
func (s *HnswSearch[T]) Search(..., rt vectorindex.RuntimeConfig) {
    // ...
    origMetricType := metric.DistFuncNameToMetricType[s.Tblcfg.OrigFuncName]
    sr.Distance = metric.DistanceTransformHnsw(sr.Distance, origMetricType, ...)
    // ...
}

After:

// In pkg/sql/plan/apply_indices_*.go
origFuncName := distFnExpr.Func.ObjName
// ...
// Set the original function name in the runtime config
// which is passed to the search function.
// The JSON config string is no longer modified.

// In pkg/vectorindex/types.go
type RuntimeConfig struct {
    Limit             uint
    // ... other fields
    OrigFuncName      string
}
// IndexTableConfig is reverted to its original state.

// In pkg/vectorindex/hnsw/search.go
func (s *HnswSearch[T]) Search(..., rt vectorindex.RuntimeConfig) {
    origMetricType := metric.DistFuncNameToMetricType[rt.OrigFuncName]
    sr.Distance = metric.DistanceTransformHnsw(sr.Distance, origMetricType, ...)
}
Suggestion importance[1-10]: 8

Why: The suggestion correctly identifies a design flaw in passing query-time context (OrigFuncName) via a static configuration structure (IndexTableConfig) and proposes a more robust alternative using RuntimeConfig, which is the idiomatic place for such parameters.

Medium
Possible issue
Handle missing distance transformation case

Update DistanceTransformHnsw to handle the case where an L2-squared distance
query is performed on an L2 distance index by squaring the returned distance.

pkg/vectorindex/metric/types.go [123-129]

 func DistanceTransformHnsw(dist float64, origMetricType MetricType, metricType usearch.Metric) float64 {
 	if origMetricType == Metric_L2Distance && metricType == usearch.L2sq {
 		// metric is l2sq but origin is l2_distance
 		return math.Sqrt(dist)
+	} else if origMetricType == Metric_L2sqDistance && metricType == usearch.L2 {
+		// metric is l2 but origin is l2_distance_sq
+		return dist * dist
 	}
 	return dist
 }
Suggestion importance[1-10]: 8

Why: The suggestion correctly identifies a missing case in the new distance transformation logic, which would lead to incorrect distance values for certain query and index combinations, thus fixing a functional bug.
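The two-way conversion described in this suggestion can be sketched without the real dependencies. The types below (`queryMetric`, `indexMetric`) are local stand-ins for `MetricType` and `usearch.Metric`, and `transformDistance` is a hypothetical name, so this shows the intended arithmetic rather than the project's actual function.

```go
package main

import (
	"fmt"
	"math"
)

// queryMetric is a stand-in for the metric requested by the query.
type queryMetric int

const (
	queryL2 queryMetric = iota
	queryL2sq
)

// indexMetric is a stand-in for the metric the index computes internally.
type indexMetric int

const (
	indexL2 indexMetric = iota
	indexL2sq
)

// transformDistance converts a distance reported by the index into the
// metric the query asked for. Matching metrics pass through unchanged.
func transformDistance(dist float64, q queryMetric, idx indexMetric) float64 {
	switch {
	case q == queryL2 && idx == indexL2sq:
		// Index reports squared distance, query wants plain L2.
		return math.Sqrt(dist)
	case q == queryL2sq && idx == indexL2:
		// Index reports plain L2, query wants squared distance.
		return dist * dist
	}
	return dist
}

func main() {
	fmt.Println(transformDistance(25, queryL2, indexL2sq))   // 5
	fmt.Println(transformDistance(5, queryL2sq, indexL2))    // 25
	fmt.Println(transformDistance(25, queryL2sq, indexL2sq)) // 25
}
```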

Medium
Labels

kind/feature Review effort 2/5 size/S Denotes a PR that changes [10,99] lines
