[WIP] Implement 'Did You Mean' suggestions for schema validation #3999
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@copilot Work
Update test cases to reflect actual Levenshtein distance behavior:
- Exact matches are skipped (distance 0)
- Tests now use typos that are actually close to valid field names
- Updated expected messages to match new suggestion format

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Completed! The fuzzy matching implementation is now fully working with all tests passing. Changes made (commit 6451e75):
Key features:
Example output:
Pull Request Overview
This PR implements "Did You Mean" suggestions for schema validation errors using the Levenshtein distance algorithm. When users make typos in workflow frontmatter fields (e.g., permisions → permissions), the compiler now suggests the correct field names.
Key Changes:
- Replaced heuristic-based string matching with a proper Levenshtein distance algorithm in pkg/parser/schema.go
- Updated suggestion messages to be more helpful (single vs multiple suggestions)
- Added comprehensive test coverage for fuzzy matching logic
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pkg/workflow/schema_fuzzy_match.go | New file with Levenshtein distance implementation and helper functions (note: appears to contain unused code not integrated into production) |
| pkg/workflow/schema_fuzzy_match_test.go | Comprehensive tests for fuzzy matching logic with 39+ test cases |
| pkg/parser/schema.go | Integration of Levenshtein distance into existing schema validation with updated findClosestMatches() and generateFieldSuggestions() |
| pkg/parser/schema_suggestions_test.go | Updated tests to reflect Levenshtein distance behavior instead of heuristic matching |
pkg/workflow/schema_fuzzy_match.go
Outdated
```go
func levenshteinDistance(a, b string) int {
	aLen := len(a)
	bLen := len(b)

	// Early exit for empty strings
	if aLen == 0 {
		return bLen
	}
	if bLen == 0 {
		return aLen
	}

	// Create a 2D matrix for dynamic programming
	// We only need the previous row, so we can optimize space
	previousRow := make([]int, bLen+1)
	currentRow := make([]int, bLen+1)

	// Initialize the first row (distance from empty string)
	for i := 0; i <= bLen; i++ {
		previousRow[i] = i
	}

	// Calculate distances for each character in string a
	for i := 1; i <= aLen; i++ {
		currentRow[0] = i // Distance from empty string

		for j := 1; j <= bLen; j++ {
			// Cost of substitution (0 if characters match, 1 otherwise)
			cost := 1
			if a[i-1] == b[j-1] {
				cost = 0
			}

			// Minimum of:
			// - Deletion: previousRow[j] + 1
			// - Insertion: currentRow[j-1] + 1
			// - Substitution: previousRow[j-1] + cost
			currentRow[j] = min3(
				previousRow[j]+1,      // deletion
				currentRow[j-1]+1,     // insertion
				previousRow[j-1]+cost, // substitution
			)
		}

		// Swap rows for next iteration
		previousRow, currentRow = currentRow, previousRow
	}

	return previousRow[bLen]
}
```
Copilot AI, Nov 15, 2025
The levenshteinDistance function is duplicated between pkg/workflow/schema_fuzzy_match.go and pkg/parser/schema.go. The implementations are nearly identical except for the minimum calculation (lines 49-53 use min3() while the parser version uses nested min() calls).
According to the coding guidelines (CodingGuidelineID 1000002), code should avoid duplication. Consider extracting this function to a shared utility package (e.g., pkg/stringutil or similar) that both packages can import.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@copilot remove duplicate lev distance algo
Delete pkg/workflow/schema_fuzzy_match.go and associated tests as they contain unused code that duplicates the production implementation in pkg/parser/schema.go. The actual fuzzy matching logic is integrated directly into pkg/parser/schema.go and is working correctly. All tests pass. Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Removed the duplicate Levenshtein distance algorithm (commit a8a2a34). Deleted the duplicate implementation and its associated tests. The fuzzy matching feature continues to work correctly with the single implementation in the parser package. All tests pass.
@copilot Lint
- Run go fmt to fix formatting (removed trailing whitespace)
- Remove unused abs() function that was left from old heuristic-based matching
- All linting checks now pass

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
This commit updates the documentation based on features merged in the last 24 hours.

## Features Documented

### 1. Cross-repository --repo flag (PR #4007)
- Added --repo option to enable, disable, and logs commands
- Enables workflow management operations across different repositories
- Updated CLI reference with examples and option descriptions

### 2. Workflow description extraction (PR #4008)
- Documents that add and trial commands now display workflow descriptions
- Descriptions are extracted from frontmatter description field
- Provides better context about workflow purpose when adding or testing

### 3. Import cache for offline compilation (PR #3981)
- Remote imports are automatically cached in .github/aw/imports/
- Cache stores imports by commit SHA for efficient reuse
- Enables offline compilation once imports have been downloaded
- Updated CLI, imports reference, and packaging guides

### 4. "Did You Mean" schema validation suggestions (PR #3999)
- Compiler suggests correct field names for typos using fuzzy matching
- Based on Levenshtein distance algorithm
- Added tip callout and new error documentation section
- Includes examples of common typos detected

## Files Modified
- docs/src/content/docs/setup/cli.md
- docs/src/content/docs/reference/imports.md
- docs/src/content/docs/guides/packaging-imports.md
- docs/src/content/docs/troubleshooting/errors.md

## Related PRs
- #4007 - Add --repo options to more commands
- #4008 - Add workflow description extraction feature
- #3981 - Add import cache for offline workflow compilation
- #3999 - Implement 'Did You Mean' suggestions for schema validation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Implementation: "Did You Mean" Suggestions for Schema Validation
Objective
Implement fuzzy string matching to suggest correct field names when users make typos in workflow frontmatter fields using Levenshtein distance algorithm.
✅ Completed
- … pkg/parser/schema.go
- … findClosestMatches() to use Levenshtein distance instead of heuristics
- … generateFieldSuggestions() for better message formatting
- … pkg/workflow/ package
- … (make test-unit)
- … (make lint)

Implementation Details
Levenshtein Distance Algorithm:
- … pkg/parser/schema.go

Suggestion Logic:
Integration:
- … pkg/parser/schema.go

Examples
Single typo:
Multiple typos:
Test Coverage