Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .github/workflows/ci.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -63,8 +63,8 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v6
with:
go-version: "1.23.2"
- name: Validate contributors
go-version: "1.24"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was having weird issues with running Golang CI locally, and had to match the Go version.

- name: Validate README
run: go build ./cmd/readmevalidation && ./readmevalidation
- name: Remove build file artifact
run: rm ./readmevalidation
1 change: 0 additions & 1 deletion .github/workflows/golangci-lint.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@ on:
push:
branches:
- main
- master
pull_request:

permissions:
Expand Down
78 changes: 77 additions & 1 deletion cmd/readmevalidation/codermodules.go
Original file line number Diff line number Diff line change
Expand Up @@ -3,11 +3,84 @@ package main
import (
"bufio"
"context"
"regexp"
"strings"

"golang.org/x/xerrors"
)

var (
// Matches Terraform source lines with registry.coder.com URLs
// Pattern: source = "registry.coder.com/namespace/module/coder"
terraformSourceRe = regexp.MustCompile(`^\s*source\s*=\s*"` + registryDomain + `/([a-zA-Z0-9-]+)/([a-zA-Z0-9-]+)/coder"`)
)

func validateModuleSourceURL(rm coderResourceReadme) []error {
var errs []error

// Skip validation if we couldn't parse namespace/resourceName from path
if rm.namespace == "" || rm.resourceName == "" {
return []error{xerrors.Errorf("invalid module path format: %s", rm.filePath)}
}

expectedSource := registryDomain + "/" + rm.namespace + "/" + rm.resourceName + "/coder"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: since this variable isn't used for a while, we can move it down, closer to where it's used


trimmed := strings.TrimSpace(rm.body)
foundCorrectSource := false
Copy link
Member

@Parkreiner Parkreiner Oct 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: I think there's something we can salvage from this comment, but I realized in the second comment in the chain that it doesn't do everything we want

I don't know if this variable makes sense. With how it's set up right now, if we have at least one correct source in the README file, we'll automatically ignore all other incorrect sources. And from a state modeling perspective, the variable also feels redundant, when a single slice should be able to give us all the info we need

What makes more sense to me is to:

  1. Have incorrectSources as the main validation state
  2. Instead of having the actualSource == expectedSource check, only have a actualSource != expectedSource check. If that triggers, push the incorrect source to the incorrectSources slice. If things match, we'll do nothing, and let the code fall through to the rest of the loop
  3. Once the main loop is done and we're done processing the lines, check if the slice is not empty. If it's not, return an error with a list of all sources that are incorrect

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, now that I'm thinking over the code more, I think I understand what the old code was trying to do, and why my suggestion doesn't actually work

I'm not super up to speed on our Terraform conventions, but if a module has a name like module "blah", would it be expected that the source URL should always have "blah" in it, too? If so, I think we'd need to check the name of each module block, and compare that against the source URL line

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but if a module has a name like module "blah", would it be expected that the source URL should always have "blah" in it, too?

No it can be different. We just use the same name for convenience.

var incorrectSources []string
isInsideTerraform := false

lineScanner := bufio.NewScanner(strings.NewReader(trimmed))
for lineScanner.Scan() {
nextLine := lineScanner.Text()

if strings.HasPrefix(nextLine, "```") {
if strings.HasPrefix(nextLine, "```tf") {
isInsideTerraform = true
continue
}
Comment on lines +38 to +41
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an edge case, but I feel like we want to handle cases where someone accidentally nests Terraform snippets inside each other like this:

```tf
```tf

if isInsideTerraform {
// End of current terraform block, continue to look for more
isInsideTerraform = false
}
continue
}

if !isInsideTerraform {
continue
}

// Check for source line in terraform blocks
if matches := terraformSourceRe.FindStringSubmatch(nextLine); matches != nil {
actualNamespace := matches[1]
actualModule := matches[2]
actualSource := registryDomain + "/" + actualNamespace + "/" + actualModule + "/coder"

if actualSource == expectedSource {
foundCorrectSource = true
} else {
// Collect incorrect sources but don't return immediately
incorrectSources = append(incorrectSources, actualSource)
}
}
}

// If we found the correct source, ignore any incorrect ones
if foundCorrectSource {
return nil
}

// If we found incorrect sources but no correct one, report the first incorrect source
if len(incorrectSources) > 0 {
errs = append(errs, xerrors.Errorf("incorrect source URL format: found %q, expected %q", incorrectSources[0], expectedSource))
return errs
}
Comment on lines +73 to +77
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why we wouldn't want to report all incorrect sources?


// If we found no sources at all
errs = append(errs, xerrors.Errorf("did not find correct source URL %q in any Terraform code block", expectedSource))
return errs
}

func validateCoderModuleReadmeBody(body string) []error {
var errs []error

Expand Down Expand Up @@ -94,6 +167,9 @@ func validateCoderModuleReadme(rm coderResourceReadme) []error {
for _, err := range validateCoderModuleReadmeBody(rm.body) {
errs = append(errs, addFilePathToError(rm.filePath, err))
}
for _, err := range validateModuleSourceURL(rm) {
errs = append(errs, addFilePathToError(rm.filePath, err))
}
for _, err := range validateResourceGfmAlerts(rm.body) {
errs = append(errs, addFilePathToError(rm.filePath, err))
}
Expand Down Expand Up @@ -143,4 +219,4 @@ func validateAllCoderModules() error {
}
logger.Info(context.Background(), "all relative URLs for READMEs are valid", "resource_type", resourceType)
return nil
}
}
170 changes: 170 additions & 0 deletions cmd/readmevalidation/codermodules_test.go
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests generated by gemini CLI

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming we still want to process all the blocks in a file, I think it'd be really good to have some test cases that include 2+ blocks

Original file line number Diff line number Diff line change
Expand Up @@ -2,12 +2,70 @@ package main

import (
_ "embed"
"strings"
"testing"
)

//go:embed testSamples/sampleReadmeBody.md
var testBody string

// Test bodies extracted for better readability
var (
validModuleBody = `# Test Module

` + "```tf\n" + `module "test-module" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I'm just now realizing how the sample MD contents are formatted, and I think I'd prefer for each one to be a single raw string. All the + signs are a bit hard to read in the GitHub UI, and this is a case where all the spacing and lines do matter

source = "registry.coder.com/test-namespace/test-module/coder"
version = "1.0.0"
agent_id = coder_agent.example.id
}
` + "```\n"

wrongNamespaceBody = `# Test Module

` + "```tf\n" + `module "test-module" {
source = "registry.coder.com/wrong-namespace/test-module/coder"
version = "1.0.0"
agent_id = coder_agent.example.id
}
` + "```\n"

missingSourceBody = `# Test Module

` + "```tf\n" + `module "test-module" {
version = "1.0.0"
agent_id = coder_agent.example.id
}
` + "```\n"

multipleBlocksValidBody = `# Test Module

` + "```tf\n" + `module "other-module" {
source = "registry.coder.com/other/module/coder"
version = "1.0.0"
}
` + "```\n" + `
` + "```tf\n" + `module "test-module" {
source = "registry.coder.com/test-namespace/test-module/coder"
version = "1.0.0"
agent_id = coder_agent.example.id
}
` + "```\n"

multipleBlocksInvalidBody = `# Test Module

` + "```tf\n" + `module "test-module" {
source = "registry.coder.com/wrong-namespace/test-module/coder"
version = "1.0.0"
}
` + "```\n" + `
` + "```tf\n" + `module "other-module" {
source = "registry.coder.com/another-wrong/test-module/coder"
version = "1.0.0"
agent_id = coder_agent.example.id
}
` + "```\n"
)

func TestValidateCoderResourceReadmeBody(t *testing.T) {
t.Parallel()

Expand All @@ -20,3 +78,115 @@ func TestValidateCoderResourceReadmeBody(t *testing.T) {
}
})
}

func TestValidateModuleSourceURL(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the test setup is pretty good. Just to account for the other comment I added, though, I feel like we should have one more test that validates what happens if you have a block with an incorrect body, and 2+ other blocks with correct bodies

t.Parallel()

t.Run("Valid source URL format", func(t *testing.T) {
t.Parallel()

rm := coderResourceReadme{
resourceType: "modules",
filePath: "registry/test-namespace/modules/test-module/README.md",
namespace: "test-namespace",
resourceName: "test-module",
body: validModuleBody,
}
errs := validateModuleSourceURL(rm)
if len(errs) != 0 {
t.Errorf("Expected no errors, got: %v", errs)
}
})

t.Run("Invalid source URL format - wrong namespace", func(t *testing.T) {
t.Parallel()

rm := coderResourceReadme{
resourceType: "modules",
filePath: "registry/test-namespace/modules/test-module/README.md",
namespace: "test-namespace",
resourceName: "test-module",
body: wrongNamespaceBody,
}
errs := validateModuleSourceURL(rm)
if len(errs) != 1 {
t.Errorf("Expected 1 error, got %d: %v", len(errs), errs)
}
if !strings.Contains(errs[0].Error(), "incorrect source URL format") {
t.Errorf("Expected source URL format error, got: %s", errs[0].Error())
}
})

t.Run("Missing source URL", func(t *testing.T) {
t.Parallel()

rm := coderResourceReadme{
resourceType: "modules",
filePath: "registry/test-namespace/modules/test-module/README.md",
namespace: "test-namespace",
resourceName: "test-module",
body: missingSourceBody,
}
errs := validateModuleSourceURL(rm)
if len(errs) != 1 {
t.Errorf("Expected 1 error, got %d: %v", len(errs), errs)
}
if !strings.Contains(errs[0].Error(), "did not find correct source URL") {
t.Errorf("Expected missing source URL error, got: %s", errs[0].Error())
}
})

t.Run("Invalid file path format", func(t *testing.T) {
t.Parallel()

rm := coderResourceReadme{
resourceType: "modules",
filePath: "invalid/path/format",
namespace: "", // Empty because path parsing failed
resourceName: "", // Empty because path parsing failed
body: "# Test Module",
}
errs := validateModuleSourceURL(rm)
if len(errs) != 1 {
t.Errorf("Expected 1 error, got %d: %v", len(errs), errs)
}
if !strings.Contains(errs[0].Error(), "invalid module path format") {
t.Errorf("Expected path format error, got: %s", errs[0].Error())
}
})

t.Run("Multiple blocks with valid source in second block", func(t *testing.T) {
t.Parallel()

rm := coderResourceReadme{
resourceType: "modules",
filePath: "registry/test-namespace/modules/test-module/README.md",
namespace: "test-namespace",
resourceName: "test-module",
body: multipleBlocksValidBody,
}
errs := validateModuleSourceURL(rm)
if len(errs) != 0 {
t.Errorf("Expected no errors, got: %v", errs)
}
})

t.Run("Multiple blocks with incorrect source in second block", func(t *testing.T) {
t.Parallel()

rm := coderResourceReadme{
resourceType: "modules",
filePath: "registry/test-namespace/modules/test-module/README.md",
namespace: "test-namespace",
resourceName: "test-module",
body: multipleBlocksInvalidBody,
}
errs := validateModuleSourceURL(rm)
if len(errs) != 1 {
t.Errorf("Expected 1 error, got %d: %v", len(errs), errs)
}
if !strings.Contains(errs[0].Error(), "incorrect source URL format") {
t.Errorf("Expected source URL format error, got: %s", errs[0].Error())
}
})
}
26 changes: 20 additions & 6 deletions cmd/readmevalidation/coderresources.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,14 +18,15 @@ var (
supportedResourceTypes = []string{"modules", "templates"}
operatingSystems = []string{"windows", "macos", "linux"}
gfmAlertTypes = []string{"NOTE", "IMPORTANT", "CAUTION", "WARNING", "TIP"}
registryDomain = "registry.coder.com"

// TODO: This is a holdover from the validation logic used by the Coder Modules repo. It gives us some assurance, but
// realistically, we probably want to parse any Terraform code snippets, and make some deeper guarantees about how it's
// structured. Just validating whether it *can* be parsed as Terraform would be a big improvement.
terraformVersionRe = regexp.MustCompile(`^\s*\bversion\s+=`)

// Matches the format "> [!INFO]". Deliberately using a broad pattern to catch formatting issues that can mess up
// the renderer for the Registry website
// the renderer for the Registry website.
gfmAlertRegex = regexp.MustCompile(`^>(\s*)\[!(\w+)\](\s*)(.*)`)
)

Expand All @@ -39,7 +40,7 @@ type coderResourceFrontmatter struct {
}

// A slice version of the struct tags from coderResourceFrontmatter. Might be worth using reflection to generate this
// list at runtime in the future, but this should be okay for now
// list at runtime in the future, but this should be okay for now.
var supportedCoderResourceStructKeys = []string{
"description", "icon", "display_name", "verified", "tags", "supported_os",
// TODO: This is an old, officially deprecated key from the archived coder/modules repo. We can remove this once we
Expand All @@ -53,6 +54,8 @@ var supportedCoderResourceStructKeys = []string{
type coderResourceReadme struct {
resourceType string
filePath string
namespace string
resourceName string
body string
frontmatter coderResourceFrontmatter
}
Expand Down Expand Up @@ -183,9 +186,20 @@ func parseCoderResourceReadme(resourceType string, rm readme) (coderResourceRead
return coderResourceReadme{}, []error{xerrors.Errorf("%q: failed to parse: %v", rm.filePath, err)}
}

// Extract namespace and resource name from file path
// Expected path format: registry/<namespace>/<resourceType>/<resource-name>/README.md
var namespace, resourceName string
parts := strings.Split(path.Clean(rm.filePath), "/")
if len(parts) >= 5 && parts[0] == "registry" && parts[2] == resourceType && parts[4] == "README.md" {
namespace = parts[1]
resourceName = parts[3]
}

return coderResourceReadme{
resourceType: resourceType,
filePath: rm.filePath,
namespace: namespace,
resourceName: resourceName,
body: body,
frontmatter: yml,
}, nil
Expand Down Expand Up @@ -315,15 +329,15 @@ func validateResourceGfmAlerts(readmeBody string) []error {
}

// Nested GFM alerts is such a weird mistake that it's probably not really safe to keep trying to process the
// rest of the content, so this will prevent any other validations from happening for the given line
// rest of the content, so this will prevent any other validations from happening for the given line.
if isInsideGfmQuotes {
errs = append(errs, errors.New("registry does not support nested GFM alerts"))
errs = append(errs, xerrors.New("registry does not support nested GFM alerts"))
continue
}

leadingWhitespace := currentMatch[1]
if len(leadingWhitespace) != 1 {
errs = append(errs, errors.New("GFM alerts must have one space between the '>' and the start of the GFM brackets"))
errs = append(errs, xerrors.New("GFM alerts must have one space between the '>' and the start of the GFM brackets"))
}
isInsideGfmQuotes = true

Expand All @@ -347,7 +361,7 @@ func validateResourceGfmAlerts(readmeBody string) []error {
}
}

if gfmAlertRegex.Match([]byte(sourceLine)) {
if gfmAlertRegex.MatchString(sourceLine) {
errs = append(errs, xerrors.Errorf("README has an incomplete GFM alert at the end of the file"))
}

Expand Down
Loading