-
Notifications
You must be signed in to change notification settings - Fork 248
workflow/verify links #819
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
||
- name: Link Checker | ||
id: lychee | ||
uses: lycheeverse/lychee-action@v2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if anyone watched scheduled jobs, for Cockpit we make issues. Owners will spot these issues but not sure if anyone else would.
Did you try running lychee locally? Because without any changes it generates this:
🔍 4734 Total (in 1m 3s) ✅ 4317 OK 🚫 410 Errors 👻 6 Excluded ⏳ 1 Timeouts
It also will require a GITHUB_TOKEN otherwise we run into rate limits.
903c337
to
2c79ee8
Compare
.github/workflows/verify-links.yml
Outdated
token: ${{ secrets.GITHUB_TOKEN }} | ||
|
||
- name: Find the last open report issue | ||
if: steps.lychee.outputs.exit_code != 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The summary page shows that it found some errors - can you please explain when will lychee fail and make an issue to notify us and what kind of problems?
(Maybe I'm just missing the point of link checkers)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Cockpit we have tools/urls-check
which creates an issue if the link check fails. Thats what this PR attempts to also do.
We generally haven't used much external actions so I am wondering if we want too.
id: lychee | ||
uses: lycheeverse/lychee-action@v2 | ||
with: | ||
fail: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I understand right this does not affect the following step and its check if: steps.lychee.outputs.exit_code != 0
.github/workflows/verify-links.yml
Outdated
--base . | ||
--skip-missing | ||
--exclude-all-private | ||
--exclude '^https://linux.die.net' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are those left out, anubis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't. Anubis doesn't block curl or other user-agents, only the ones that masquerade themselves as "human" browsers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC it was 403 forbidden. Likely due to detecting it's from GitHub and just flatout declining it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yaml allows comments so would be good to add a reason
Alright, how do we want to proceed here? There are still tons of issues with urls, I think merging this without the issue create/update is fine for now but then urls need to be fixed. |
Verifies that links within the project are alive and doesn't go to outdated or otherwise dead links. Signed-off-by: Freya Gustavsson <freya@venefilyn.se>
94a943a
to
4ac9a18
Compare
e9685c3
to
f695b34
Compare
Signed-off-by: Freya Gustavsson <freya@venefilyn.se>
4d690ce
to
29b0609
Compare
Signed-off-by: Freya Gustavsson <freya@venefilyn.se>
29b0609
to
9ab36d0
Compare
@jelly I updated the code to be a bit better. PTAL at the output and let me know if there is anything specific that stands out that needs to be changed with config 🙏 https://github.com/cockpit-project/cockpit-project.github.io/actions/runs/18015821705?pr=819 |
- name: Find the last open report issue | ||
if: steps.lychee.outputs.exit_code != 0 | ||
id: last-issue | ||
uses: micalevisk/last-issue-action@v2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one seems to be dead for 2 years without releases and I am slightly concerned about supply chain attacks here.
It essentially does one query which seems to be trivially rewritten in JavaScript in the workflow like we do for dependabot
:
try {
await github.rest.issues.removeLabel({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
name: 'node_modules'
});
} catch (e) {
if (e.name == 'HttpError' && e.status == 404) {
/* expected: 404 if label is unset */
} else {
throw e;
}
}
But then github.rest.issues
using actions/github-script
Getting the issue might be as simple as octokit.rest.issues.listForRepo({ labels: ["link-checker"] })
and we always create an urls-check issue with a fixed label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use the hash for taking the release and update it with Renovate when there are new versions (if ever). Would at least ensure that the release itself is the one we expect.
uses: micalevisk/last-issue-action@v2 | |
uses: micalevisk/last-issue-action@044e1cb7e9a4dde20e22969cb67818bfca0797be # v2 |
But yeah might be better to use something like that if it works
with: | ||
title: Link Checker Report | ||
content-filepath: ./lychee/out.md | ||
issue-number: ${{ steps.last-issue.outputs.issue-number }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is no issue is found? Does this accept null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is an optional field so yeah it would parse as not having provided any.
In this repo I see that they were successful with this approach:
tldr-pages/tldr-maintenance#129
Given their workflow:
https://github.com/tldr-pages/tldr-maintenance/blob/main/.github/workflows/check-links.yml
So it should work for us too. Their approach is to keep the issue open and update it regardless if the link checker failed or not. Whereas in this PR I've set it so it opens an issue or updates an existing one if something fails. So there is a slight difference.
They also have a step to close the issue which we could adopt (though theirs doesn't work atm due to their if-statement it seems).
I really like the idea of link checking but I see that this PR now also pushes an artifact of the build website and then in another action check the links? Why can't we do this in a normal git checkout? |
Splitting it out makes it cleaner as we just have to build once. The artifact should be in GitHub runner's cache too IIRC so basically takes no time for it to "download" the artifact. Otherwise we'd have to combine build and publish into one which isn't as clean, and we have to add the same build step for link checker as well. We can't check our source as-is because we construct URLs and pages on build-time. Like for categories, authors, anything internal with |
Verifies links through Lychee within the project are alive and doesn't go to
outdated or otherwise dead links.