Skip to content

Redirects: new type to support URLs without trailing slash #12323

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

humitos
Copy link
Member

@humitos humitos commented Jul 15, 2025

This PR adds a new redirect type: CLEAN_URL_WITHOUT_TRAILING_SLASH_TO_HTML.

  • /en/latest/tabs -> /en/latest/tabs.html
  • /en/stable/guides/myst -> /en/stable/guides/myst.html

This should be useful for all those static frameworks that generates URLs
without trailing slashes. The redirect has to be created with force=True to
support these cases.

▶ curl -sI http://test-builds.devthedocs.org/en/latest/tabs | grep -i location
Location: http://test-builds.devthedocs.org/en/latest/tabs.html

▶ curl -sI http://test-builds.devthedocs.org/en/full-feature/tabs | grep -i location
Location: http://test-builds.devthedocs.org/en/full-feature/tabs.html
                                                                                                                                                                                           
▶ curl -sI http://test-builds.devthedocs.org/en/latest/guides/quickstart | grep -i location
Location: http://test-builds.devthedocs.org/en/latest/guides/quickstart.html

Closes #11220

humitos added 2 commits July 15, 2025 14:34
This PR adds a new redirect type: `CLEAN_URL_WITHOUT_TRAILING_SLASH_TO_HTML`.

- `/en/latest/tabs` -> `/en/latest/tabs.html`
- `/en/stable/guides/myst` -> `/en/stable/guides/myst.html`

This should be useful for all those static frameworks that generates URLs
without trailing slashes. The redirect has to be created with `force=True` to
support these cases.

Closes #11220
@humitos humitos requested a review from a team as a code owner July 15, 2025 12:52
@humitos humitos requested a review from ericholscher July 15, 2025 12:52
The custom redirect wasn't working originally because it was not marked as
`force=True`. We can revert this change now.
@ericholscher
Copy link
Member

I really don't love adding new types of redirects -- I feel like the whole concept is a bit of a hack, and we should really be trying to get to a place where people can define these patterns easily with regex or similar. This just adds a lot of complexity to every user who ever creates a redirect, where they have to see this type, which they likely don't want. I don't know if there's a better solution in the short term, but I really don't love adding more complexity here if we can avoid it 🙃

@humitos
Copy link
Member Author

humitos commented Jul 16, 2025

Interesting. My position is the opposite here: adding a new redirect makes the feature to be completely isolated and only affect users that create this redirect. It doesn't change any behavior for anybody else. This feature is comparable with the other two redirects we have for Sphinx (migrating from/to dirhtml builder: file/ -> file.html and file.html -> file/). In my opinion this is a quick and simple solution that follows the pattern we have currently.

I'm not opposed on discussing about other more generic ways to refactor/re-structure redirects, but that sounds a lot more work and a medium term goal. Jumping into that now would be a distraction from our current goals and won't add too much value immediately.

@ericholscher
Copy link
Member

ericholscher commented Jul 16, 2025

This just feels like a really weird concept. I don't know of any HTTP concept where /foo should mean foo.html instead of /foo/index.html?

It seems like what we really want is a way to always ensure we're ending URLs with a /, and if the URL doesn't include a /, then we add it?

@humitos
Copy link
Member Author

humitos commented Jul 16, 2025

This just feels like a really weird concept. I don't know of any HTTP concept where /foo should mean foo.html instead of /foo/index.html?

Yes, I agree, and I don't like it either -- but that's how a lot of js SSG framework work and some hosting too (eg. Netlify), unfortunately. We need to give those users a solution without breaking all our current behavior.

It seems like what we really want is a way to always ensure we're ending URLs with a /, and if the URL doesn't include a /, then we add it?

If we add a trailing / we will be doing /foo -> /foo/ -> foo/index.html with the current Proxito code. Changing that behavior will have a lot more unexpected impact and that's why I didn't go in that direction.

@ericholscher
Copy link
Member

It seems like what we really want is a way to always ensure we're ending URLs with a /, and if the URL doesn't include a /, then we add it?

If we add a trailing / we will be doing /foo -> /foo/ -> foo/index.html with the current Proxito code. Changing that behavior will have a lot more unexpected impact and that's why I didn't go in that direction.

Yes, I'm not suggesting we do it by default, but that should be the option we implement, not redirecting /foo to /foo.html. That just feels weird and broken to me, but I must just be old?

Overall I'd be much happier having /foo to /foo.html be a special case that we support, and then users have to define a /foo/ to foo.html redirect on their own, but maybe that's just more complicated? The whole thing just makes me sad.

@humitos
Copy link
Member Author

humitos commented Jul 21, 2025

Overall I'd be much happier having /foo to /foo.html be a special case that we support

This is the case this PR implements as a user redirect they can enable by project. This is what Netlify, GitHub Pages and Vercel do, so I think it makes sense we also support it.

I checked with AI, and it seems this is not specified in the RFC and it's an implementation detail in the server. These hosting providers try to hide the .html when they can and call them "Clean URLs" / "Pretty URLs".

and then users have to define a /foo/ to foo.html redirect on their own, but maybe that's just more complicated?

This is not supported and it should not work like that. /foo/ should always serve /foo/index.html.

@humitos
Copy link
Member Author

humitos commented Jul 21, 2025

It's worth to note that Jupyter Book has implemented an option to enable creating /index.html files, so they work by default on Read the Docs: jupyter-book/mystmd#2178

However, Jupyter Book without this option won't work on Read the Docs. They same happens today with Docusaurus when trailingSlash: false is used.

I think implementing this feature is important to support more documentation tools as user expect to be supported and avoid them migrating to Netlify/GitHub Pages/Vercel due to this -- thinking that Read the Docs doesn't work for their documentation tool.

@humitos
Copy link
Member Author

humitos commented Jul 21, 2025

For example, a static file named about.html will be served when visiting the /about path. Visiting /about.html will redirect to /about.

(Vercel documentation that explains this: https://vercel.com/docs/project-configuration#cleanurls)

@ericholscher
Copy link
Member

The problem is that we have to explain to these users somehow that we support this, and recommend that they enable it. We already can barely explain the features that we have, so it seems unlikely that we'll be able to communicate this effectively. I think it's fine to implement the backend, but we need additional documentation and hopefully suggestions in the UI eventually.

Comment on lines +335 to +337
if not ext:
to = name + ".html"
return self.get_full_path(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still feels weird to me. It will redirect /foo to /foo.html, even if /foo.html doesn't exist. I thought the other implementations here only redirected if this file existed? It seems OK, but very unexpected as a user and maintainer :/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's how redirect works, right? If the user defines any redirect we don't check the to URL exists before redirecting. However, we do check if the from URL exists before redirecting, unless force=True is defined.

https://github.com/readthedocs/readthedocs.org/blob/main/readthedocs/redirects/models.py#L310-L335

HTML_TO_CLEAN_URL_REDIRECT = "html_to_clean_url"

TYPE_CHOICES = (
(PAGE_REDIRECT, _("Page Redirect")),
(EXACT_REDIRECT, _("Exact Redirect")),
(CLEAN_URL_TO_HTML_REDIRECT, _("Clean URL to HTML (file/ to file.html)")),
(
CLEAN_URL_WITHOUT_TRAILING_SLASH_TO_HTML_REDIRECT,
_("Clean URL, without trailing slash, to HTML (file to file.html)"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that users who want this, want the other behavior (Clean URLs in this case is `/foo) -- so this feels like an even weirder default thing to support. It seems like we should just tell them how to configure their tools to output the right URLs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Clean URLs" is a weird definition and I don't think there is an standard for it 😄

In the Sphinx world it means using /page/ instead of page.html. For Vercel, it means the URL won't have the extension of the filename, and it could be with or without the trailing slash. They differentiate these two things explicitly.

Copy link
Member

@ericholscher ericholscher Jul 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea but it always means without the .html extension :) hard to explain without a bunch of text tho.

@humitos
Copy link
Member Author

humitos commented Jul 22, 2025

I think it's fine to implement the backend, but we need additional documentation and hopefully suggestions in the UI eventually.

100% agreed with this. We need more documentation around this particular redirect, but also around what are the framework that require this setting and also suggested settings for frameworks that support working as Read the Docs expects by default.

@stsewd
Copy link
Member

stsewd commented Jul 23, 2025

My thoughts: We are asking users to create a redirect to support those type of projects, that's probably the same level of documentation than telling them to just configure their tool so it outputs links correctly. But if users still want to have URLs without the HTML extension, what they probably want is for us to serve those URLs as is, not redirect. GH pages also serves pages without an extension out of the box.

@humitos
Copy link
Member Author

humitos commented Jul 23, 2025

@stsewd

We are asking users to create a redirect to support those type of projects, that's probably the same level of documentation than telling them to just configure their tool so it outputs links correctly

Unfortunately, this is not possible in all the frameworks. That config may not exist there --this was the case of mystmd.

But if users still want to have URLs without the HTML extension, what they probably want is for us to serve those URLs as is, not redirect.

Yeah, ideally, we should support this at proxito server, but that would require a lot more work there and may introduce backward incompatibility changes. How do you keep the current behavior /foo -> /foo/ 1 but also accept the URL /foo to serve the file /foo.html?

This PR is a middle ground solution for now to allow these framework to work on Read the Docs, at least until we can jump pretty deep into proxito server and implement the ideal solution.

GH pages also serves pages without an extension out of the box.

Yes, Netlify and Vercel as well.

Footnotes

  1. Example of trailing slash redirect: curl -sIL https://docs.readthedocs.com/platform/stable/config-file

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

redirect: Clean file URLs as well
3 participants