-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Redirects: new type to support URLs without trailing slash #12323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This PR adds a new redirect type: `CLEAN_URL_WITHOUT_TRAILING_SLASH_TO_HTML`. - `/en/latest/tabs` -> `/en/latest/tabs.html` - `/en/stable/guides/myst` -> `/en/stable/guides/myst.html` This should be useful for all those static frameworks that generates URLs without trailing slashes. The redirect has to be created with `force=True` to support these cases. Closes #11220
The custom redirect wasn't working originally because it was not marked as `force=True`. We can revert this change now.
I really don't love adding new types of redirects -- I feel like the whole concept is a bit of a hack, and we should really be trying to get to a place where people can define these patterns easily with regex or similar. This just adds a lot of complexity to every user who ever creates a redirect, where they have to see this type, which they likely don't want. I don't know if there's a better solution in the short term, but I really don't love adding more complexity here if we can avoid it 🙃 |
Interesting. My position is the opposite here: adding a new redirect makes the feature to be completely isolated and only affect users that create this redirect. It doesn't change any behavior for anybody else. This feature is comparable with the other two redirects we have for Sphinx (migrating from/to dirhtml builder: I'm not opposed on discussing about other more generic ways to refactor/re-structure redirects, but that sounds a lot more work and a medium term goal. Jumping into that now would be a distraction from our current goals and won't add too much value immediately. |
This just feels like a really weird concept. I don't know of any HTTP concept where It seems like what we really want is a way to always ensure we're ending URLs with a |
Yes, I agree, and I don't like it either -- but that's how a lot of js SSG framework work and some hosting too (eg. Netlify), unfortunately. We need to give those users a solution without breaking all our current behavior.
If we add a trailing |
Yes, I'm not suggesting we do it by default, but that should be the option we implement, not redirecting Overall I'd be much happier having |
This is the case this PR implements as a user redirect they can enable by project. This is what Netlify, GitHub Pages and Vercel do, so I think it makes sense we also support it. I checked with AI, and it seems this is not specified in the RFC and it's an implementation detail in the server. These hosting providers try to hide the
This is not supported and it should not work like that. |
It's worth to note that Jupyter Book has implemented an option to enable creating However, Jupyter Book without this option won't work on Read the Docs. They same happens today with Docusaurus when I think implementing this feature is important to support more documentation tools as user expect to be supported and avoid them migrating to Netlify/GitHub Pages/Vercel due to this -- thinking that Read the Docs doesn't work for their documentation tool. |
(Vercel documentation that explains this: https://vercel.com/docs/project-configuration#cleanurls) |
The problem is that we have to explain to these users somehow that we support this, and recommend that they enable it. We already can barely explain the features that we have, so it seems unlikely that we'll be able to communicate this effectively. I think it's fine to implement the backend, but we need additional documentation and hopefully suggestions in the UI eventually. |
if not ext: | ||
to = name + ".html" | ||
return self.get_full_path( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still feels weird to me. It will redirect /foo
to /foo.html
, even if /foo.html
doesn't exist. I thought the other implementations here only redirected if this file existed? It seems OK, but very unexpected as a user and maintainer :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's how redirect works, right? If the user defines any redirect we don't check the to URL
exists before redirecting. However, we do check if the from URL
exists before redirecting, unless force=True
is defined.
https://github.com/readthedocs/readthedocs.org/blob/main/readthedocs/redirects/models.py#L310-L335
HTML_TO_CLEAN_URL_REDIRECT = "html_to_clean_url" | ||
|
||
TYPE_CHOICES = ( | ||
(PAGE_REDIRECT, _("Page Redirect")), | ||
(EXACT_REDIRECT, _("Exact Redirect")), | ||
(CLEAN_URL_TO_HTML_REDIRECT, _("Clean URL to HTML (file/ to file.html)")), | ||
( | ||
CLEAN_URL_WITHOUT_TRAILING_SLASH_TO_HTML_REDIRECT, | ||
_("Clean URL, without trailing slash, to HTML (file to file.html)"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that users who want this, want the other behavior (Clean URLs in this case is `/foo) -- so this feels like an even weirder default thing to support. It seems like we should just tell them how to configure their tools to output the right URLs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Clean URLs" is a weird definition and I don't think there is an standard for it 😄
In the Sphinx world it means using /page/
instead of page.html
. For Vercel, it means the URL won't have the extension of the filename, and it could be with or without the trailing slash. They differentiate these two things explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea but it always means without the .html extension :) hard to explain without a bunch of text tho.
100% agreed with this. We need more documentation around this particular redirect, but also around what are the framework that require this setting and also suggested settings for frameworks that support working as Read the Docs expects by default. |
My thoughts: We are asking users to create a redirect to support those type of projects, that's probably the same level of documentation than telling them to just configure their tool so it outputs links correctly. But if users still want to have URLs without the HTML extension, what they probably want is for us to serve those URLs as is, not redirect. GH pages also serves pages without an extension out of the box. |
Unfortunately, this is not possible in all the frameworks. That config may not exist there --this was the case of mystmd.
Yeah, ideally, we should support this at proxito server, but that would require a lot more work there and may introduce backward incompatibility changes. How do you keep the current behavior This PR is a middle ground solution for now to allow these framework to work on Read the Docs, at least until we can jump pretty deep into proxito server and implement the ideal solution.
Yes, Netlify and Vercel as well. Footnotes
|
This PR adds a new redirect type:
CLEAN_URL_WITHOUT_TRAILING_SLASH_TO_HTML
./en/latest/tabs
->/en/latest/tabs.html
/en/stable/guides/myst
->/en/stable/guides/myst.html
This should be useful for all those static frameworks that generates URLs
without trailing slashes. The redirect has to be created with
force=True
tosupport these cases.
Closes #11220