-
-
Notifications
You must be signed in to change notification settings - Fork 134
enhance: improve media type recognition with HEAD or magic bytes #2599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
const IMGPROXY_URL = process.env.IMGPROXY_URL_DOCKER || process.env.NEXT_PUBLIC_IMGPROXY_URL | ||
const IMGPROXY_SALT = process.env.IMGPROXY_SALT | ||
const IMGPROXY_KEY = process.env.IMGPROXY_KEY | ||
const MEDIA_CHECK_URL = process.env.MEDIA_CHECK_URL_DOCKER || process.env.NEXT_PUBLIC_MEDIA_CHECK_URL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Media Type URL Fetch Fails Without Env Vars
The new media type checking mechanism relies on environment variables for its URL. If these are unset, fetch
calls are made to invalid undefined/...
URLs, causing TypeError
or fetch failures. This prevents media from being correctly identified, regressing from the previous self-contained implementation.
Additional Locations (2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well... yeah, everything would show as links.
I was wondering if a fallback to the traditional system might be acceptable considering the dislocation of the endpoint to another service.
Description
fixes #2341
fixes #1433
can address #850 for images
We check if a link is a media file by downloading it fully, and then we download it again when we want to render it.
This PR improves media type recognition on links by fetching
HEAD
or, as a fallback, the first magic bytes via magic-bytes.js.Client and imgproxy can use an endpoint placed in the capture micro service (avoids CORS) to know if they're dealing with an image or a video.
Also checks if a link has HTTP Basic Auth.
Screenshots
Loading an
image/avif
file from the browser to check and render media vs checking the magic bytesProof of concept
Screen.Recording.2025-10-11.at.17.19.03.mp4
Additional Context
The endpoint lives in the capture micro service, because of this, the compose profile must have "capture".
Maybe we can add an extra fallback for when the capture instances go offline, if ever
We can get rid of the
HEAD
fetch if there's the possibility of false informations, it's just a cheap way to getContent-Type
The first magic bytes of an image can also contain informations about dimensions, and we could use it to avoid render jumps before imgproxy takes over. It's not implemented in this PR
This job could also be done in-house but we would deal with lots of magic numbers this way, so a popular and well-maintained library seemed a better idea.
This doesn't get rid of the heuristics involved in the imgproxy worker, it's still something that we know for sure that it works. But it's definitely redundant now.
Checklist
Are your changes backward compatible? Please answer below:
For example, a change is not backward compatible if you removed a GraphQL field or dropped a database column.
Yes
On a scale of 1-10 how well and how have you QA'd this change and any features it might affect? Please answer below:
7, pretty good actually
For frontend changes: Tested on mobile, light and dark mode? Please answer below:
n/a
Did you introduce any new environment variables? If so, call them out explicitly here:
The following env vars have been introduced
The last one has been introduced in
.env.production
too but I don't think that file is even usedDid you use AI for this? If so, how much did it assist you?
The readFirstBytes function is partially vibed, there were some things not really clear to me in that moment about the part of reading the small chunk with Reader, so it came in help.
Note
Adds a capture service endpoint to detect image/video via HEAD or magic-bytes and wires it into the frontend and imgproxy worker with new env vars.
media-check
endpoint (capture/media-check.js
) usingHEAD
and magic bytes (magic-bytes.js
) with timeout/byte limits and basic-auth handling.capture/index.js
at/${MEDIA_CHECK_ROUTE}/:url
.magic-bytes.js
.worker/imgproxy.js
):MEDIA_CHECK_URL
endpoint; cache result.components/media-or-link.js
):PUBLIC_MEDIA_CHECK_URL
to setisImage
/isVideo
.MEDIA_CHECK_ROUTE
,MEDIA_CHECK_URL_DOCKER
,NEXT_PUBLIC_MEDIA_CHECK_URL
(added to.env.development
and.env.production
).process.env.NEXT_PUBLIC_MEDIA_CHECK_URL
vianext.config.js
DefinePlugin; exportPUBLIC_MEDIA_CHECK_URL
inlib/constants.js
.README.md
example to includecapture
inCOMPOSE_PROFILES
.Written by Cursor Bugbot for commit 8c3e6d1. This will update automatically on new commits. Configure here.