@Soxasora Soxasora commented Oct 11, 2025

Description

fixes #2341
fixes #1433
can address #850 for images

Currently, we check whether a link is a media file by downloading it in full, and then we download it again when we want to render it.
This PR improves media type recognition for links by sending a HEAD request or, as a fallback, reading the first magic bytes with magic-bytes.js.
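The two-stage check described above could look roughly like the sketch below. It is only an illustration, not the code in this PR: `classifyMedia` and `sniffMime` are hypothetical names, and the few hand-written signatures stand in for what magic-bytes.js would provide. Treating every non-AVIF `ftyp` brand as video is a deliberate simplification.

```javascript
// Hypothetical sketch: trust Content-Type when it names an image or a video,
// otherwise fall back to sniffing the first bytes of the body.

function ascii (bytes, start, end) {
  return String.fromCharCode(...bytes.slice(start, end))
}

// Tiny hand-rolled stand-in for a magic-bytes lookup (assumption, not the library's API).
function sniffMime (bytes) {
  // PNG: 89 "PNG"
  if (bytes[0] === 0x89 && ascii(bytes, 1, 4) === 'PNG') return 'image/png'
  // JPEG: FF D8 FF
  if (bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff) return 'image/jpeg'
  // GIF: "GIF8"
  if (ascii(bytes, 0, 4) === 'GIF8') return 'image/gif'
  // ISO BMFF containers: "ftyp" at offset 4, major brand at offset 8
  if (ascii(bytes, 4, 8) === 'ftyp') {
    const brand = ascii(bytes, 8, 12)
    if (brand === 'avif' || brand === 'avis') return 'image/avif'
    return 'video/mp4' // simplification: other brands are treated as video
  }
  return null
}

function classifyMedia ({ contentType, firstBytes }) {
  // Strip parameters such as "; charset=..." and normalize case.
  const headerMime = (contentType || '').split(';')[0].trim().toLowerCase()
  const trusted = headerMime.startsWith('image/') || headerMime.startsWith('video/')
  const mime = trusted ? headerMime : sniffMime(firstBytes || [])
  return {
    mime,
    isImage: Boolean(mime && mime.startsWith('image/')),
    isVideo: Boolean(mime && mime.startsWith('video/'))
  }
}
```

Checking the `ftyp` brand is what lets an AVIF file be told apart from an MP4 video even when the server reports a generic Content-Type.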

The client and imgproxy can use an endpoint in the capture microservice (which avoids CORS issues) to find out whether they're dealing with an image or a video.

It also checks whether a link requires HTTP Basic Auth.
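One way such a Basic Auth check could work is to inspect the status and the `WWW-Authenticate` header of the HEAD response. The sketch below assumes exactly that; `requiresBasicAuth` is a hypothetical helper, not necessarily how the PR implements it.

```javascript
// Hypothetical sketch: decide from a response's status and WWW-Authenticate
// header whether the URL would trigger an HTTP Basic Auth prompt.
function requiresBasicAuth (status, wwwAuthenticate) {
  if (status !== 401 || !wwwAuthenticate) return false
  // A server may advertise several challenges separated by commas;
  // look for the Basic scheme at the start of any of them.
  return /(^|,)\s*basic(\s|$)/i.test(wwwAuthenticate)
}
```

A link flagged this way can be rendered as a plain link, so the browser never attempts to load it as media.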

Screenshots

Loading an image/avif file from the browser to check and render media vs checking the magic bytes
Proof of concept

Screen.Recording.2025-10-11.at.17.19.03.mp4

Additional Context

The endpoint lives in the capture microservice, so COMPOSE_PROFILES must include "capture".
Maybe we can add an extra fallback in case the capture instances ever go offline.


We can get rid of the HEAD fetch if there's a chance it returns false information; it's just a cheap way to get the Content-Type.


Can address #850 for images

The first magic bytes of an image can also contain information about its dimensions, which we could use to avoid render jumps before imgproxy takes over. This is not implemented in this PR.
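As an illustration of the idea only (it is not part of this PR), PNG is a format where the dimensions sit in the first 24 bytes: the 8-byte signature is followed by the IHDR chunk, which stores width and height as big-endian 32-bit integers. `readPngDimensions` is a hypothetical helper, and other formats would need their own parsing.

```javascript
// Hypothetical sketch: read width/height from the first bytes of a PNG.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]

function readPngDimensions (bytes) {
  // 8 signature bytes + 4 length bytes + 4 type bytes + 8 bytes of width/height
  if (bytes.byteLength < 24) return null
  if (!PNG_SIGNATURE.every((b, i) => bytes[i] === b)) return null

  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  // The first chunk must be IHDR ("IHDR" = 0x49484452); DataView reads big-endian by default.
  if (view.getUint32(12) !== 0x49484452) return null

  return { width: view.getUint32(16), height: view.getUint32(20) }
}
```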


This job could also be done in-house, but we would have to deal with lots of magic numbers that way, so a popular and well-maintained library seemed like a better idea.


This doesn't get rid of the heuristics in the imgproxy worker; they're still something we know for sure works. But they're definitely redundant now.

Checklist

Are your changes backward compatible? Please answer below:

For example, a change is not backward compatible if you removed a GraphQL field or dropped a database column.
Yes

On a scale of 1-10 how well and how have you QA'd this change and any features it might affect? Please answer below:
7, pretty good actually

For frontend changes: Tested on mobile, light and dark mode? Please answer below:
n/a

Did you introduce any new environment variables? If so, call them out explicitly here:
The following env vars have been introduced

MEDIA_CHECK_ROUTE=media
-- route for the capture micro service
MEDIA_CHECK_URL_DOCKER=http://capture:5678/media
-- url for imgproxy, communication between containers
NEXT_PUBLIC_MEDIA_CHECK_URL=http://localhost:5678/media
-- url for client-side fetches, e.g. media-or-link.js

The last one has also been added to .env.production, but I don't think that file is even used.

Did you use AI for this? If so, how much did it assist you?
The readFirstBytes function is partially vibed: some things about reading the small chunk with a Reader weren't really clear to me at the time, so the AI came in handy.
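For readers unfamiliar with the Reader pattern mentioned here, the following is a minimal sketch of reading only the first few bytes of a response body with the standard `ReadableStream` reader. It is not the PR's `readFirstBytes`; the signature and the `maxBytes` parameter are assumptions made for illustration.

```javascript
// Hypothetical sketch: read at most maxBytes from a ReadableStream of Uint8Array
// chunks, then cancel the stream so the rest of the body is never downloaded.
async function readFirstBytes (body, maxBytes) {
  const reader = body.getReader()
  const chunks = []
  let total = 0
  try {
    while (total < maxBytes) {
      const { done, value } = await reader.read()
      if (done) break
      chunks.push(value)
      total += value.length
    }
  } finally {
    // Cancelling an already-finished stream is harmless; ignore any error.
    await reader.cancel().catch(() => {})
  }

  // Copy the chunks into one buffer, truncating to maxBytes.
  const out = new Uint8Array(Math.min(total, maxBytes))
  let offset = 0
  for (const chunk of chunks) {
    const room = out.length - offset
    if (room <= 0) break
    out.set(chunk.subarray(0, room), offset)
    offset += Math.min(chunk.length, room)
  }
  return out
}
```

With `fetch`, this would be called as `readFirstBytes(response.body, n)`; chunk sizes are chosen by the runtime, which is why the result is truncated after the loop.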


Note

Adds a capture service endpoint to detect image/video via HEAD or magic-bytes and wires it into the frontend and imgproxy worker with new env vars.

  • Capture service:
    • Add media-check endpoint (capture/media-check.js) using HEAD and magic bytes (magic-bytes.js) with timeout/byte limits and basic-auth handling.
    • Wire route in capture/index.js at /${MEDIA_CHECK_ROUTE}/:url.
    • Add dependency magic-bytes.js.
  • Worker (worker/imgproxy.js):
    • Replace ad-hoc HEAD/GET detection with call to MEDIA_CHECK_URL endpoint; cache result.
  • Frontend (components/media-or-link.js):
    • Replace video/img probe hack with fetch to PUBLIC_MEDIA_CHECK_URL to set isImage/isVideo.
  • Config/Env:
    • New envs: MEDIA_CHECK_ROUTE, MEDIA_CHECK_URL_DOCKER, NEXT_PUBLIC_MEDIA_CHECK_URL (added to .env.development and .env.production).
    • Expose process.env.NEXT_PUBLIC_MEDIA_CHECK_URL via next.config.js DefinePlugin; export PUBLIC_MEDIA_CHECK_URL in lib/constants.js.
  • Docs:
    • Update README.md example to include capture in COMPOSE_PROFILES.

Written by Cursor Bugbot for commit 8c3e6d1. This will update automatically on new commits.


socket-security bot commented Oct 11, 2025

Review the following changes in direct dependencies.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | magic-bytes.js@1.12.1 | 100 | 100 | 100 | 80 | 100 |

@Soxasora Soxasora marked this pull request as ready for review October 11, 2025 18:15
cursor[bot]

This comment was marked as outdated.

@Soxasora Soxasora marked this pull request as draft October 12, 2025 09:25
@Soxasora Soxasora marked this pull request as ready for review October 12, 2025 16:24
cursor[bot]

This comment was marked as outdated.

const IMGPROXY_URL = process.env.IMGPROXY_URL_DOCKER || process.env.NEXT_PUBLIC_IMGPROXY_URL
const IMGPROXY_SALT = process.env.IMGPROXY_SALT
const IMGPROXY_KEY = process.env.IMGPROXY_KEY
const MEDIA_CHECK_URL = process.env.MEDIA_CHECK_URL_DOCKER || process.env.NEXT_PUBLIC_MEDIA_CHECK_URL

Bug: Media Type URL Fetch Fails Without Env Vars

The new media type checking mechanism relies on environment variables for its URL. If these are unset, fetch calls are made to invalid undefined/... URLs, causing TypeError or fetch failures. This prevents media from being correctly identified, regressing from the previous self-contained implementation.
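A guard along the following lines would avoid building `undefined/...` URLs when the variables are unset. This is a sketch under assumptions: `buildMediaCheckUrl` is a hypothetical helper, and passing the target as a URL-encoded path segment is inferred from the `/${MEDIA_CHECK_ROUTE}/:url` route rather than confirmed by the diff.

```javascript
// Hypothetical sketch: build the media-check URL, or return null when the
// base URL env var is missing so the caller can skip the fetch entirely.
function buildMediaCheckUrl (baseUrl, targetUrl) {
  if (!baseUrl) return null
  // Drop trailing slashes on the base and encode the target as one path segment.
  return `${baseUrl.replace(/\/+$/, '')}/${encodeURIComponent(targetUrl)}`
}
```

A caller that receives `null` can treat the link as a plain link, or use another detection path, instead of issuing a request that is bound to fail.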


Member Author

Well... yeah, everything would show as links.
I was wondering if falling back to the traditional system might be acceptable, considering that the endpoint has been moved to another service.
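The fallback being considered could be structured roughly like this. It is a sketch, not a proposal for the exact code: `detectMediaWithFallback`, `checkViaEndpoint`, and `checkLocally` are hypothetical names for the new endpoint call and the previous self-contained detection.

```javascript
// Hypothetical sketch: prefer the capture-service endpoint, and fall back to
// the previous in-process detection when the endpoint is unset or unreachable.
async function detectMediaWithFallback (url, { checkViaEndpoint, checkLocally }) {
  try {
    return await checkViaEndpoint(url)
  } catch (err) {
    // The endpoint failed (missing env var, network error, service offline):
    // use the traditional detection instead of showing everything as links.
    return checkLocally(url)
  }
}
```

Injecting both strategies as functions keeps the fallback decision in one place for the worker and the client alike.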

Development

Successfully merging this pull request may close these issues.

Attempt to render links as media triggers HTTP Basic Auth
AVIF images are rendered as video