
fix(body): skip empty chunks during gzip decoding #480

Merged

Conversation

acme
Collaborator

@acme acme commented May 27, 2025

  • Process more chunks if we receive just the gzip header

Addresses #479

@cceckman-at-fastly
Contributor

The fix looks reasonable, though let me make sure I'm following the problem correctly:

The goal is to never get a chunk with zero bytes. (Is that right? It seems like there are still cases where a zero-byte chunk may be produced, e.g. one could be passed through from downstream.)

The problem in this case is that, if the gzip header and body land in different chunks during Viceroy's processing, consuming the header updates the decoder state but leaves no bytes "ready", so an empty decompressed chunk is emitted. The next chunk would then contain the decompressed data.
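The scenario above can be sketched with a toy stand-in for the gzip decoder (`ToyDecoder` and `next_nonempty_chunk` are illustrative names, not Viceroy's actual API): when an input chunk only advances decoder state, e.g. it contains just the gzip header, the decoder yields zero output bytes, and the loop pulls the next input chunk instead of emitting an empty one.

```rust
struct ToyDecoder {
    header_seen: bool,
}

impl ToyDecoder {
    /// Consume one input chunk, returning decoded bytes (possibly empty).
    fn decode(&mut self, chunk: &[u8]) -> Vec<u8> {
        if !self.header_seen {
            self.header_seen = true; // this chunk held only the header
            Vec::new()
        } else {
            chunk.to_vec() // toy "decoding": pass body bytes through
        }
    }
}

/// Yield the next non-empty decoded chunk, or None at end of input.
fn next_nonempty_chunk<I>(input: &mut I, dec: &mut ToyDecoder) -> Option<Vec<u8>>
where
    I: Iterator<Item = Vec<u8>>,
{
    for chunk in input {
        let out = dec.decode(&chunk);
        if !out.is_empty() {
            return Some(out);
        }
        // Zero decoded bytes: keep reading input rather than yielding
        // an empty chunk downstream.
    }
    None
}

fn main() {
    // gzip magic bytes in one chunk, body in the next
    let chunks = vec![vec![0x1f, 0x8b], b"hello".to_vec()];
    let mut it = chunks.into_iter();
    let mut dec = ToyDecoder { header_seen: false };
    assert_eq!(next_nonempty_chunk(&mut it, &mut dec), Some(b"hello".to_vec()));
    assert_eq!(next_nonempty_chunk(&mut it, &mut dec), None);
    println!("ok");
}
```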

It would be good to have some tests to capture the behavior we're worried about & any regressions - I've started sketching some proptest tests for these cases, will ping you to go over them.

@cceckman-at-fastly cceckman-at-fastly self-requested a review May 27, 2025 19:24
@acme
Collaborator Author

acme commented May 28, 2025

Thanks for investigating. I'm not familiar with the Viceroy codebase, but this change certainly fixes #479 and allowed me to provide a working solution for a customer.

@cceckman-at-fastly
Contributor

I added some additional investigation on #479; based on my testing, I think Viceroy is still vulnerable to a related issue in some cases (if the origin feeds in a zero-length chunk).

That said, yes, this makes the implementation more resilient & fixes an issue for the customer, so let's get it in; I'll follow up on the broader issue.

@cceckman-at-fastly cceckman-at-fastly merged commit c8c0ef2 into main May 28, 2025
15 checks passed
@cceckman-at-fastly cceckman-at-fastly deleted the acme/skip-empty-chunks-during-gzip-decoding branch May 28, 2025 17:09
cceckman-at-fastly added a commit that referenced this pull request May 28, 2025
This demonstrates the underlying issue in #479. The origin writes
maliciously-sized chunks - zero-length for the uncompressed body, two
bytes at a time for the compressed body - and the test fixture asserts
that the expected data are read.

This passes for the uncompressed body (yay!); without #480, this fails
for the compressed body (i.e. it's a reproducer for #479).
cceckman-at-fastly added a commit that referenced this pull request May 28, 2025
A more general set of tests for #479. Assume that the origin can send
chunks of arbitrary length - in the case of #479, a gzip header in a
separate chunk from the content.

These show up as separate chunks in a `hyper::Body`. Ensure that, when
splitting a body at arbitrary chunk lengths, the output still reproduces
the body. This causes failures without #480 (or similar changes).

This validates the fix in #480 (the gzip test fails without #480).
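The test idea can be sketched in plain Rust (proptest would randomize the split point; here it is enumerated, and `reassemble_via_chunks` is an illustrative stand-in for reading the body back through the chunked path, not the real test fixture): for every split point, including splits that produce a zero-length chunk, reading the chunks back must reproduce the original body.

```rust
/// Stand-in for the code under test: read a sequence of chunks back into
/// one buffer, treating only exhaustion of the input (never an empty
/// chunk) as end-of-stream.
fn reassemble_via_chunks(chunks: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for c in chunks {
        // An empty chunk contributes nothing but must not end the read.
        out.extend_from_slice(c);
    }
    out
}

fn main() {
    let body = b"example body bytes";
    for split in 0..=body.len() {
        let (a, b) = body.split_at(split);
        // split == 0 and split == body.len() exercise zero-length chunks
        assert_eq!(reassemble_via_chunks(&[a, b]), body.to_vec());
    }
    println!("ok");
}
```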
cceckman-at-fastly added a commit that referenced this pull request May 29, 2025
Tests and a slightly broader preventative measure for #479.

The Rust [`Read`](https://doc.rust-lang.org/std/io/trait.Read.html) trait uses a return value of `Ok(0)` to indicate end-of-stream (which may be permanent, e.g. end-of-file, or temporary).
We use the same convention for the `fastly_http_body::read` hostcall.

The `Body` [trait](https://docs.rs/http-body/latest/http_body/trait.Body.html) is a little more careful, with a separate `is_end_stream` method distinct from the "length of frame" property.

However, this means if our `Body` implementations _ever_ yield a zero-length chunk, the `fastly_http_body::read` hostcall will look like an end-of-stream to the guest. That's at the root of #479.

#480 addresses the observed cause: that the gzip decoder in particular can yield zero-length chunks when the header and body are split across chunks. However, it's in principle possible for another (current or future) variant of `Body` to reintroduce the issue.

-   Test for zero-length bodies, as an integration test and as a property test. If #480 is patched out, both groups of tests reproduce the issue with the gzip body.
-   Tackle the issue where it occurs, at the conversion from the `http::Body` stream-style API to the `std::io::Read`-style API. Add a `Body::read` method that acts more like the `AsyncRead` traits; use it from the hostcalls, instead of putting `Body` internals in those hostcalls.
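The conversion-point fix can be sketched as follows (`ChunkBody` is an illustrative stand-in for Viceroy's `Body`, not its real type): at the stream-to-`Read` boundary, loop past zero-length chunks so that `Ok(0)` is only ever returned at true end-of-stream, never because a frame happened to be empty.

```rust
use std::io::{self, Read};

struct ChunkBody {
    chunks: std::vec::IntoIter<Vec<u8>>,
    pending: Vec<u8>, // bytes from the current chunk not yet handed out
}

impl Read for ChunkBody {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        while self.pending.is_empty() {
            match self.chunks.next() {
                // Skip empty frames instead of returning Ok(0) here,
                // which the guest would misread as end-of-stream.
                Some(c) if c.is_empty() => continue,
                Some(c) => self.pending = c,
                None => return Ok(0), // true end-of-stream
            }
        }
        let n = self.pending.len().min(buf.len());
        buf[..n].copy_from_slice(&self.pending[..n]);
        self.pending.drain(..n);
        Ok(n)
    }
}

fn main() {
    // A zero-length chunk in the middle must not look like EOF.
    let mut body = ChunkBody {
        chunks: vec![b"he".to_vec(), Vec::new(), b"llo".to_vec()].into_iter(),
        pending: Vec::new(),
    };
    let mut out = Vec::new();
    body.read_to_end(&mut out).unwrap();
    assert_eq!(out, b"hello");
    println!("ok");
}
```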

Fixed #479 (to my satisfaction)