Reshape codec #10

jbms · 2025-04-30T18:03:25Z

No description provided.

jbms · 2025-04-30T19:10:11Z

@ldeakan this is an alternative to the previously discussed squeeze that can accomplish the same thing but is more flexible. What do you think?

It should provide a lot of flexibility for using image codecs and zfp with zarr arrays containing an arbitrary number of dimensions.

normanrz · 2025-05-05T16:19:45Z

Tagging @LDeakin again, because the linking didn't seem to work.

LDeakin · 2025-05-06T03:24:24Z

It looks pretty reasonable to me, but I could give you a more definitive review if I get around to implementing it. @jbms do you intend to support this in tensorstore / neuroglancer?

@normanrz Should extensions have an implementation before they get merged in this repo?

jbms · 2025-05-06T04:03:25Z

I do intend to implement it in tensorstore and neuroglancer. I've started the tensorstore implementation.

There are some tradeoffs I made in designing this:

- Allowing dimensions to be specified indirectly (i.e. as the product of one or more input dimensions, or `-1` to mean all remaining) is critical for it to work with variable chunking. It adds a bit of complexity in resolving the shape, but it doesn't affect anything else.
- I think it will often be useful to combine this with `transpose`. If all elements of the shape had to be specified as an array of input dims (no explicit size or `-1`), then it would be natural to allow this codec to also perform transposing. But we already have `transpose` for transposing, and then it wouldn't support cases that require an explicit size for some dimension (though I can't really think of many clear use cases for that).
- For implementations like zarrs that always use a fixed C/Fortran memory layout, non-partial encoding and decoding should be trivial. On the other hand, when combined with `transpose` there may be unnecessary copying. When allowing arbitrary strided layouts, as in tensorstore (and I think zarr-python), non-partial encoding and decoding is more complicated: some cases can be handled without copying, some can't.
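
To make the first tradeoff concrete, here is a minimal sketch of how an output shape might be resolved from an indirect specification. It is not the codec's actual algorithm or metadata format: the function name `resolve_shape` and the representation of the spec (an explicit `int`, `-1`, or a list of input dimension indices per output dimension) are assumptions made for illustration only.

```python
from math import prod


def resolve_shape(input_shape, spec):
    """Resolve an output shape from a hypothetical reshape spec.

    Each entry of `spec` is one of (assumed representation):
      * a non-negative int: an explicit size,
      * -1: "all remaining" elements (at most one such entry),
      * a list of input dimension indices: the product of those sizes.
    """
    out = []
    wildcard = None
    for i, entry in enumerate(spec):
        if isinstance(entry, int):
            if entry == -1:
                if wildcard is not None:
                    raise ValueError("at most one -1 entry is allowed")
                wildcard = i
                out.append(1)  # placeholder, filled in below
            else:
                out.append(entry)
        else:
            out.append(prod(input_shape[d] for d in entry))

    total = prod(input_shape)
    known = prod(out)
    if wildcard is not None:
        if known == 0 or total % known != 0:
            raise ValueError("cannot infer the size of the -1 dimension")
        out[wildcard] = total // known
    elif known != total:
        raise ValueError("element count mismatch")
    return tuple(out)


# The same spec resolves correctly for a full chunk and a smaller edge chunk,
# which is why indirect sizes matter for variable chunking.
print(resolve_shape((4, 5, 6), [[0, 1], [2]]))   # (20, 6)
print(resolve_shape((3, 5, 6), [[0, 1], [2]]))   # (15, 6)
print(resolve_shape((4, 5, 6, 3), [[0, 1], -1]))  # (20, 18)
```

A spec made only of explicit sizes, such as `[20, 6]`, would fail for the `(3, 5, 6)` edge chunk, whereas the indirect form adapts to whatever chunk shape it is given.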

jbms · 2025-05-06T04:06:12Z

I will indeed implement it before switching to non-draft but also wanted to get feedback on it.

normanrz · 2025-05-06T17:19:37Z

> @normanrz Should extensions have an implementation before they get merged in this repo?

There is no strict requirement to have an implementation when registering the name. It is recommended, though.

d-v-b · 2025-05-16T09:53:19Z

codecs/reshape/README.md

+  `prod(B_shape[:i]) == prod(A_shape[input_dims[0]])` and
+  `prod(B_shape[i+1:]) == prod(A_shape[input_dims[k-1]+1:])`.
+
+This two constraints ensure that if the size of output dimension `i` is


Suggested change

This two constraints ensure that if the size of output dimension `i` is

These two constraints ensure that if the size of output dimension `i` is

d-v-b · 2025-05-16T09:54:23Z

codecs/reshape/README.md

+For example, the array metadata below specifies that the compressor is the Zstd
+codec configured with a compression level of 1 and with the checksum stored:


this description does not match the JSON

d-v-b · 2025-05-16T09:54:41Z

codecs/reshape/README.md

+specified above.
+
+As this codec does NOT alter the lexicographical order of elements, the contents
+of the output array `B` is related to the contents of the input array `A` by:


Suggested change

of the output array `B` is related to the contents of the input array `A` by:

of the output array `B` are related to the contents of the input array `A` by:
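
The README's own formula is not reproduced in this thread, so the following is only a sketch of what "does not alter the lexicographical order of elements" means in practice, assuming row-major (C) order. The helper names `flat_index`, `unflatten`, and `reshape_index` are hypothetical and not taken from the spec.

```python
from itertools import product


def flat_index(index, shape):
    """Row-major (lexicographic) position of `index` within `shape`."""
    flat = 0
    for i, n in zip(index, shape):
        flat = flat * n + i
    return flat


def unflatten(flat, shape):
    """Inverse of flat_index: the multi-index at position `flat`."""
    index = []
    for n in reversed(shape):
        flat, r = divmod(flat, n)
        index.append(r)
    return tuple(reversed(index))


def reshape_index(a_index, a_shape, b_shape):
    """Index in B holding the element found at `a_index` in A."""
    return unflatten(flat_index(a_index, a_shape), b_shape)


a_shape, b_shape = (4, 5, 6), (20, 6)
print(reshape_index((1, 2, 3), a_shape, b_shape))  # (7, 3)

# Walking A in lexicographic order visits B in lexicographic order too.
mapped = [reshape_index(i, a_shape, b_shape)
          for i in product(*map(range, a_shape))]
print(mapped == sorted(mapped))  # True
```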

Add reshape codec

7295bf1

jbms force-pushed the reshape-codec branch from d79dca5 to 7295bf1 Compare April 30, 2025 19:08

jbms marked this pull request as draft April 30, 2025 19:10

d-v-b reviewed May 16, 2025

d-v-b mentioned this pull request May 16, 2025

image codecs #15

Open
