image codecs #15

d-v-b · 2025-05-16T10:02:21Z

imagecodecs defines a trove of useful image compression / decompression routines that are accessible via the numcodecs API.

I think we should consider some or all of these codecs high priority for additions to this repo.

As many image codecs impose strict constraints on the dimensionality of their inputs, we will likely need something like the transpose codec to efficiently transform N-dimensional arrays into a shape consistent with a given codec. I think the forthcoming reshape codec will be of great use here. But there might be other codecs / array transformations that are useful for using general purpose image codecs in a zarr context.
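As a rough illustration of the kind of transformation described above, here is a minimal sketch of how an N-dimensional chunk shape could be mapped onto a stack of 2D images by a transpose followed by a reshape. The helper `plan_image_layout` and its `y_axis` / `x_axis` parameters are hypothetical names made up for this example; they are not part of zarr, numcodecs, or imagecodecs.

```python
from math import prod


def plan_image_layout(shape, y_axis, x_axis):
    """Hypothetical helper: plan a transpose + reshape to a stack of 2D images.

    Returns (perm, image_shape), where `perm` moves the two image axes to the
    end and `image_shape` collapses every other axis into one leading
    "plane" axis.
    """
    ndim = len(shape)
    if ndim < 2:
        raise ValueError("need at least two dimensions to form an image")
    # Normalise negative axis indices.
    y_axis %= ndim
    x_axis %= ndim
    if y_axis == x_axis:
        raise ValueError("y_axis and x_axis must differ")
    # All non-image axes keep their relative order and go first.
    other = [ax for ax in range(ndim) if ax not in (y_axis, x_axis)]
    perm = tuple(other + [y_axis, x_axis])
    planes = prod(shape[ax] for ax in other)  # prod of an empty sequence is 1
    return perm, (planes, shape[y_axis], shape[x_axis])


perm, image_shape = plan_image_layout((3, 512, 4, 256), y_axis=1, x_axis=3)
print(perm)         # (0, 2, 1, 3)
print(image_shape)  # (12, 512, 256)
```

In this sketch the permutation is what a transpose codec would apply, and the collapsed shape is what a reshape codec would produce before handing each plane to an image codec.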

My goal with this issue is to define some priorities (e.g., whatever we need for reading the most widespread archival TIFF data should probably be prioritized), and to ensure that we are all working in a coordinated fashion.

cc'ing people who I know are interested in this development so we can coordinate our efforts. This list is not exhaustive; please feel free to bring additional people into the conversation.

@maxrjones, @mkitti, @jbms


maxrjones · 2025-05-16T13:17:18Z

> As many image codecs impose strict constraints on the dimensionality of their inputs, we will likely need something like the transpose codec to efficiently transform N-dimensional arrays into a shape consistent with a given codec. I think the forthcoming #10 will be of great use here. But there might be other codecs / array transformations that are useful for using general purpose image codecs in a zarr context.

There are a couple of imagecodecs ArrayArrayCodecs where this would matter, but we could start with the BytesBytesCodecs, which are independent of the array shape.

> My goal with this issue is to define some priorities (e.g., whatever we need for reading the most widespread archival TIFF data should probably be prioritized), and to ensure that we are all working in a coordinated fashion.

LZW has been the default compression for GDAL's COG driver since 3.4 and is a simple case, so it'd be a reasonable starting point.
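To make the "independent of the array shape" point concrete, here is a minimal sketch of a bytes-to-bytes codec. The class `StandInBytesCodec` is a hypothetical name for illustration only, and `zlib` from the standard library stands in for LZW; this is not the imagecodecs LZW implementation or the zarr codec interface.

```python
import zlib
from array import array


class StandInBytesCodec:
    """Hypothetical bytes-to-bytes codec; zlib is a stand-in for LZW."""

    def encode(self, data: bytes) -> bytes:
        # The codec only sees a flat byte string, never a shape or dtype.
        return zlib.compress(data)

    def decode(self, data: bytes) -> bytes:
        return zlib.decompress(data)


codec = StandInBytesCodec()

# 24 unsigned 16-bit samples; whether the chunk is viewed as (2, 3, 4) or
# (6, 4), the serialized bytes handed to the codec are the same.
chunk = array("H", range(24)).tobytes()

encoded = codec.encode(chunk)
assert codec.decode(encoded) == chunk  # lossless round trip
```

Because nothing here depends on dimensionality, a codec of this kind would not need the transpose or reshape steps discussed earlier in the thread.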

I think it's wise to keep this discussion focused on some high-priority and simple cases. I have some questions about edge-cases but don't want to distract from progress here. Where's the best place to raise those questions?

d-v-b · 2025-05-16T13:19:53Z

> I have some questions about edge-cases but don't want to distract from progress here. Where's the best place to raise those questions?

The Zarr Zulip or a GitHub discussion here could both work. I guess it depends on the scope you assign to these edge cases :)

jbms · 2025-05-16T16:47:18Z

@cgohlke

jbms · 2025-05-16T16:47:30Z

@laramiel

jbms · 2025-05-16T16:54:14Z

The easiest thing is probably to just create a draft extension for specific codecs.

True image formats like jpeg pose the most challenges simply because they support a lot of parameters (see e.g. https://github.com/cgohlke/imagecodecs/blob/8776afc59c94b2531aaecb5dd969308388c9e2d0/imagecodecs/numcodecs.py#L954). We can use the existing imagecodecs definitions for guidance but may want to revise them and try to keep things standard across image format codecs.
