image codecs #15

Open
d-v-b opened this issue May 16, 2025 · 5 comments

@d-v-b
Contributor

d-v-b commented May 16, 2025

imagecodecs defines a trove of useful image compression / decompression routines that are accessible via the numcodecs API.

I think we should consider some or all of these codecs high priority for additions to this repo.

As many image codecs impose strict constraints on the dimensionality of their inputs, we will likely need something like the transpose codec to efficiently transform N-dimensional arrays into a shape consistent with a given codec. I think the forthcoming reshape codec will be of great use here. But there might be other codecs / array transformations that are useful for using general purpose image codecs in a zarr context.
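To illustrate the kind of transformation meant here, the following is a minimal sketch, assuming NumPy and a hypothetical image codec that only accepts 2-D (y, x) inputs. The helper names `to_image_stack` and `from_image_stack` are made up for this example and are not part of zarr, numcodecs, or imagecodecs. The idea is to move the two image axes to the end and flatten the remaining axes into a stack of 2-D frames.

```python
import numpy as np


def to_image_stack(chunk, y_axis, x_axis):
    """Move the image axes last and flatten all other axes into one leading axis.

    Hypothetical helper: returns the (n_frames, y, x) stack and the
    intermediate shape needed to undo the flattening.
    """
    moved = np.moveaxis(chunk, (y_axis, x_axis), (-2, -1))
    stack = moved.reshape((-1,) + moved.shape[-2:])
    return stack, moved.shape


def from_image_stack(stack, moved_shape, y_axis, x_axis):
    """Invert to_image_stack, restoring the original axis order."""
    moved = stack.reshape(moved_shape)
    return np.moveaxis(moved, (-2, -1), (y_axis, x_axis))


# Example chunk with axes (t, y, c, x); each (y, x) frame could then be
# handed to a 2-D-only image codec one at a time.
chunk = np.arange(2 * 4 * 3 * 5).reshape(2, 4, 3, 5)
stack, moved_shape = to_image_stack(chunk, y_axis=1, x_axis=3)
restored = from_image_stack(stack, moved_shape, y_axis=1, x_axis=3)
```

In a zarr codec pipeline the same effect would presumably come from chaining transpose and reshape steps ahead of the image codec, rather than from ad hoc helpers like these.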

My goal with this issue is to define some priorities (e.g., whatever we need for reading the most widespread archival TIFF data should probably be prioritized), and to ensure that we are all working in a coordinated fashion.

cc'ing people who I know are interested in this development so we can coordinate our efforts. This list is not exhaustive; please feel free to bring additional people into the conversation.

@maxrjones, @mkitti, @jbms

@maxrjones
Member

> As many image codecs impose strict constraints on the dimensionality of their inputs, we will likely need something like the transpose codec to efficiently transform N-dimensional arrays into a shape consistent with a given codec. I think the forthcoming #10 will be of great use here. But there might be other codecs / array transformations that are useful for using general purpose image codecs in a zarr context.

There are a couple of imagecodecs ArrayArrayCodecs where this would matter, but we could start with the BytesBytesCodecs, which are independent of the array shape.

> My goal with this issue is to define some priorities (e.g., whatever we need for reading the most widespread archival TIFF data should probably be prioritized), and to ensure that we are all working in a coordinated fashion.

LZW has been the default compression for GDAL's COG driver since 3.4 and is a simple case, so it'd be a reasonable starting point.
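As a rough illustration of why a bytes-to-bytes codec is a simple starting point, here is a minimal sketch using only the Python standard library. It is an assumption-laden stand-in: `BytesToBytesSketch` is a hypothetical class, and it uses zlib (DEFLATE) in place of LZW purely so the example runs without extra dependencies. The point is that the codec only ever sees a flat byte string, so the array shape never enters the picture.

```python
import zlib


class BytesToBytesSketch:
    """Hypothetical bytes-to-bytes codec; zlib is a stand-in here, not LZW."""

    def encode(self, buf):
        # Input is an opaque byte string; no shape or dtype is required.
        return zlib.compress(bytes(buf))

    def decode(self, buf):
        return zlib.decompress(bytes(buf))


codec = BytesToBytesSketch()

# The serialized chunk: these 24 bytes could have come from a 2x12, 4x6,
# or 2x3x4 array, and the codec would behave identically.
raw = bytes(range(24))
encoded = codec.encode(raw)
decoded = codec.decode(encoded)
```

A real LZW codec for TIFF data would need to match the TIFF flavor of LZW, which this sketch does not attempt.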

I think it's wise to keep this discussion focused on some high-priority and simple cases. I have some questions about edge-cases but don't want to distract from progress here. Where's the best place to raise those questions?

@d-v-b
Contributor Author

d-v-b commented May 16, 2025

> I have some questions about edge-cases but don't want to distract from progress here. Where's the best place to raise those questions?

The Zarr Zulip or a GitHub discussion here could both work. I guess it depends on the scope you assign to these edge cases :)

@jbms
Contributor

jbms commented May 16, 2025

@cgohlke

@jbms
Contributor

jbms commented May 16, 2025

@laramiel

@jbms
Contributor

jbms commented May 16, 2025

The easiest thing is probably to just create a draft extension for specific codecs.

True image formats like jpeg pose the most challenges simply because they support a lot of parameters (see e.g. https://github.com/cgohlke/imagecodecs/blob/8776afc59c94b2531aaecb5dd969308388c9e2d0/imagecodecs/numcodecs.py#L954). We can use the existing imagecodecs definitions for guidance but may want to revise them and try to keep things standard across image format codecs.
