Commit d79dca5

Add reshape codec
1 parent a98e5ae commit d79dca5

2 files changed, with 144 additions and 0 deletions.

codecs/reshape/README.md

Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
# reshape codec

Defines an `array -> array` codec that performs a reshaping operation.

Note that `reshape` always preserves the (logical) lexicographical order (i.e. C
order traversal) of elements within an array, but may be combined with the
`transpose` codec to both reorder and reshape an array.

## Codec name

The value of the `name` member in the codec object MUST be `reshape`.

## Configuration parameters

### `shape`
An array specifying the size `B_shape[i]` of each dimension `i` of
the *output* array `B` as a function of the shape `A_shape` of the input array
`A`. Each element `shape[i]` must be one of:

- a positive integer `size`, specifying that `B_shape[i] := size`.

- an array of integers `input_dims`, specifying that:

  ```
  B_shape[i] := prod(A_shape[input_dims])
  ```

  Specifying the corresponding `input_dims` rather than an explicit `size` is
  particularly useful when using variable-size chunking.

- the special value `-1`, which must occur at most once, specifying that
  `B_shape[i]` is determined automatically in order to satisfy the
  invariant that `prod(B_shape) == prod(A_shape)`.

Implementations MUST return an error if the invariant

```
prod(B_shape) == prod(A_shape)
```

cannot be satisfied.
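
The rules for `shape` can be sketched as a small resolution function. This is only an illustrative sketch, not a reference implementation: the name `resolve_output_shape` is hypothetical, the use of `ValueError` for errors is an assumption, and rejecting a `-1` entry whenever the other output sizes multiply to zero is a design choice made here rather than something the text above requires.

```python
from math import prod


def resolve_output_shape(shape, a_shape):
    """Resolve a `shape` configuration against the input shape `a_shape`.

    Hypothetical helper: returns `B_shape` as a list of ints, or raises
    ValueError if prod(B_shape) == prod(A_shape) cannot be satisfied.
    """
    b_shape = []
    auto_index = None  # position of the single permitted -1 entry, if any
    for i, entry in enumerate(shape):
        if isinstance(entry, list):
            # `input_dims`: product of the referenced input dimension sizes.
            b_shape.append(prod(a_shape[d] for d in entry))
        elif entry == -1:
            if auto_index is not None:
                raise ValueError("-1 may occur at most once in shape")
            auto_index = i
            b_shape.append(1)  # placeholder, replaced below
        elif isinstance(entry, int) and entry > 0:
            b_shape.append(entry)
        else:
            raise ValueError(f"invalid shape entry: {entry!r}")

    total = prod(a_shape)
    if auto_index is not None:
        known = prod(b_shape)  # the placeholder contributes a factor of 1
        if known == 0 or total % known != 0:
            raise ValueError("cannot determine the size of the -1 dimension")
        b_shape[auto_index] = total // known

    if prod(b_shape) != total:
        raise ValueError("prod(B_shape) != prod(A_shape)")
    return b_shape
```

For instance, under these assumptions `resolve_output_shape([[0, 1], [2], 3], [100, 50, 64, 3])` yields `[5000, 64, 3]`.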

Additionally, when `shape[i]` is specified as an array of integers `input_dims`,
implementations MUST return an error if the following constraints are not
satisfied:

- The flattened list of input dimensions, over all elements of `shape`, must be
  in monotonically increasing order, e.g. `"shape": [[0, 1], 10, [3, 4]]` is
  allowed but the following are NOT allowed:

  - `"shape": [[1], [0]]`
  - `"shape": [[1, 0], 10, [3, 4]]`
  - `"shape": [[3, 4], 10, [0, 1]]`

  This constraint serves to avoid confusing `shape` configurations that may
  (incorrectly) suggest a transpose, when in fact the `reshape` codec never
  performs a transpose.

- If `input_dims` specifies `k > 0` input dimensions:

  `prod(B_shape[:i]) == prod(A_shape[:input_dims[0]])` and
  `prod(B_shape[i+1:]) == prod(A_shape[input_dims[k-1]+1:])`.

  These two constraints ensure that if the size of output dimension `i` is
  specified by `input_dims`, the coordinates in the input array along `input_dims`
  actually correspond to the raveled index along output dimension `i`.
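
The two `input_dims` constraints can be checked roughly as follows. This is a sketch under stated assumptions: `validate_input_dims` is a hypothetical name, "monotonically increasing" is read as strictly increasing, and the first product condition is read as comparing the output dimensions before `i` with the input dimensions before `input_dims[0]` (a prefix product), mirroring the suffix product in the second condition.

```python
from math import prod


def validate_input_dims(shape, a_shape, b_shape):
    """Raise ValueError if an `input_dims` entry of `shape` is invalid.

    Hypothetical helper; `b_shape` is the already-resolved output shape.
    """
    # Constraint 1: the flattened list of input dimensions is increasing
    # (assumed strict, so repeated dimensions are rejected as well).
    flat = [d for entry in shape if isinstance(entry, list) for d in entry]
    if any(x >= y for x, y in zip(flat, flat[1:])):
        raise ValueError("input dimensions must be in increasing order")

    # Constraint 2: the element counts before and after output dimension i
    # match those before the first and after the last referenced input dim.
    for i, entry in enumerate(shape):
        if not isinstance(entry, list) or not entry:
            continue  # only applies when k > 0 input dimensions are given
        first, last = entry[0], entry[-1]
        if prod(b_shape[:i]) != prod(a_shape[:first]):
            raise ValueError(f"leading sizes do not match for shape[{i}]")
        if prod(b_shape[i + 1:]) != prod(a_shape[last + 1:]):
            raise ValueError(f"trailing sizes do not match for shape[{i}]")
```

With `a_shape = [100, 50, 64, 3]`, `shape = [[0, 1], [2], 3]` and `b_shape = [5000, 64, 3]`, both checks pass: for `i = 1` the leading products are both 5000 and the trailing products are both 3.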

## Example

For example, the array metadata below specifies a `reshape` codec that merges
the first two dimensions of each `[100, 50, 64, 3]` chunk into a single
dimension of size 5000 and keeps the remaining two dimensions, so that the
`bytes` codec receives an array of shape `[5000, 64, 3]`:

```json
{
  "chunk_grid": {
    "name": "regular",
    "configuration": {
      "chunk_shape": [100, 50, 64, 3]
    }
  },
  "codecs": [
    {
      "name": "reshape",
      "configuration": {
        "shape": [[0, 1], [2], 3]
      }
    },
    {
      "name": "bytes",
      "configuration": {"endian": "little"}
    }
  ]
}
```
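
As a worked illustration of this configuration, the following sketch computes the output shape and maps one input coordinate to its output coordinate by hand. The names `chunk_shape`, `b_shape` and `input_to_output` are hypothetical and exist only for this example.

```python
chunk_shape = [100, 50, 64, 3]  # A_shape, taken from the chunk grid

# "shape": [[0, 1], [2], 3] resolves to:
b_shape = [
    chunk_shape[0] * chunk_shape[1],  # [0, 1] -> 100 * 50
    chunk_shape[2],                   # [2]    -> 64
    3,                                # explicit size
]


def input_to_output(i, j, k, c):
    """Map a coordinate of the 4-d input chunk to the 3-d output array.

    Dimensions 0 and 1 are merged in C order; the others are unchanged.
    """
    return (i * chunk_shape[1] + j, k, c)


print(b_shape)                       # [5000, 64, 3]
print(input_to_output(2, 7, 10, 1))  # (107, 10, 1)
```

The merged coordinate `2 * 50 + 7 = 107` is the raveled index of `(2, 7)` within the first two input dimensions.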

## Format and algorithm

This is an `array -> array` codec.

The dimensionality of the output array `B` is equal to the length of the `shape`
configuration parameter, and the output shape `B_shape` is determined as
specified above.

As this codec does NOT alter the lexicographical order of elements, the contents
of the output array `B` are related to the contents of the input array `A` by:
`ravel(B) == ravel(A)`.

Implementations should, when possible, construct a virtual view rather than copy
the array.
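
The relation `ravel(B) == ravel(A)` means that an element keeps its C-order flat index, so coordinates can be translated without moving any data. The sketch below illustrates that index arithmetic in plain Python; the helper names `ravel_index`, `unravel_index` and `output_coord` are hypothetical, and a real implementation would more likely reinterpret the buffer through its array library's view mechanism.

```python
def ravel_index(coord, shape):
    """C-order (lexicographical) flat index of `coord` in an array of `shape`."""
    index = 0
    for c, n in zip(coord, shape):
        index = index * n + c
    return index


def unravel_index(index, shape):
    """Inverse of `ravel_index`: coordinate of flat `index` in `shape`."""
    coord = []
    for n in reversed(shape):
        index, c = divmod(index, n)
        coord.append(c)
    return tuple(reversed(coord))


def output_coord(a_coord, a_shape, b_shape):
    """Coordinate in B of the element stored at `a_coord` in A."""
    return unravel_index(ravel_index(a_coord, a_shape), b_shape)


print(output_coord((1, 2), [2, 3], [3, 2]))  # (2, 1)
```

Because only the index arithmetic changes, the same underlying buffer can serve both `A` and `B`, which is what makes a virtual view possible.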

## Change log

No changes yet.

## Current maintainers

* Jeremy Maitin-Shepard ([@jbms](https://github.com/jbms)), Google

codecs/reshape/schema.json

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "name": {
      "const": "reshape"
    },
    "configuration": {
      "type": "object",
      "properties": {
        "shape": {
          "type": "array",
          "items": {
            "oneOf": [
              {"type": "integer", "minimum": -1},
              {"type": "array", "items": {"type": "integer", "minimum": 0}}
            ]
          }
        }
      },
      "required": ["shape"],
      "additionalProperties": false
    }
  },
  "required": ["name", "configuration"],
  "additionalProperties": false
}
