|
| 1 | +--- |
| 2 | +title: "Cookbook: Leveraging Arrays with WOQL" |
| 3 | +nextjs: |
| 4 | + metadata: |
| 5 | + title: "Cookbook: Leveraging Arrays with WOQL" |
| 6 | + description: Learn to query and manipulate TerminusDB multidimensional arrays using WOQL patterns for efficient data access and processing |
| 7 | + keywords: arrays, WOQL, multidimensional arrays, data querying, array manipulation, terminusdb |
| 8 | + openGraph: |
| 9 | + images: https://assets.terminusdb.com/docs/technical-documentation-terminuscms-og.png |
| 10 | +media: [] |
| 11 | +--- |
| 12 | + |
| 13 | +## Overview |
| 14 | + |
| 15 | +TerminusDB's multidimensional arrays are powerful structures for storing ordered collections with random access capabilities. This guide teaches you how to effectively query and manipulate arrays using WOQL (Web Object Query Language). |
| 16 | + |
| 17 | +Arrays in TerminusDB are implemented using intermediate indexed objects with specific triple patterns that allow for efficient multidimensional data access. Understanding these internal structures is key to writing effective WOQL queries. |
| 18 | + |
| 19 | +## Understanding Array Storage |
| 20 | + |
| 21 | +### Internal Triple Structure |
| 22 | + |
| 23 | +TerminusDB stores arrays using a specific triple pattern with intermediate objects: |
| 24 | + |
| 25 | +- **`sys:value`**: Contains the actual array element value |
| 26 | +- **`sys:index`**: First dimension index (0-based) |
| 27 | +- **`sys:index2`**: Second dimension index (for 2D arrays) |
| 28 | +- **`sys:indexN`**: Nth dimension index (for N-dimensional arrays) |
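
The storage pattern listed above can be pictured with a small plain-JavaScript model. This is only an illustrative sketch, not TerminusDB's implementation: the `toElementTriples` helper and the `element/i/j` identifiers are hypothetical, and it assumes that a gap (null) produces no intermediate element at all.

```javascript
// Illustrative model only: expands a nested 2D array into [subject, predicate, object]
// tuples that mirror the sys:index / sys:index2 / sys:value pattern.
// The element ids ("element/0/1") are made up for this sketch.
function toElementTriples(matrix) {
  const triples = [];
  matrix.forEach((row, i) => {
    row.forEach((value, j) => {
      // Assumption: a gap (null) yields no intermediate element.
      if (value === null || value === undefined) return;
      const element = `element/${i}/${j}`;
      triples.push([element, "sys:index", i]);   // first dimension, 0-based
      triples.push([element, "sys:index2", j]);  // second dimension
      triples.push([element, "sys:value", value]);
    });
  });
  return triples;
}

const exampleTriples = toElementTriples([
  [1.5, 2.5],
  [null, 4.0],
]);
console.log(exampleTriples.length); // 9 (three stored elements, three triples each)
```

Each stored element contributes one triple per dimension plus one for its value, which is why the WOQL patterns in this guide always join on the same element variable.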
| 29 | + |
| 30 | +### Schema Definition |
| 31 | + |
| 32 | +Here's how to define an array in your schema: |
| 33 | + |
| 34 | +```json |
| 35 | +{ |
| 36 | + "@type": "@context", |
| 37 | + "@base": "http://i/", |
| 38 | + "@schema": "http://s/" |
| 39 | +} |
| 40 | + |
| 41 | +{ |
| 42 | + "@id": "DataMatrix", |
| 43 | + "@type": "Class", |
| 44 | + "@key": {"@type": "Random"}, |
| 45 | + "name": "xsd:string", |
| 46 | + "measurements": { |
| 47 | + "@type": "Array", |
| 48 | + "@dimensions": 2, |
| 49 | + "@class": "xsd:decimal" |
| 50 | + } |
| 51 | +} |
| 52 | +``` |
| 53 | + |
| 54 | +## Basic Array Querying Patterns |
| 55 | + |
| 56 | +### Pattern 1: Finding Array Elements by Value |
| 57 | + |
| 58 | +To find array elements with specific values, use the internal storage pattern: |
| 59 | + |
| 60 | +```javascript |
| 61 | +let v = Vars("element", "index1", "index2", "value") |
| 62 | + |
| 63 | +triple(v.element, "sys:value", v.value) |
| 64 | + .triple(v.element, "sys:index", v.index1) |
| 65 | + .triple(v.element, "sys:index2", v.index2) |
| 66 | + .eq(v.value, 42) |
| 67 | +``` |
| 68 | + |
| 69 | +**What this does**: Finds all array elements where the value equals 42, returning their coordinates. |
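
Once such a query returns, the coordinates usually need to be turned back into a nested array. The following is a minimal sketch in plain JavaScript, under two assumptions that are not confirmed by this guide: that each result row is an object keyed by the variable names (`index1`, `index2`, `value`), and that literals arrive as `{"@type": ..., "@value": ...}` objects. The helper names and sample rows are hypothetical.

```javascript
// Unwrap a literal of the assumed form { "@type": ..., "@value": ... };
// anything else is returned unchanged.
function literalValue(term) {
  return term !== null && typeof term === "object" && "@value" in term
    ? term["@value"]
    : term;
}

// Place each (index1, index2, value) row into a nested array.
// Positions that no row refers to stay null.
function bindingsToMatrix(bindings) {
  const matrix = [];
  for (const row of bindings) {
    const i = Number(literalValue(row.index1));
    const j = Number(literalValue(row.index2));
    while (matrix.length <= i) matrix.push([]);
    while (matrix[i].length <= j) matrix[i].push(null);
    matrix[i][j] = literalValue(row.value);
  }
  return matrix;
}

// Hypothetical sample rows, shaped like the assumed query result.
const sampleBindings = [
  {
    index1: { "@type": "xsd:nonNegativeInteger", "@value": 0 },
    index2: { "@type": "xsd:nonNegativeInteger", "@value": 1 },
    value: { "@type": "xsd:decimal", "@value": 42 },
  },
  {
    index1: { "@type": "xsd:nonNegativeInteger", "@value": 1 },
    index2: { "@type": "xsd:nonNegativeInteger", "@value": 0 },
    value: { "@type": "xsd:decimal", "@value": 42 },
  },
];

const rebuilt = bindingsToMatrix(sampleBindings);
console.log(JSON.stringify(rebuilt)); // [[null,42],[42]]
```

Rows come out ragged because only positions that appear in the result are filled; pad them afterwards if a rectangular matrix is required.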
| 70 | + |
| 71 | +### Pattern 2: Accessing Elements by Index |
| 72 | + |
| 73 | +Array indices are stored as `xsd:nonNegativeInteger` values, so they must be matched with explicitly typed literals. To retrieve a specific array element by its coordinates, use the snippet below.
| 74 | + |
| 75 | +Variables are expressed using the implicit style, with the `v:` prefix in this example. |
| 76 | + |
| 77 | +```javascript |
| 78 | +triple("v:doc", "measurements", "v:element") |
| 79 | + .triple("v:element", "sys:index", literal(0, "xsd:nonNegativeInteger")) |
| 80 | + .triple("v:element", "sys:index2", literal(1, "xsd:nonNegativeInteger")) |
| 81 | + .triple("v:element", "sys:value", "v:value") |
| 82 | +``` |
| 83 | + |
| 84 | +**What this does**: Gets the value at position [0,1] in the measurements array. |
| 85 | + |
| 86 | +### Pattern 3: Range Queries on Array Indices |
| 87 | + |
| 88 | +For multidimensional range queries: |
| 89 | + |
| 90 | +```javascript |
| 91 | +let v = Vars("element", "index1", "index2", "value") |
| 92 | + |
| 93 | +triple(v.element, "sys:value", v.value) |
| 94 | + .triple(v.element, "sys:index", v.index1) |
| 95 | + .triple(v.element, "sys:index2", v.index2) |
| 96 | + .greater(v.index1, 2) |
| 97 | + .less(v.index2, 5) |
| 98 | + .greater(v.value, 10) |
| 99 | +``` |
| 100 | + |
| 101 | +**What this does**: Finds elements where first index > 2, second index < 5, and value > 10. |
| 102 | + |
| 103 | +## Advanced Array Operations |
| 104 | + |
| 105 | +### Sparse Array Handling |
| 106 | + |
| 107 | +Arrays can have gaps (null values). To handle sparse arrays without failing the query or subquery, use the opt(ional) pattern: |
| 108 | + |
| 109 | +```javascript |
| 110 | +let v = Vars("doc", "element", "row", "col", "hasValue")
| 111 | +
| 112 | +// Look up the value at each [row, col] position. Because the lookup is optional,
| 113 | +// variables left unbound by a missing value come back as null.
| 114 | +and(
| 115 | + triple(v.doc, "measurements", v.element),
| 116 | + opt(). |
| 117 | + and( |
| 118 | + triple(v.element, "sys:index", v.row), |
| 119 | + triple(v.element, "sys:index2", v.col), |
| 120 | + triple(v.element, "sys:value", v.hasValue) |
| 121 | + ) |
| 122 | +) |
| 123 | +``` |
| 124 | + |
| 125 | +## Practical Example |
| 126 | + |
| 127 | +### Example: 3D Array Navigation |
| 128 | + |
| 129 | +Working with 3-dimensional arrays (e.g., time series data): |
| 130 | + |
| 131 | +```javascript |
| 132 | +let v = Vars("element", "x", "y", "time", "value") |
| 133 | + |
| 134 | +triple(v.element, "sys:index", v.x) // X coordinate |
| 135 | + .triple(v.element, "sys:index2", v.y) // Y coordinate |
| 136 | + .triple(v.element, "sys:index3", v.time) // Time dimension |
| 137 | + .triple(v.element, "sys:value", v.value) |
| 138 | + .eq(v.time, 5) // Specific time slice |
| 139 | +``` |
| 140 | + |
| 141 | +## Performance Tips & Best Practices |
| 142 | + |
| 143 | +TerminusDB automatically indexes all values using succinct data structures and ordered storage, which makes lookups very fast. This applies to arrays, their dimension indices, and all other data.
| 144 | +
| 145 | +Because of this, there is no need to create specific indexes in TerminusDB. Instead, the storage engine uses in-memory storage techniques to find linked values quickly.
| 146 | +
| 147 | +That said, you can still optimize queries by limiting the number of candidate bindings the engine has to unify.
| 148 | + |
| 149 | +1. **Use specific index constraints early** in your query to limit the search space. |
| 150 | + |
| 151 | +### Query Optimization Patterns |
| 152 | + |
| 153 | +#### Pattern A: Index-First Querying |
| 154 | +```javascript |
| 155 | +// Good: Start with index constraints |
| 156 | +triple(v.element, "sys:index", v.targetRow) |
| 157 | + .triple(v.element, "sys:index2", v.col) |
| 158 | + .triple(v.element, "sys:value", v.value) |
| 159 | +``` |
| 160 | + |
| 161 | +#### Pattern B: Value-Based Filtering |
| 162 | +```javascript |
| 163 | +// When searching by value, avoid scans: put the exact-value match first so the succinct auto-indexing can resolve it, then add any further constraints.
| 164 | +triple(v.element, "sys:value", v.targetValue) |
| 165 | + .triple(v.element, "sys:index", v.row) |
| 166 | + .triple(v.element, "sys:index2", v.col) |
| 167 | + .greater(v.row, 0) // Add meaningful constraints |
| 168 | +``` |
| 169 | + |
| 170 | +### Memory Considerations |
| 171 | + |
| 172 | +- **Large arrays**: Consider pagination using `limit()` and `start()` |
| 173 | +- **Sparse arrays**: Use `opt()` patterns to handle missing values gracefully |
| 174 | + |
| 175 | + |
| 176 | +## Debugging Array Queries |
| 177 | + |
| 178 | +### Inspecting Array Structure |
| 179 | + |
| 180 | +To understand how your array is stored: |
| 181 | + |
| 182 | +```javascript |
| 183 | +let v = Vars("element", "prop", "val") |
| 184 | + |
| 185 | +// View all array element properties |
| 186 | +triple(v.element, v.prop, v.val) |
| 187 | + .re("sys:(index|value)", v.prop) // Only sys properties |
| 188 | +``` |
| 189 | + |
| 190 | + |
| 191 | +## Error Handling & Edge Cases |
| 192 | + |
| 193 | +### Using Multiple Variables for All-Of Matches
| 194 | +
| 195 | +When a query must match all of several array elements at once, use a separate variable for each element. A single variable binds to one element subject per solution, so it cannot also match a second element with different indices.
| 196 | +
| 197 | +The aim is to have the solutions on a single row, which means that every variable needs to be bound independently. More advanced solutions are left as an exercise for the reader.
| 198 | + |
| 199 | +```javascript |
| 200 | +and( |
| 201 | + triple("v:doc_subject", "measurements","v:arr_subject1"), |
| 202 | + triple("v:doc_subject", "measurements","v:arr_subject2"), |
| 203 | + select(""). |
| 204 | + and( |
| 205 | + eq("v:pos_1_1", literal("1,1", "xsd:string")), |
| 206 | + eq("v:pos_1_2", literal("1,2", "xsd:string")), |
| 207 | + and( |
| 208 | + triple("v:arr_subject1", "sys:index", literal(0, "xsd:nonNegativeInteger")), |
| 209 | + triple("v:arr_subject1", "sys:index2", literal(0, "xsd:nonNegativeInteger")), |
| 210 | + triple("v:arr_subject1", "sys:index3", literal(0, "xsd:nonNegativeInteger")), |
| 211 | + triple("v:arr_subject1", "sys:value", "v:pos_1_1"), |
| 212 | + type_of("v:pos_1_1", "v:v1_type") |
| 213 | + ), |
| 214 | + and( |
| 215 | + triple("v:arr_subject2", "sys:index", literal(1, "xsd:nonNegativeInteger")), |
| 216 | + triple("v:arr_subject2", "sys:index2", literal(0, "xsd:nonNegativeInteger")), |
| 217 | + triple("v:arr_subject2", "sys:index3", literal(0, "xsd:nonNegativeInteger")), |
| 218 | + triple("v:arr_subject2", "sys:value", "v:pos_1_2"), |
| 219 | + ) |
| 220 | + ) |
| 221 | +) |
| 222 | +``` |
| 223 | + |
| 224 | +### Handling Missing Elements |
| 225 | + |
| 226 | +Always use optional patterns when element existence is uncertain: |
| 227 | + |
| 228 | +```javascript |
| 229 | +let v = Vars("doc", "element", "value") |
| 230 | + |
| 231 | +triple(v.doc, "@id", "MyDocument") |
| 232 | + .opt( |
| 233 | + and( |
| 234 | + triple(v.doc, "measurements", v.element), |
| 235 | + triple(v.element, "sys:index", literal(0, "xsd:nonNegativeInteger")),
| 236 | + triple(v.element, "sys:index2", literal(0, "xsd:nonNegativeInteger")),
| 237 | + triple(v.element, "sys:value", v.value) |
| 238 | + ) |
| 239 | + ) |
| 240 | +``` |
| 241 | + |
| 242 | +## Summary |
| 243 | + |
| 244 | +Working with arrays in WOQL requires understanding the underlying triple storage pattern. Key takeaways: |
| 245 | + |
| 246 | +- **Arrays use `sys:value`, `sys:index`, `sys:index2`, etc. for storage** |
| 247 | +- **Start queries with index constraints for better performance** |
| 248 | +- **Use optional patterns for sparse arrays** |
| 249 | +- **Consider memory usage with large multidimensional arrays** |
| 250 | +- **Debug by examining the raw triple structure** |
| 251 | + |
| 252 | +Master these patterns and you'll be able to efficiently query multidimensional array structures in TerminusDB using WOQL. |