| 1 | +--- |
| 2 | +hip: 1193 |
| 3 | +title: Record Stream to Block Stream Cutover |
| 4 | +author: Mark Blackman <mark@hashgraph.com> |
| 5 | +working-group: Steven Sheehy <@steven-sheehy>, Joseph Sinclair <@jsync-swirlds>, Dan Alvizu <dan.alvizu@swirldslabs.com>, Jasper Potts <jasper.potts@swirldslabs.com>, Keith Kowal <keith.kowal@swirldslabs.com> |
| 6 | +requested-by: |
| 7 | +type: Standards Track |
| 8 | +category: Mirror |
| 9 | +needs-hedera-review: Yes |
| 10 | +hedera-review-date: |
| 11 | +hedera-approval-status: |
| 12 | +needs-hiero-approval: Yes |
| 13 | +status: Draft |
| 14 | +created: 2025-05-08 |
| 15 | +discussions-to: https://github.com/hiero-ledger/hiero-improvement-proposals/discussions/1192 |
| 16 | +updated: 2025-05-08 |
| 17 | +requires: 1056 |
| 18 | +replaces: 679 |
| 19 | +superseded-by: |
| 20 | +--- |
| 21 | + |
| 22 | +## Abstract |
| 23 | +This HIP defines the requirements and implementation details for transitioning the Hedera network from record and event streams to block streams. This transition is a critical step toward enhancing blockchain compatibility, improving data integrity, and enabling future network optimizations such as state proofs. The proposal outlines the coordinated changes required across consensus nodes, mirror nodes, and supporting infrastructure to ensure a clean cutover without service disruption. It specifies new storage paths, file formats, and mechanisms to handle the transition state, including a marker file approach to indicate the final record stream entry before blocks begin. |
| 24 | + |
| 25 | +## Motivation |
| 26 | +As highlighted in HIP-1056 (Block Streams), the Hedera network currently produces record streams as its primary method of exposing transaction data to mirror nodes and external consumers. While functional, this approach diverges from standard blockchain architectures that organize data into blocks. By transitioning to block streams, Hedera will: |
| 27 | + |
| 28 | +1. Improve blockchain compatibility and interoperability with existing blockchain tools and infrastructure |
| 29 | +2. Provide better data integrity through block-level proofs and signatures |
| 30 | +3. Reduce storage costs through optimized block organization and eventual deduplication |
| 31 | +4. Enable future improvements like block-based state proofs and block node architecture |
| 32 | +5. Simplify the developer experience by aligning with familiar blockchain concepts |
| 33 | + |
| 34 | +This transition is a prerequisite for the planned block node architecture and represents an important step in Hedera's technical evolution. |
| 35 | + |
| 36 | +## Rationale |
| 37 | +The transition from record streams to block streams requires careful planning to ensure network continuity, data integrity, and backward compatibility. This HIP proposes a clean cutover approach with the following key design decisions: |
| 38 | + |
| 39 | +### Bucket Structure and File Format |
| 40 | + |
| 41 | +The new block stream files will follow a revised path structure that includes network, shard, realm, and node ID components: |
| 42 | + |
| 43 | +``` |
| 44 | +block/{network}-{YYYY-MM-DDTHH:mm}/{shard}/{realm}/{nodeID}/000000000000000000000000000000000000.blk.gz |
| 45 | +``` |
| 46 | + |
| 47 | +This structure provides: |
| 48 | + |
| 49 | +- Clear separation from existing record streams |
| 50 | +- Support for sharding |
| 51 | +- A node ID that identifies the source of each block |
| 52 | +- Block number-based naming for sequential processing |
| 53 | +- An ISO 8601 timestamp of the last network reset (optional; only used for resettable networks) |
| 54 | + |
| 55 | +Block files will use gzip compression and the `.blk.gz` extension, as shown in the path above, to optimize storage while maintaining reasonable processing performance. |
| 56 | + |
| 57 | +### Cutover Mechanism |
| 58 | + |
| 59 | +A marker file will be used to denote the last record stream entry, providing mirror nodes with a clear signal to transition processing from records to blocks. This approach: |
| 60 | + |
| 61 | +- Creates a clean demarcation point between systems |
| 62 | +- Allows mirror nodes to detect the transition without configuration changes |
| 63 | +- Prevents data loss during the transition |
| 64 | +- Eases mirror node operations: operators can upgrade to the required mirror node version before the block stream cutover and keep service running without interruption |
| 65 | +- Reduces mirror node costs by removing the need to poll periodically for block files |
| 66 | + |
| 67 | +### Node Requirements |
| 68 | + |
| 69 | +Consensus nodes will: |
| 70 | + |
| 71 | +- Stop producing record and event streams at the designated cutover point |
| 72 | +- Generate marker files to signal the end of record streams |
| 73 | +- Begin producing block streams with proper running hash continuity |
| 74 | +- Update uploaders to use new block paths |
| 75 | + |
| 76 | +Mirror nodes will: |
| 77 | + |
| 78 | +- Implement logic to detect marker files and switch processing modes |
| 79 | +- Support the new block path structure |
| 80 | +- Implement block proof verification using Threshold Signature Scheme (TSS) |
| 81 | +- Support retry mechanisms for failed block ingestion |
| 82 | + |
| 83 | +## User stories |
| 84 | +1. As a mirror node operator, I want a clear signal when to transition from record stream processing to block stream processing, so that I don't miss any transactions during the cutover. |
| 85 | + |
| 86 | +2. As a mirror node operator, I want to understand the new block file format and paths so that I can adjust my infrastructure to process block streams efficiently. |
| 87 | + |
| 88 | +3. As a developer building on Hedera, I want to understand how the transition affects data availability and processing so that my applications continue to function correctly. |
| 89 | + |
| 90 | +4. As a consensus node operator, I want to know exactly what changes are needed to support block streams so that I can prepare for the upgrade. |
| 91 | + |
| 92 | +5. As a mirror node operator, I need to know which version of the mirror node software I must run, and I need sufficient time to upgrade to that version before the cutover. |
| 93 | + |
| 94 | +6. As a network user, I want the transition to be seamless with no impact on transaction processing or data availability. |
| 95 | + |
| 96 | +7. As a consumer of mirror node data and associated metrics, I want the transition to be seamless with no impact on how I consume mirror node data. |
| 97 | + |
| 98 | +## Specification |
| 99 | +### 1\. Cutover Point Definition |
| 100 | + |
| 101 | +The transition from record streams to block streams will occur during a Hedera mainnet update boundary, with the following requirements: |
| 102 | + |
| 103 | +- The cutover occurs after a Hedera Freeze Upgrade transaction, during a Hedera mainnet update window |
| 104 | +- The current round reaches consensus before the cutover |
| 105 | +- The final record stream file will be completed and a marker file will be generated |
| 106 | +- The first block will contain zero transactions to ensure proper formatting without impacting network usage |
| 107 | +- Block continuity will be maintained by carrying forward: |
| 108 | + - The correct block number (incremented by one from the last record) |
| 109 | + - The running hash values, which preserve the cryptographic continuity of the Hedera blockchain |
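
The continuity requirements above can be sketched as a small check. This is only an illustration: `StreamTip`, `FirstBlock`, and their field names are hypothetical, and it assumes the first block exposes the carried-forward running hash as a "previous hash" field, which this HIP does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StreamTip:
    """Last entry of the record stream (hypothetical shape)."""
    block_number: int
    running_hash: bytes


@dataclass(frozen=True)
class FirstBlock:
    """Fields of the first block relevant to the cutover (hypothetical shape)."""
    block_number: int
    previous_hash: bytes
    transaction_count: int


def check_cutover_continuity(tip: StreamTip, block: FirstBlock) -> List[str]:
    """Return a list of violations; an empty list means the checks passed."""
    problems = []
    # The block number must be incremented by one from the last record.
    if block.block_number != tip.block_number + 1:
        problems.append("block number is not last record number + 1")
    # Assumption: the running hash is carried forward as the previous hash.
    if block.previous_hash != tip.running_hash:
        problems.append("running hash was not carried forward")
    # The first block is expected to contain zero transactions.
    if block.transaction_count != 0:
        problems.append("first block is not empty")
    return problems
```

A check like this could run in the end-to-end tests before any mirror node ingests the first block.
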
| 110 | + |
| 111 | +### 2\. Consensus Node Requirements |
| 112 | + |
| 113 | +#### 2.1 Record and Event Stream Termination |
| 114 | + |
| 115 | +Consensus nodes will: |
| 116 | + |
| 117 | +- Perform the freeze operation and stop accepting transactions |
| 118 | +- Allow the freeze round to reach consensus |
| 119 | +- Ensure all signatures are collected and files are properly finalized |
| 120 | +- Produce the final record file |
| 121 | +- Stop producing both record and event streams at the designated cutover point |
| 122 | +- Generate marker files to indicate the end of record and event streams |
| 123 | + |
| 124 | +#### 2.2 Block Stream Initialization |
| 125 | + |
| 126 | +Consensus nodes will: |
| 127 | + |
| 128 | +- Begin construction of the first block with zero transactions |
| 129 | +- Apply the correct running hash and block number from the final record |
| 130 | +- Configure uploaders to write block files to the new block path structure |
| 131 | + |
| 132 | +NOTE: All event stream format changes must be completed prior to the cutover. |
| 133 | + |
| 134 | +#### 2.3 Path Structure Implementation |
| 135 | + |
| 136 | +Consensus nodes will implement the new path structure, for example: |
| 137 | + |
| 138 | +``` |
| 139 | +block/mainnet/0/0/10/000000000000000000000000000083406349.blk.gz |
| 140 | +``` |
| 141 | + |
| 142 | +- The network value will identify the network (mainnet, testnet, etc.) |
| 143 | +- The shard and realm values will be included in the directory structure |
| 144 | +- The nodeID will identify the consensus node that produced the block |
| 145 | +- Block files will use a 36-digit zero-padded block number format |
| 146 | +- Files will use gzip compression and the `.blk.gz` extension |
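
As a rough illustration of the path rules above, the following sketch builds a block file path. The function name and parameters are hypothetical. The shard-then-realm ordering and the `{network}-{timestamp}` prefix for resettable networks are assumptions drawn from the path template in this HIP, not a confirmed uploader implementation.

```python
from datetime import datetime
from typing import Optional

BLOCK_NUMBER_DIGITS = 36  # zero-padded width stated in this section
EXTENSION = ".blk.gz"


def block_file_path(network: str, shard: int, realm: int, node_id: int,
                    block_number: int,
                    reset_time: Optional[datetime] = None) -> str:
    """Build the bucket path for one block file (illustrative only)."""
    if block_number < 0:
        raise ValueError("block number must be non-negative")
    name = f"{block_number:0{BLOCK_NUMBER_DIGITS}d}{EXTENSION}"
    if len(name) != BLOCK_NUMBER_DIGITS + len(EXTENSION):
        raise ValueError("block number does not fit in 36 digits")
    # Assumption: resettable networks append the time of the last reset.
    prefix = network
    if reset_time is not None:
        prefix = f"{network}-{reset_time:%Y-%m-%dT%H:%M}"
    return f"block/{prefix}/{shard}/{realm}/{node_id}/{name}"


# Reproduces the shape of the example path shown above.
print(block_file_path("mainnet", 0, 0, 10, 83406349))
```

Zero-padding to a fixed width keeps lexicographic and numeric ordering identical, so a plain bucket listing returns blocks in sequence.
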
| 147 | + |
| 148 | +### 3\. Mirror Node Requirements |
| 149 | + |
| 150 | +#### 3.1 Processing Logic |
| 151 | + |
| 152 | +Mirror nodes will implement logic to: |
| 153 | + |
| 154 | +- Detect marker files indicating the end of record streams |
| 155 | +- Switch from record stream processing to block stream processing |
| 156 | +- Apply the new bucket directory structure and path conventions |
| 157 | +- Store block data in a format compatible with existing database schemas |
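
The detection and switch logic could look roughly like the sketch below. This HIP does not define the marker file's name or format, so the `is_marker` predicate is supplied by the caller, and the `.marker` suffix in the usage lines is purely hypothetical. The sketch also assumes that all final record files are uploaded before the marker appears.

```python
from enum import Enum
from typing import Callable, Iterable, List, Tuple


class Mode(Enum):
    RECORDS = "records"
    BLOCKS = "blocks"


def plan_poll(mode: Mode, listed_names: Iterable[str],
              is_marker: Callable[[str], bool]) -> Tuple[Mode, List[str]]:
    """Return the mode for the next poll and the record files to ingest now."""
    if mode is Mode.BLOCKS:
        return Mode.BLOCKS, []  # the switch is one-way
    names = list(listed_names)
    # Drain every record file in the listing before switching modes.
    records = sorted(n for n in names if not is_marker(n))
    saw_marker = any(is_marker(n) for n in names)
    return (Mode.BLOCKS if saw_marker else Mode.RECORDS), records


# Hypothetical file names; the real marker naming is not specified here.
listing = ["2025-05-08T12_00_00.rcd.gz", "2025-05-08T12_00_02.marker"]
mode, to_ingest = plan_poll(Mode.RECORDS, listing, lambda n: n.endswith(".marker"))
print(mode, to_ingest)
```

Because the decision depends only on the bucket listing, the switch needs no configuration change on the mirror node.
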
| 158 | + |
| 159 | +#### 3.2 Block Verification |
| 160 | + |
| 161 | +Mirror nodes will: |
| 162 | + |
| 163 | +- Implement block proof verification using Threshold Signature Scheme (TSS) |
| 164 | +- Implement retry mechanisms for cases where block verification fails: the mirror node should retry the same block number from another node until the block is processed correctly. |
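
One way to picture the retry behaviour is the sketch below. The `download` and `verify` callables are hypothetical stand-ins for the bucket client and the TSS block proof verifier, neither of which is defined in this HIP. Bounding the number of rounds is a design choice of the sketch; the text above only says to retry until the block is processed correctly.

```python
from typing import Callable, Optional, Sequence


def fetch_verified_block(block_number: int,
                         node_ids: Sequence[int],
                         download: Callable[[int, int], bytes],
                         verify: Callable[[bytes], bool],
                         max_rounds: int = 3) -> Optional[bytes]:
    """Try each node in turn until one serves a block that verifies."""
    for _ in range(max_rounds):
        for node_id in node_ids:
            try:
                data = download(node_id, block_number)
            except OSError:
                continue  # file missing or unreachable on this node
            if verify(data):
                return data
            # Verification failed: fall through and try the next node.
    return None  # caller decides whether to alert or keep waiting
```

A production implementation would likely add a delay between rounds, which is omitted here for brevity.
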
| 165 | + |
| 166 | +#### 3.3 Bootstrap and Recovery |
| 167 | + |
| 168 | +Mirror nodes will: |
| 169 | + |
| 170 | +- Implement logic to process both block files and record files |
| 171 | +- Update the timestamp-based startup logic to use block numbers rather than timestamps |
| 172 | +- Take PostgreSQL snapshots before the transition to enable mirror node bootstrapping |
| 173 | +- Support recovery in cases where block processing fails |
| 174 | + |
| 175 | +### 4\. Test Plan and Verification |
| 176 | + |
| 177 | +The end-to-end test plan will include: |
| 178 | + |
| 179 | +1. Verification of logic for writing block streams |
| 180 | +2. Verification of block production and correctness |
| 181 | +3. Verification of ingestion and cloud storage |
| 182 | +4. Verification of block proof creation and verification following node upgrade |
| 183 | +5. Verification of Mirror Node transition detection and cutover capability |
| 184 | + |
| 185 | +### 5\. Disaster Recovery Considerations |
| 186 | + |
| 187 | +The following risk scenarios must be addressed: |
| 188 | + |
| 189 | +#### 5.1 Pre-First-Block Risks |
| 190 | + |
| 191 | +- Node failure during cutover |
| 192 | +- Uploader misconfiguration (incorrect path, cloud provider, etc.) |
| 193 | +- No blocks being produced |
| 194 | +- Record or event streams not stopping as intended |
| 195 | + |
| 196 | +#### 5.2 First Block Production Risks |
| 197 | + |
| 198 | +- Incorrect running hash |
| 199 | +- Incorrect block number |
| 200 | +- Malformed block data or schema |
| 201 | +- Invalid block proof |
| 202 | +- Node reconnection issues after maintenance window |
| 203 | + |
| 204 | +### Example Specification |
| 205 | +NA |
| 206 | + |
| 207 | +## Backwards Compatibility |
| 208 | +This transition introduces a fundamental change in how transaction data is exposed by the Hedera network. |
| 209 | + |
| 210 | +To maintain backward compatibility: |
| 211 | + |
| 212 | +- Historical record stream data will remain available in its current format and location |
| 213 | +- Mirror nodes will support both record stream and block stream processing |
| 214 | +- The first block will maintain cryptographic continuity with the last record |
| 215 | + |
| 216 | +Long-term, as the ecosystem transitions to block nodes, historic record streams and cloud bucket access may eventually be deprecated, but this would require a separate HIP and migration plan. |
| 217 | + |
| 218 | +## Security Implications |
| 219 | +The transition to block streams introduces several security considerations: |
| 220 | + |
| 221 | +1. **Data Integrity**: Ensuring the running hash continuity between record streams and block streams is critical to maintain the cryptographic integrity of the ledger. |
| 222 | + |
| 223 | +2. **Block Verification**: Mirror nodes must properly implement the TSS verifier to validate block proofs. |
| 224 | + |
| 225 | +3. **Disaster Recovery**: Proper procedures must be in place to handle node failures or other issues during the cutover. |
| 226 | + |
| 227 | +4. **Data Availability**: Ensuring no transactions are lost during the transition is critical for maintaining network trust. |
| 228 | + |
| 229 | +## How to Teach This |
| 230 | +The transition from record streams to block streams represents a significant architectural change for the Hedera network. To help the community understand this change: |
| 231 | + |
| 232 | +1. **For Mirror Node Operators:** |
| 233 | + |
| 234 | + - Provide detailed documentation on the new block file format and paths |
| 235 | + - Explain how to detect the transition marker and switch processing modes |
| 236 | + - Offer guidance on handling the transition period |
| 237 | + |
| 238 | +2. **For Developers:** |
| 239 | + |
| 240 | + - Explain the conceptual shift from records to blocks |
| 241 | + - Document how transaction data will be organized in the new format |
| 242 | + - Provide examples of how to process block data |
| 243 | + |
| 244 | +3. **For Node Operators:** |
| 245 | + |
| 246 | + - Detail the changes required to consensus node configuration |
| 247 | + - Explain the new uploader paths and configurations |
| 248 | + - Provide a timeline and checklist for the upgrade |
| 249 | + |
| 250 | +## Reference Implementation |
| 251 | +NA |
| 252 | + |
| 253 | +## Rejected Ideas |
| 254 | +None to note |
| 255 | + |
| 256 | +## Open Issues |
| 257 | +Block Stream Protobuf Definitions: The block stream protobuf definitions are still under development. Several open items remain. |
| 258 | + |
| 259 | +Block Deduplication Strategy: A decision is needed on whether to implement deduplication of blocks from multiple nodes before block nodes are fully operational, or to accept the additional storage costs during the transition period. |
| 260 | + |
| 261 | +Timestamp to Block Number Mapping: A solution is needed for mirror nodes to map between timestamps and block numbers, particularly for startup and historical data access. |
| 262 | +## References |
| 263 | +A collection of URLs used as references throughout the HIP. |
| 264 | + |
| 265 | +## Copyright/license |
| 266 | +This document is licensed under the Apache License, Version 2.0 — |
| 267 | +see [LICENSE](../LICENSE) or <https://www.apache.org/licenses/LICENSE-2.0>. |