Expand description
Low-level tree manipulations and other sharp tools
The target audience for this module is projects like Bao, which work directly with the interior hashes (“chaining values”) of BLAKE3 chunks and subtrees. For example, you could use these functions to implement a BitTorrent-like protocol using the BLAKE3 tree structure, or to hash an input that’s distributed across different machines. These use cases are advanced, and most applications don’t need this module. Also:
Warning: This module is hazardous material. If you’ve heard folks say don’t roll your
own crypto, this is the sort of thing they’re talking about. These functions have complicated
requirements, and any mistakes will give you garbage output and/or break the security
properties that BLAKE3 is supposed to have. Read section 2.1 of the BLAKE3
paper to understand the
tree structure you need to maintain. Test your code against blake3::hash
and make sure you can get the same outputs for lots of different
inputs.
On the other hand:
Encouragement: Playing with these functions is a great way to learn how BLAKE3 works on the inside. Have fun!
The main entrypoint for this module is the HasherExt
trait, particularly the
set_input_offset
and
finalize_non_root
methods. These let you compute the chaining
values of individual chunks or subtrees. You then combine these chaining values into larger
subtrees using merge_subtrees_non_root
and finally (once at the very top)
merge_subtrees_root
or merge_subtrees_root_xof
.
§Examples
Here’s an example of computing all the interior hashes in a 3-chunk tree:
root
/ \
parent \
/ \ \
chunk0 chunk1 chunk2
use blake3::{Hasher, CHUNK_LEN};
use blake3::hazmat::{merge_subtrees_non_root, merge_subtrees_root, Mode};
use blake3::hazmat::HasherExt; // an extension trait for Hasher
let chunk0 = [b'a'; CHUNK_LEN];
let chunk1 = [b'b'; CHUNK_LEN];
let chunk2 = [b'c'; 42]; // The final chunk can be short.
// Compute the non-root hashes ("chaining values") of all three chunks. Chunks or subtrees
// that don't begin at the start of the input use `set_input_offset` to say where they begin.
let chunk0_cv = Hasher::new()
// .set_input_offset(0) is the default.
.update(&chunk0)
.finalize_non_root();
let chunk1_cv = Hasher::new()
.set_input_offset(CHUNK_LEN as u64)
.update(&chunk1)
.finalize_non_root();
let chunk2_cv = Hasher::new()
.set_input_offset(2 * CHUNK_LEN as u64)
.update(&chunk2)
.finalize_non_root();
// Join the first two chunks with a non-root parent node and compute its chaining value.
let parent_cv = merge_subtrees_non_root(&chunk0_cv, &chunk1_cv, Mode::Hash);
// Join that parent node and the third chunk with a root parent node and compute the hash.
let root_hash = merge_subtrees_root(&parent_cv, &chunk2_cv, Mode::Hash);
// Double check that we got the right answer.
let mut combined_input = Vec::new();
combined_input.extend_from_slice(&chunk0);
combined_input.extend_from_slice(&chunk1);
combined_input.extend_from_slice(&chunk2);
assert_eq!(root_hash, blake3::hash(&combined_input));
Hashing many chunks together is important for performance, because it allows the implementation
to use SIMD parallelism internally. (AVX-512 for
example needs 16 chunks to really get going.) We can reproduce parent_cv
by hashing chunk0
and chunk1
at the same time:
let left_subtree_cv = Hasher::new()
// .set_input_offset(0) is the default.
.update(&combined_input[..2 * CHUNK_LEN])
.finalize_non_root();
assert_eq!(left_subtree_cv, parent_cv);
// Using multiple updates gives the same answer, though it's not as efficient.
let mut subtree_hasher = Hasher::new();
// Again, .set_input_offset(0) is the default.
subtree_hasher.update(&chunk0);
subtree_hasher.update(&chunk1);
assert_eq!(left_subtree_cv, subtree_hasher.finalize_non_root());
However, hashing multiple chunks together must respect the overall tree structure. Hashing
chunk0
and chunk1
together is valid, but hashing chunk1
and chunk2
together is
incorrect and gives a garbage result that will never match a standard BLAKE3 hash. The
implementation includes a few best-effort asserts to catch some of these mistakes, but these
checks aren’t guaranteed. For example, this second call to update
currently panics:
let oops = Hasher::new()
.set_input_offset(CHUNK_LEN as u64)
.update(&chunk1)
// PANIC: "the subtree starting at 1024 contains at most 1024 bytes"
.update(&chunk2)
.finalize_non_root();
For more on valid tree structures, see the docs for and left_subtree_len
and
max_subtree_len
, and see section 2.1 of the BLAKE3
paper. Note that the
merging functions (merge_subtrees_root
and friends) don’t know the shape of the left and
right subtrees you’re giving them, and they can’t help you catch mistakes. The best way to
catch mistakes with these is to compare your root output to the blake3::hash
of the same input.
Enums§
- Mode
- The
mode
argument tomerge_subtrees_root
and friends
Traits§
Functions§
- hash_
derive_ key_ context - Hash a
derive_key
context string and return aContextKey
. - left_
subtree_ len - Given the length in bytes of either a complete input or a subtree input, return the number of bytes that belong to its left child subtree. The rest belong to its right child subtree.
- max_
subtree_ len - The maximum length of a subtree in bytes, given its starting offset in bytes
- merge_
subtrees_ non_ root - Compute a non-root parent node chaining value from two child chaining values.
- merge_
subtrees_ root - Compute a root hash from two child chaining values.
- merge_
subtrees_ root_ xof - Build a root
OutputReader
from two child chaining values.
Type Aliases§
- Chaining
Value - “Chaining value” is the academic term for a non-root or non-final hash.
- Context
Key - An alias to distinguish
hash_derive_key_context
outputs from other keys.