Hash Tree

By “hash” in the following we always mean a secure collision resistant hash, and probably SHA-1.

An array of data is divided into equal size hunks which are each hashed and those hashes collected into an array, which itself is hashed by hunks until one root hash is produced for the entire original array. We imagine the root at the top and the pages at the bottom.

More particularly the original array is divided into pages and the lowest hash array is one hash per page. The hash arrays are themselves hashed by 16s for the higher hash levels. I will call these hash groups groups for the more natural name, “nodes”, conflicts with a term that we shall need in this context.

The hash path from the root to a page is a sequence of groups that starts at the top and extends to the page, each containing the hash of the next. We also have the concept of a path portion which is a contiguous part of the hash path down to and including the page.

Ralph Merkle’s thesis covers many of these ideas and I have not read it. Some of the ideas here undoubtedly come indirectly from Ralph however.