Structure of a Block:
A block is a container data structure that aggregates transactions for inclusion
in the public ledger, the blockchain. The block is made of a header,
containing metadata, followed by a long list of transactions that make up the
bulk of its size. The block header is 80 bytes, whereas the average transaction
is at least 250 bytes and the average block contains more than 500
transactions. A complete block, with all transactions, is therefore more than
1,000 times larger than the block header.
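As a rough check of the comparison above, the following sketch plugs in the lower-bound figures quoted in the text. The variable names are illustrative only, and the small transaction-count field that sits between the header and the transactions is ignored.

```python
# Lower-bound figures quoted in the text above.
HEADER_BYTES = 80
AVG_TX_BYTES = 250        # "at least 250 bytes"
AVG_TX_PER_BLOCK = 500    # "more than 500 transactions"

tx_bytes = AVG_TX_BYTES * AVG_TX_PER_BLOCK   # bytes taken up by transactions
block_bytes = HEADER_BYTES + tx_bytes        # header plus transactions
ratio = block_bytes / HEADER_BYTES           # how many headers fit in one block

print(f"transactions: {tx_bytes} bytes, block/header ratio: {ratio:.0f}x")
```

Because both inputs are lower bounds, the ratio this prints is itself a lower bound on the size difference.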
Block Header:
The block header consists of three sets of block metadata. First, there is a
reference to a previous block hash, which connects this block to the previous
block in the blockchain. The second set of metadata, namely the difficulty
target, timestamp, and nonce, relates to the mining competition. The third is
the merkle tree root, a data structure used to efficiently summarize all the
transactions in the block.
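To make the header layout concrete, here is a minimal sketch that splits a serialized 80-byte header into its fields. It assumes the standard little-endian serialization, which also carries a 4-byte version field not discussed above. The names `BlockHeader` and `parse_header` are hypothetical, and the sample bytes are taken to be the header of the first bitcoin block.

```python
import struct
from typing import NamedTuple

class BlockHeader(NamedTuple):
    version: int
    prev_block_hash: str   # hex, in the byte order block explorers display
    merkle_root: str       # hex, same display order
    timestamp: int         # seconds since the Unix epoch
    bits: int              # compact encoding of the difficulty target
    nonce: int

def parse_header(raw: bytes) -> BlockHeader:
    """Split an 80-byte serialized header into its six fields."""
    if len(raw) != 80:
        raise ValueError("a block header is exactly 80 bytes")
    # 4 + 32 + 32 + 4 + 4 + 4 = 80 bytes, all little-endian.
    version, prev, root, timestamp, bits, nonce = struct.unpack("<i32s32sIII", raw)
    # Hashes are serialized byte-reversed relative to how they are usually printed.
    return BlockHeader(version, prev[::-1].hex(), root[::-1].hex(), timestamp, bits, nonce)

# Assumed sample input: the serialized header of the first block.
GENESIS_HEADER_HEX = (
    "01000000"                                                            # version
    + "00" * 32                                                           # previous block hash
    + "3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a"  # merkle root
    + "29ab5f49"                                                          # timestamp
    + "ffff001d"                                                          # difficulty target (bits)
    + "1dac2b7c"                                                          # nonce
)

header = parse_header(bytes.fromhex(GENESIS_HEADER_HEX))
print(header)
```

The previous block hash of this sample is all zeros because the first block has no predecessor.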
Block Identifiers: Block Header Hash and Block Height
The primary identifier of a block is its cryptographic hash, a digital
fingerprint, made by hashing the block header twice through the SHA256
algorithm. The resulting 32-byte hash is called the block hash but is more
accurately the block header hash, because only the block header is used to
compute it. For example,
000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
is the block hash of the first bitcoin block ever created. The block hash
identifies a block uniquely and unambiguously and can be independently
derived by any node by simply hashing the block header.
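The derivation described above can be sketched in a few lines using Python's standard `hashlib`. The serialized header bytes below are an assumption (they are not given in the text), as is the final byte reversal, which reflects the convention of displaying hashes in the opposite byte order from the raw digest. The function name `block_hash` is hypothetical.

```python
import hashlib

def block_hash(header: bytes) -> str:
    """Hash an 80-byte block header twice with SHA256 and return display-order hex."""
    first = hashlib.sha256(header).digest()
    second = hashlib.sha256(first).digest()
    return second[::-1].hex()  # reverse bytes to match the usual printed form

# Assumed serialized header of the first bitcoin block (80 bytes).
GENESIS_HEADER = bytes.fromhex(
    "01000000"
    + "00" * 32
    + "3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a"
    + "29ab5f49"
    + "ffff001d"
    + "1dac2b7c"
)

# Expected to match the block hash quoted in the text above.
print(block_hash(GENESIS_HEADER))
```

Only the 80 header bytes go into the hash. The transactions influence it indirectly, through the merkle root stored in the header.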
Note that the block hash is not actually included inside the block’s data
structure, neither when the block is transmitted on the network, nor when it is
stored on a node’s persistent storage as part of the blockchain. Instead, the
block’s hash is computed by each node as the block is received from the
network. The block hash might be stored in a separate database table as part
of the block’s metadata, to facilitate indexing and faster retrieval of blocks
from disk.
A second way to identify a block is by its position in the blockchain, called
the block height. The first block ever created is at block height 0 (zero) and is
the same block referenced earlier by the block hash
000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f.
A block can thus be identified two ways: by referencing the block hash or by
referencing the block height. Each subsequent block added “on top” of that
first block is one position “higher” in the blockchain, like boxes stacked one
on top of the other. The block height on January 1, 2017 was approximately
446,000, meaning there were 446,000 blocks stacked on top of the first block
created in January 2009.
Unlike the block hash, the block height is not a unique identifier. Although a
single block will always have a specific and invariant block height, the
reverse is not true: the block height does not always identify a single block.
Two or more blocks might have the same block height, competing for the
same position in the blockchain. The block height is also not a part of the block’s
data structure; it is not stored within the block. Each node dynamically
identifies a block’s position (height) in the blockchain when it is received
from the bitcoin network. The block height might also be stored as metadata
in an indexed database table for faster retrieval.
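The following toy index illustrates how a node might derive heights on arrival and why a height can map to more than one block. It is a sketch under simplifying assumptions, not how any particular client stores its index: the class `ChainIndex`, its attributes, and the short block labels are all hypothetical, and the first block is recognized by an all-zero previous-block hash.

```python
from collections import defaultdict

GENESIS_PREV = "00" * 32  # all-zero previous-block hash marks the first block

class ChainIndex:
    """In-memory index: block hash -> height, and height -> block hashes."""

    def __init__(self):
        self.height_of = {}                  # a block has exactly one height
        self.hashes_at = defaultdict(list)   # a height may hold several blocks

    def add_block(self, block_hash, prev_hash):
        """Record a block and return the height derived from its parent."""
        if prev_hash == GENESIS_PREV:
            height = 0
        else:
            # Raises KeyError if the parent has not been seen yet.
            height = self.height_of[prev_hash] + 1
        self.height_of[block_hash] = height
        self.hashes_at[height].append(block_hash)
        return height

index = ChainIndex()
index.add_block("genesis", GENESIS_PREV)
index.add_block("a1", "genesis")
index.add_block("b1", "genesis")   # competes with a1 for the same position
index.add_block("a2", "a1")

print(index.height_of["a2"], index.hashes_at[1])
```

Looking up by hash always yields one height, while looking up by height yields a list, which is the asymmetry described above.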
The Genesis Block:
The first block in the blockchain is called the genesis block and was created
in 2009. It is the common ancestor of all the blocks in the blockchain,
meaning that if you start at any block and follow the chain backward in time,
you will eventually arrive at the genesis block.
Every node always starts with a blockchain of at least one block because the
genesis block is statically encoded within the bitcoin client software, such
that it cannot be altered. Every node always “knows” the genesis block’s hash
and structure, the fixed time it was created, and even the single transaction
within. Thus, every node has the starting point for the blockchain, a secure
“root” from which to build a trusted blockchain.
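The idea of a fixed root that every backward walk reaches can be sketched with a toy chain. Only `GENESIS_HASH` is a real value (the hash quoted earlier). The other block labels, the `prev_of` mapping, and the function `path_to_genesis` are hypothetical.

```python
# Known to every node in advance, like a constant compiled into the software.
GENESIS_HASH = "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f"

# Toy chain: block label -> previous block. All labels except GENESIS_HASH are made up.
prev_of = {
    "block-1": GENESIS_HASH,
    "block-2a": "block-1",
    "block-2b": "block-1",    # a competing block at the same height as block-2a
    "block-3a": "block-2a",
}

def path_to_genesis(block_hash):
    """Follow previous-block links until the hardcoded genesis hash is reached."""
    path = [block_hash]
    while path[-1] != GENESIS_HASH:
        path.append(prev_of[path[-1]])
    return path

print(path_to_genesis("block-3a"))
print(path_to_genesis("block-2b"))
```

Both walks end at the same hash even though they start on different branches, which is what makes the genesis block a common ancestor.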
The following identifier hash belongs to the genesis block:
000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f
You can search for that block hash in any block explorer website, such as
blockchain.info, and you will find a page describing the contents of this
block, with a URL containing that hash:
https://blockchain.info/block/000000000019d6689c085ae165831e934ff763ae46a2a6c172
https://blockexplorer.com/block/000000000019d6689c085ae165831e934ff763
The genesis block contains a hidden message within it. The coinbase
transaction input contains the text “The Times 03/Jan/2009 Chancellor on
brink of second bailout for banks.” This message was intended to offer proof
of the earliest date this block was created, by referencing the headline of the
British newspaper The Times. It also serves as a tongue-in-cheek reminder of
the importance of an independent monetary system, with bitcoin’s launch
occurring at the same time as an unprecedented worldwide monetary crisis.
The message was embedded in the first block by Satoshi Nakamoto, bitcoin’s
creator.
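The message can be recovered by decoding the coinbase input script. The sketch below assumes the script bytes shown (they are not reproduced in the text above) and handles only direct data pushes, where a byte from 1 to 75 means "push that many bytes". The function name `script_pushes` is hypothetical and this is not a general script parser.

```python
from typing import List

def script_pushes(script: bytes) -> List[bytes]:
    """Return the data items of a script made only of direct pushes (opcodes 1-75)."""
    items, i = [], 0
    while i < len(script):
        length = script[i]
        if not 1 <= length <= 75:
            raise ValueError(f"unsupported opcode 0x{length:02x}")
        items.append(script[i + 1 : i + 1 + length])
        i += 1 + length
    return items

# Assumed coinbase input script of the genesis block.
GENESIS_COINBASE_SCRIPT = bytes.fromhex(
    "04ffff001d"                  # push 4 bytes
    "0104"                        # push 1 byte
    "45"                          # push 69 bytes: the message
    "5468652054696d657320"
    "30332f4a616e2f3230303920"
    "4368616e63656c6c6f7220"
    "6f6e20"
    "6272696e6b20"
    "6f6620"
    "7365636f6e6420"
    "6261696c6f757420"
    "666f7220"
    "62616e6b73"
)

pushes = script_pushes(GENESIS_COINBASE_SCRIPT)
message = pushes[-1].decode("ascii")
print(message)
```

The last pushed item is plain ASCII, so no special tooling is needed to read it once the script bytes are available.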
Merkle Trees:
Each block in the bitcoin blockchain contains a summary of all the
transactions in the block using a merkle tree.
A merkle tree, also known as a binary hash tree, is a data structure used for
efficiently summarizing and verifying the integrity of large sets of data.
Merkle trees are binary trees containing cryptographic hashes. The term
“tree” is used in computer science to describe a branching data structure, but
these trees are usually displayed upside down with the “root” at the top and
the “leaves” at the bottom of a diagram.
Merkle trees are used in bitcoin to summarize all the transactions in a block,
producing an overall digital fingerprint of the entire set of transactions and
providing a very efficient process to verify whether a transaction is included
in a block. A merkle tree is constructed by recursively hashing pairs of nodes
until there is only one hash, called the root, or merkle root. The cryptographic
hash algorithm used in bitcoin’s merkle trees is SHA256 applied twice, also
known as double-SHA256.
When N data elements are hashed and summarized in a merkle tree, you can
check to see if any one data element is included in the tree with at most
2*log2(N) calculations, making this a very efficient data structure.
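The recursive pairing described above can be sketched as follows. The text does not say what happens when a level has an odd number of hashes. This sketch assumes the last hash is paired with a copy of itself, which is the convention bitcoin uses. The function names and the sample transaction hashes are hypothetical.

```python
import hashlib
from typing import List

def dsha256(data: bytes) -> bytes:
    """Double-SHA256: SHA256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(leaves: List[bytes]) -> bytes:
    """Hash pairs of nodes level by level until a single root remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Hypothetical transaction hashes standing in for real ones.
txids = [dsha256(f"tx-{n}".encode()) for n in range(5)]
root = merkle_root(txids)
print(root.hex())
```

Each pass halves the number of nodes, so the tree has about log2(N) levels. That is where the logarithmic cost of checking a single element comes from.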
Merkle Trees and Simplified Payment Verification (SPV):
Merkle trees are used extensively by SPV nodes. SPV nodes don’t have all
transactions and do not download full blocks, just block headers. In order to
verify that a transaction is included in a block, without having to download
all the transactions in the block, they use an authentication path, or merkle
path.
Consider, for example, an SPV node that is interested in incoming payments
to an address contained in its wallet. The SPV node will establish a bloom
filter on its connections to peers to limit the transactions received to only
those containing addresses of interest. When a peer sees a transaction that
matches the bloom filter, it will send the block containing that transaction
using a merkleblock message. The merkleblock message contains the block
header as well as a merkle path that links the transaction of interest to the
merkle root in the block. The SPV node can use this merkle path to connect
the transaction to the block and verify that the transaction is included in the
block. The SPV node also uses the block header to link the block to the rest
of the blockchain. The combination of these two links, between the
transaction and block, and between the block and blockchain, proves that the
transaction is recorded in the blockchain. All in all, the SPV node will have
received less than a kilobyte of data for the block header and merkle path, an
amount of data that is more than a thousand times smaller than a full block
(about 1 megabyte currently).
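The check an SPV node performs with a merkle path can be sketched as below. This is a simplified model, assuming the path is a plain list of sibling hashes plus the transaction's position in the block. It is not the actual encoding used inside a merkleblock message. The function names and sample transaction hashes are hypothetical, and odd levels are assumed to duplicate their last hash.

```python
import hashlib
from typing import List, Tuple

def dsha256(data: bytes) -> bytes:
    """Double-SHA256: SHA256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def build_proof(leaves: List[bytes], index: int) -> Tuple[bytes, List[bytes]]:
    """Full-node side: return (merkle root, sibling hashes from leaf to root)."""
    path, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])      # odd count: pair the last hash with itself
        path.append(level[index ^ 1])    # sibling of the node at this level
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], path

def verify_path(leaf: bytes, index: int, path: List[bytes], root: bytes) -> bool:
    """SPV side: recompute the root from one leaf and its merkle path."""
    node = leaf
    for sibling in path:
        if index % 2 == 0:
            node = dsha256(node + sibling)   # our node is the left child
        else:
            node = dsha256(sibling + node)   # our node is the right child
        index //= 2
    return node == root

# Hypothetical transaction hashes standing in for a block's transactions.
txids = [dsha256(f"tx-{n}".encode()) for n in range(8)]
root, path = build_proof(txids, 5)
ok = verify_path(txids[5], 5, path, root)
proof_bytes = 80 + 32 * len(path)   # block header plus one 32-byte hash per level
print(ok, len(path), proof_bytes)
```

In this model the verifier needs only the 80-byte header, which supplies the root, and one 32-byte hash per tree level. The data received therefore grows with log2(N) rather than with the number of transactions in the block.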