Blockchain technology is an innovation that has brought much-needed stability and security to the cyber world. The strengths of this decentralized, distributed ledger system have made it one of the fastest-growing technologies today, leading to its adoption in various fields. In particular, it has played a significant role in the growth of safe and secure cryptocurrency transactions.
Given the popularity of the technology, it’s imperative that we know more about how data is organised and stored in blockchains. The blockchain data structure is mainly based on hash pointers, with the block as its main unit. Data structures organise and store data in such a way that it can be easily accessed and modified.
Broadly speaking, the blockchain data structure can be described as a back-linked list of transactions, arranged in blocks. The blocks can be stored in a simple database or as flat files. Each block is identified by a hash in its block header.
What is a Block?
A block is the primary blockchain data structure. The different blocks in a blockchain are identified by a hash in the block header, which is generated cryptographically using the SHA-256 algorithm. This cryptographic hash function maps data of arbitrary size to a fixed 32-byte (256-bit) value.
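As a quick illustration of the fixed-size property, the following Python sketch uses the standard library's `hashlib` module to hash inputs of very different sizes. The helper name `sha256_digest` is chosen here purely for illustration.

```python
import hashlib


def sha256_digest(data: bytes) -> bytes:
    """Return the SHA-256 digest of an input of any length."""
    return hashlib.sha256(data).digest()


short_digest = sha256_digest(b"abc")
long_digest = sha256_digest(b"x" * 1_000_000)

# Both digests are 32 bytes long, regardless of the input size.
print(len(short_digest), len(long_digest))

# Changing a single character gives an unrelated-looking digest.
print(sha256_digest(b"abc").hex())
print(sha256_digest(b"abd").hex())
```

The digest is usually displayed as 64 hexadecimal characters, which is the form in which block hashes are normally shown.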
Hash functions help identify data within a data set. Even the most basic hash function maps data of arbitrary size to a fixed-size value, called a hash value or hash code, which can serve as a key in a hash table. This provides an effective indexing mechanism that gives fast access to the required data.
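The fact that a cryptographic hash is collision-resistant adds to the efficiency of the mechanism. Collisions must exist in principle, because inputs of arbitrary size map to a fixed-size output, but it is computationally infeasible to find two different inputs that produce the same hash value. This quality of the blockchain data structure provides for easy identification of data and verification of its integrity.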
Blocks are container data structures that bring together the transactions to be included in the public ledger. Each block consists of a header and a long list of transactions. The header stores metadata, that is, data that describes the block and its contents. The header of each block consists of the following fields:
- Index – indicates the position of the block inside the blockchain. The first block is indexed ‘0’, the next ‘1’, and so on.
- Hash – the cryptographic hash that identifies the block and enables speedy identification of data in the data set.
- Previous hash – the hash of the preceding block, which links every block in the blockchain data structure to its predecessor. This feature contributes to immutability: changing one block changes its hash, so every later block would have to be recomputed, which requires an infeasible amount of computation.
- numTx – stores a count of the number of transactions included in the block.
- Timestamp – stores the time at which the block was created.
- Nonce – stores the integer (32 or 64 bits) that is used in the mining process.
- Transaction – another field, stored as an array in the body of the block. It holds the transactions included in the block. Here, the data is summarised with the help of another data structure called a Merkle tree.
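The fields listed above can be sketched as a small Python class. This is a minimal illustration under several assumptions, not a real blockchain implementation: the names `Block`, `append_block` and `is_chain_valid` are hypothetical, the hash is computed on demand over a JSON encoding of the fields, transactions are plain strings hashed directly rather than through a Merkle tree, and mining is not modelled, so the nonce stays at 0.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field


@dataclass
class Block:
    index: int            # position in the chain, starting at 0
    previous_hash: str    # hash of the preceding block
    transactions: list    # body of the block (plain strings in this sketch)
    timestamp: float = field(default_factory=time.time)
    nonce: int = 0        # would be adjusted during mining (not modelled)

    @property
    def num_tx(self) -> int:
        return len(self.transactions)

    def compute_hash(self) -> str:
        """SHA-256 over a canonical JSON encoding of the block's fields."""
        payload = json.dumps(
            {
                "index": self.index,
                "previous_hash": self.previous_hash,
                "num_tx": self.num_tx,
                "timestamp": self.timestamp,
                "nonce": self.nonce,
                "transactions": self.transactions,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: list, transactions: list) -> Block:
    """Create a block linked to the current tip and add it to the chain."""
    previous_hash = chain[-1].compute_hash() if chain else "0" * 64
    block = Block(index=len(chain), previous_hash=previous_hash,
                  transactions=transactions)
    chain.append(block)
    return block


def is_chain_valid(chain: list) -> bool:
    """Check that every block stores the hash of its predecessor."""
    return all(
        curr.previous_hash == prev.compute_hash()
        for prev, curr in zip(chain, chain[1:])
    )


chain = []
append_block(chain, ["genesis"])
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
print(is_chain_valid(chain))  # True

# Tampering with an earlier block breaks the link stored in the next one.
chain[1].transactions[0] = "alice pays bob 500"
print(is_chain_valid(chain))  # False
```

The second check fails because the third block still stores the hash of the original second block. Repairing the chain would mean recomputing every block after the modified one.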
So what are Merkle trees?
A Merkle tree is a data structure used for the efficient verification and summarisation of large chunks of data. Merkle trees are binary trees whose nodes contain cryptographic hashes, so they are also known as binary hash trees. They produce an overall fingerprint of the transactions included in the block.
A Merkle tree is built from the bottom up. The transaction data itself is not stored directly in the tree nodes. Instead, every transaction is hashed with a Secure Hash Algorithm, and the resulting hash becomes a leaf node of the tree.
Consecutive nodes are paired by concatenating their hashes and hashing the result to form a parent node. This concatenation and hashing continues until a single node remains at the top. This node is known as the Merkle root; it summarises all the transactions in the block and is recorded in the block header. Because a Merkle tree is a binary tree, when a level contains an odd number of nodes, the last node is duplicated.
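The construction described above can be sketched in a few lines of Python. This is a simplified illustration: the function name `merkle_root` is hypothetical, transactions are assumed to be raw byte strings, and each hash is a single SHA-256 pass. Real systems may differ in detail (Bitcoin, for example, applies SHA-256 twice).

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list) -> bytes:
    """Return the Merkle root of a non-empty list of byte strings."""
    if not transactions:
        raise ValueError("at least one transaction is required")

    # Leaf level: the hash of each transaction, not the transaction itself.
    level = [sha256(tx) for tx in transactions]

    while len(level) > 1:
        # An odd number of nodes: duplicate the last one.
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Pair consecutive nodes, concatenate their hashes, hash again.
        level = [
            sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]

    return level[0]


root = merkle_root([b"tx-a", b"tx-b", b"tx-c"])
print(root.hex())
```

With three transactions, the third leaf is paired with a copy of itself, so the root is the hash of `H(H(a) + H(b))` concatenated with `H(H(c) + H(c))`.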
Each entry in the transaction section of the block contains a hash and a field specifying the type of transaction. Transaction types include fee transactions, coinbase transactions and regular transactions.
Accessing transaction data from Merkle Tree
The speed and ease of data identification and access play an important role in the efficiency of any transaction. A particular transaction is located and verified by identifying the path that connects its leaf node to the root. This path is known as the Merkle path or authentication path. It consists of the sibling hashes needed to recompute the root from that leaf.
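A Merkle path can be built and checked as in the sketch below. It assumes the same simplified tree as before (single SHA-256, last node duplicated on odd levels); the function names `merkle_path` and `verify_path`, and the representation of the path as `(sibling_hash, sibling_is_left)` pairs, are choices made for this illustration.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_path(transactions: list, index: int):
    """Return (root, path) for the transaction at position `index`.

    `path` is a list of (sibling_hash, sibling_is_left) pairs, ordered
    from the leaf level up to the level just below the root.
    """
    if not 0 <= index < len(transactions):
        raise IndexError("transaction index out of range")

    level = [sha256(tx) for tx in transactions]
    path = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])      # duplicate the last node
        sibling = index ^ 1              # the other node in the pair
        path.append((level[sibling], sibling < index))
        level = [
            sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
        index //= 2                      # position of the parent node
    return level[0], path


def verify_path(tx: bytes, path: list, root: bytes) -> bool:
    """Recompute the root from one transaction and its sibling hashes."""
    current = sha256(tx)
    for sibling_hash, sibling_is_left in path:
        if sibling_is_left:
            current = sha256(sibling_hash + current)
        else:
            current = sha256(current + sibling_hash)
    return current == root


txs = [b"tx0", b"tx1", b"tx2", b"tx3", b"tx4"]
root, path = merkle_path(txs, 2)
print(verify_path(b"tx2", path, root))     # True
print(verify_path(b"forged", path, root))  # False
```

The verifier needs only the transaction, the root and one sibling hash per tree level, so the amount of data checked grows with the logarithm of the number of transactions rather than with the whole block.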
The transaction input is an array that holds the details of the tokens to be spent. More precisely, each entry points to the token being spent: the ID of the transaction the token came from, its index in that transaction's output array, the value or amount of the token, and the unlocking script, which contains a cryptographic digital signature.
The transaction output is another array in the blockchain data structure. It stores the details of where the tokens are going. Each entry includes the amount of tokens to be transferred, along with the locking script. The locking script combines script operation codes with the address of the destination to which the tokens are to be transferred. These scripts set the rules that must be met at the destination address before the tokens can be spent again.
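The two arrays can be modelled roughly as follows. This is a sketch under stated assumptions: the class and field names are hypothetical, the scripts are placeholder strings written in the style of Bitcoin's pay-to-public-key-hash opcodes rather than executable scripts, and the `is_balanced` check (outputs may not exceed inputs) is a common rule assumed here for illustration.

```python
from dataclasses import dataclass


@dataclass
class TxInput:
    prev_tx_id: str        # ID of the transaction the token came from
    output_index: int      # index in that transaction's output array
    amount: int            # value of the token being spent
    unlocking_script: str  # carries the spender's digital signature


@dataclass
class TxOutput:
    amount: int            # amount of tokens to transfer
    locking_script: str    # opcodes plus the destination address


@dataclass
class Transaction:
    inputs: list
    outputs: list

    def total_in(self) -> int:
        return sum(tx_in.amount for tx_in in self.inputs)

    def total_out(self) -> int:
        return sum(tx_out.amount for tx_out in self.outputs)

    def is_balanced(self) -> bool:
        """Assumed rule: outputs may not spend more than the inputs hold."""
        return self.total_out() <= self.total_in()


# Placeholder values in angle brackets stand in for real IDs and keys.
tx = Transaction(
    inputs=[
        TxInput("<previous-tx-id>", 0, 50, "<signature> <public-key>"),
    ],
    outputs=[
        TxOutput(30, "OP_DUP OP_HASH160 <recipient-address> "
                     "OP_EQUALVERIFY OP_CHECKSIG"),
        TxOutput(20, "OP_DUP OP_HASH160 <change-address> "
                     "OP_EQUALVERIFY OP_CHECKSIG"),
    ],
)
print(tx.total_in(), tx.total_out(), tx.is_balanced())  # 50 50 True
```

Here one 50-token input is split into a 30-token payment and a 20-token output sent back to the spender, each locked to its own destination address.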