May 27, 2017 - Blockchain 1.0. Henry Zhang, Impact Investing Intern. University of Toronto. Edited by William Michael Cunningham.
Many people have a rough idea of how the bitcoin blockchain works, but very few understand precisely how it works. This article is intended to explain blockchain in a clear, simple, and visual way.
Part 1 The Hash and the Blockchain [1]
Before we talk about blockchain, let's talk about a hash [2]. The hash is the foundation of blockchain technology.
A hash is simply a product of a hash function [3]. So what's a hash function? It is a computer program that takes in data of any size, length, format, type, and returns an output with a specific format, length, size, type, etc. This output is known as a hash.
For example, let's say you have a hash function that returns a 3-digit value; input like "adfoadihfopahfeoapfjda" will return a 3-digit value ("391"); input such as "&(*!&$@" will return a 3-digit value ("126"), and an input such as " " will still return a 3-digit value ("792"). The point is, no matter the randomness of the input, the output is always one specific thing.
Note that hash functions are not unique, there are many different versions! The one bitcoin uses is called the SHA-256 Function and it returns hashes that are 256 digits-long hexadecimals [4]) The picture at top left, Figure 1, is a visual explanation of what this means:
In the picture, on the left side are random input (data) differing in size, shape, category, etc. On the right side in the picture is a specifically formatted output, or hash, in this case, it is a five-digit code.
Hashes are unique, one hash represents one specific input. A hash never represents two or more inputs. This makes a hash an effective identifier of a specific input. The SHA-256 hash function used by bitcoin generates hashes of 256 bits. Each bit represents either a 0 or a 1. You can form 2^256 different hashes, more than enough to identify every single thing in the universe!
Hashes are asymmetrical, there is no "inverse hash function". In other words, you can apply the hash formula on some input to return some output. But there is no reverse formula that lets you reverse the output and get back to the original input. The only way to do it would be to use brute force - applying millions/billions of different inputs and see which one returns the correct output.
Hashes are random, if you change the input by a tiny, tiny, little bit, the output will be completely different. This means that you can not tell whether two inputs are correlated based on their similarity in hash. You'll see why this is a favorable characteristic in a second.
So how is all this tied to the blockchain and transactions?
Say we have a transaction described as this: "Jane gave Allen 57.2 bitcoins on September 29, 2017 at 4:35 pm." If we apply a hash function to it, we get a specifically formatted hash, such as something like this: "123910912" (random example, the actual bitcoin hash is a much longer).
This random number "123910912" is a hash: it uniquely identifies this specific transaction statement. It cannot be traced back to the transaction statement, and it cannot be inferred by knowing what a similar hashes stand for (there is absolutely no correlation between "123910912" and "123910913". For example, "123910913"could stand for 'Helen received 25 bitcoins through mining on January 6, 2002 at 1:12 am").
Now things start getting cool:
Hash functions doesn't have to be 1-input-to-1-output. It can be 2-input-to-1-output or 3-input-to-1-output. In other words, hash functions can take in multiple inputs and still generate 1 output. Hashes and hashes can combine to form new hashes. Again, a picture Figure 2, (second from left above) will help clearly illustrate the point.
It's the same idea as in Fig. 1, except this time, previous hashes are the inputs.
This is "blending the hashes." What exactly does "blending the hashes" do? Take hash123: 83513 from Figure 2, as an example.
First, it is a unique identifier of the three-input-combination. (the ball, the ball with a mark, and the square), it is the proof of existence of these three pieces of data.
Second, if all you see is the hash123: 83513, there's no way for you to figure out what the inputs are. And third, hash123: 83513 is not inferable. 83514, 83515, or 83516 does not tell you anything about 83513.
As shown, you can keep combining and combining hashes until you get to one final hash. This final hash is cryptographically secure and it's a proof that all the data went into it exists. The visual outline of this process is known as the Merkle Tree [5], the ultimate, single hash is known as the Merkle Root [6]. (the picture illustrates a 2 branch Merkle Tree, but it could have up to as many branches as needed).
This is how transactions are recorded in a block. Take, say 100 transaction statements, hash them into 100 unique, uninterpretable, un-patterned code, combine them down to 50, then 25, then 13...7...4..2..and finally down to 1; that 1 hash is a proof that ALL 100 transactions took place, and is infinitely close to being 100% secure!
Q. What happens if one hash represents two transactions or two bundles of transactions?
A. For example: What if transaction statement: "Jane gave Allen 57.2 bitcoins on September 29, 2017 at 4:35 pm.", and transaction statement: "Helen received 25 bitcoins through mining on January 6, 2002 at 1:12 am" both are identified by hash: "123910912"? The system breaks down.
Q. What happens if you can trace back to the transaction from the hash?
A. For example: by applying some formula to "123910912" and conclude "Jane gave Allen 57.2 bitcoins on September 29, 2017 at 4:35 pm." You have no privacy, and the information might be used to cause harm.
Q. What happens if there was a pattern to the hashes?
A. For example, knowing what "123910912" is gives you some idea of what "123910911" and "123910913" are? You can pretty much infer a whole bunch of things with one piece of knowledge, which is undesirable.
This is how privacy and integrity of the blockchain can be maintained at once.
Part 2 The miners and the blockchain
A miner creates hashes that represent whole blocks and attaches them to the blockchain (we'll refer to these hashes as "complete-hashes"). The first miner that successfully attaches the next complete-hash wins a prize of some number of bitcoin. The process repeats until all possible bitcoin has been mined (arbitrarily set at 21 million by the system creator, Satoshi).
There are three components that make up the complete-hash. The Merkle root we just described, representing all pending transactions generated by the network after the consolidation of the previous block, the hash of the previous block, and the nonce. We define this term below.
In the bitcoin network, a transaction is considered valid if that transaction has been authorized by the transferrer through his or her digital signature.
Because of the way blocks are setup (All transactions come down to one hash), there is no way for you to validate past transactions by flipping through the old blocks-because you can't reverse the hashes of those blocks! So, instead, you "wax" them together, creating self-evident proof. Here's what I mean in Figure 3, third from left in the pictures above).
The mere existence of the most current block is the proof that all the transactions before it are valid, since you cannot get here without this being the case. Notice that we made a very important assumption: WE ASSUME THAT ALL "CURRENT" TRANSACTIONS WERE VALID IN THE FIRST PLACE. How do you know that? This leads to the discussion of the last component: the nonce.
We need to discuss two things before we get to the nonce:
1. The consensus issue and
2. How to commit fraud.
The consensus issue
Blockchains exist as files on a network of computers connected via the internet or another network protocol. For blockchain to work, every computer on the network needs to have an exact copy of the newest blockchain. When one computer announces an updated version of the blockchain, this "work" is publicized and all the other computers on the network verify this "work".
The newest, most recent blockchain needs to be consistent - all transactions must make sense. This insure that it is impossible to have some extra bitcoins on someone's account that can't be traced back to being mined. The newest blockchain also needs to be accepted by the majority of the computers online (51% of the network participants). After consensus is reached, the newest version of blockchain is accepted by all computers.
When offline computers come online, they'll compare the copy of the blockchain they hold with the newest copy on the network. If the newest copy is longer and consistent, they'll automatically update their copy up to that version.
Fraud
Blockchain network computers only verify that updated transactions are consistent. Most types of fraud are denied because they cause inconsistencies. Note that computers fall short of detecting fraud that is consistent with the transactions (you'll see what I mean in a bit). If a hacker commits this type of fraud, and becomes the first one to announce a fraudulent and consistent update, other computers will accept his work as valid. Once the consensus reach 51% of all participants, the fraud is permanently in the record and unsolvable.
Let's talk about how fraud is committed. Imagine you are a malicious, greedy hacker (just for a while). You want to cheat the system. There are three ways you can cheat:
1. Create some bitcoins out of nowhere;
2. Steal some bitcoins from..say your friend Bob, and
3. Purchase something from Bob with bitcoins, and after receiving that something, create a fake transaction that reverses the initial transfer of bitcoins.
Option 1 will be immediately denied. It is impossible to add unauthorized bitcoins to the system, because the record proves that those bitcoins never existed in anyone's account, and aren't officially mined. Same with option 2. You can't just go in pull out some bitcoins from Bob's account without his authorization (through his digital signature, and hacking a digital signature incredibly hard and impractical). (You can, however, steal his username and password. This isn't hacking the bitcoin system. This is hacking Bob.)
As for option 3, you can't wait too long. Once those transactions are consolidated into the blockchain, you're back in the "'option 2' situation. So, your only option is to reverse the transactions before they get incorporated into the blockchain. What you want to do then, is create a fake reversal transaction, include it in the Merkle root, and be the first one to complete the hash. Other computers will check your hash, make sure it is consistent, and accept all those transactions as valid.
Now you might not be the first one to complete the hash, since many other computers are competing with you. But, based on today's computer processing power, there is a significant chance that you'll be successful. And nothing stops you from keep trying.
Enter the nonce...
Remember the randomness property of a hash? It states that by changing the input a tiny tiny bit, the output hash becomes completely different. A nonce is just a number generated by the computer. Since the final hash is the product of a hash function that takes in: 1. the Merkle root, 2. the previous hash, and 3. the nonce. By changing the nonce a little bit, say from 1 to 2, you end up with a totally different result. See Figure 4.
Now, here's the thing: you can force the result to be a certain way, or satisfy certain criteria, before it can be attached onto the blockchain.
For example, you can require the size of the output hash to be below a certain threshold value. By doing this, you are essentially forcing the computer to do a lot of computational work, as it must try many nonces until it is able to find one that satisfies the criteria.
Now what's the point of this? The answer is that it make it exceptionally, exceptionally hard for hackers to commit the type of fraud described in 3. above (known as transaction-consistent fraud). It becomes impractical.
One important build-in mechanism of the blockchain algorithm is that the more people participating in the activity of extending the blockchain/fighting over the rewarded bitcoins, or so-called mining, the harder the hash function mathematical problem becomes.
The average time to add a new block onto the existing blockchain doesn't change (for bitcoin, it's 10 minutes), but the probability that any particular participant solving out the puzzle diminishes. So, the more people involved, the more unlikely that a hacker will be able fraud the system. As of right now, it is practically impossible to the system.
Besides that, hackers today face another huge challenge: the group known as the 'mining experts"." These people that have the most powerful mining equipment on the network-huge processors that take up a room, and specifically designed to mine bitcoins. The probability that experts mine bitcoin is much higher now than the probability that a regular person will be a miner. Of course, a hacker can become a mining expert, but he'll soon find that mining equipment costs much more than he can recover from reversing transactions.
Forking
Here's a side issue I want to bring up: it is possible for two or more computers to simultaneously solve out the mathematical hash function puzzle, broadcasting two or more versions of the newest blockchain. In this circumstance, all the computers are going to receive all the versions of the blockchain but in a different chronological order. Under this circumstance, blockchain network computers work on whichever one it receives first, and put the other one aside. The tie is broken when one of the versions gets updated before all other versions (either it's the one currently working on or not); All the computers then unify on this version and abandon all the rest.
Glossary
Blockchain: a digital ledger in which transactions are recorded chronologically and made available publicly or to a network.
Hash/Hash Function: A hash function is any mathematical rule that can be used to map data of arbitrary size to data of fixed size. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.
Hexadecimal: In mathematics and computing, hexadecimal (also base 16, or hex) is a positional numeral system with a radix, or base, of 16. It uses sixteen distinct symbols, most often the symbols 0-9 to represent values zero to nine, and A, B, C, D, E, F (or alternatively a, b, c, d, e, f) to represent values ten to fifteen.
Merkle Tree: a tree in which every non-leaf node is labelled with the hash of the labels or values.
Merkle Root: the hash of all the hashes of all the transactions in the block.
Nonce: an arbitrary number that may only be used once.