Blockchain inside out: How Bitcoin works
https://vas3k.com/blog/blockchain/
Today I want to tell you why blockchain appeared, how the cryptocurrency world is organized, and why it is the smartest system of recent years in terms of pure logic. You’ll find out below.
I am far from the hype around bitcoin or exchange rates. Blockchain for me is just a piece of technology. New, strange, tricky, but unlike other stories it seems to be moving the world. Apparently, it will stick around for a long time.
I wrote this post as if I were explaining blockchain to my parents. Even my non-techie friends, reading over my shoulder, would surely figure it out.
Here is my buddy Bill. He’ll help me to illustrate what I am about to say. And if he blows it, we’ll kick him out.
The concept of blockchain was first introduced by Satoshi Nakamoto in his paper Bitcoin: A Peer-to-Peer Electronic Cash System. In as little as 8 pages, the author explained the entire bitcoin cryptocurrency system, which is built on the blockchain algorithm.
Blockchain was born as a constitutive part of the bitcoin system, but its principles can be applied and modified independently. Anyone can spin up a personal blockchain, even on a laptop.
Blockchain is a chain of blocks or, in other words, a linked list. Each entry in this list is linked to the previous one, and so on transitively back to the very first one. Think of a train, where each car is coupled to the next. There is a Russian article by Nikita Likhachev worth reading, where the same concept is spelled out for complete newcomers. My analogies are partially borrowed from there.
Let’s consider the example below.
Bill’s friends constantly drain him of money. Bill is kind and very forgetful. A week later, he no longer remembers who has not repaid him, and he hesitates to ask his friends to remind him. So one day he finally decides to get organized and writes it all down on his handy chalkboard.
From now on, Bill no longer forgets that Max has returned everything, while Bob’s debt is over $700 and keeps growing. On Saturday Bill invites Bob over to his place for a drink. When Bill turns away to mix a nice drink, Bob wipes off the entry “Lent to Bob: $200” and writes “Bob brought back $500” in the empty line.
Bill trusts his list. Therefore, he forgets about the debt and loses $700. Bill is disappointed. He decides to lock down his records.
Last year Bill learned about cryptography in a programming course. He still remembers that any string can be turned into an unrecognizable set of characters: a hash. Changing any single character in the original string completely changes the resulting hash.
Let’s say, adding just a dot at the end of the string would make the final hash unrecognizable. It gives him an idea!
Bill applies the well-known SHA-256 hash to each record on his chalkboard. Then he scribbles the resulting hash right next to the original record. Now he can sleep soundly knowing that his records haven’t been altered. When in doubt, he can always hash a record again and compare the result with the hash on the board.
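Here is roughly what Bill does, as a minimal Python sketch. The helper name `sha256_hex` and the sample record are my own assumptions for illustration; only the standard `hashlib` module is used.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

record = "Lent to Bob: $200"
tampered = record + "."  # the same record with one extra dot

# The two digests look completely unrelated.
print(sha256_hex(record))
print(sha256_hex(tampered))

# Checking a record means hashing it again and comparing with the stored hash.
stored_hash = sha256_hex(record)
print(sha256_hex(record) == stored_hash)    # True
print(sha256_hex(tampered) == stored_hash)  # False
```

Note that a hash is one-way: nothing gets decrypted here, the record is simply hashed again.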
But the EVIL RUSSIAN GENIUS IVAN is also skilled in SHA-256 and can change a record together with its hash. Especially if the hash is scribbled right on the board next to the original.
That’s why, for better protection, Bill decides to hash not only the record itself but the record combined with the hash of the previous entry. Now every entry depends on the ones before it. If you change a single dot, you are doomed to recalculate the entire cascade of hashes below it.
Now Bill has a personal linked list.
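Bill’s chained list could look something like the sketch below. The field names (`record`, `prev_hash`, `hash`) and the all-zero placeholder for the very first entry are assumptions made for this example, not something Bitcoin prescribes.

```python
import hashlib

def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry

def build_chain(records):
    """Hash each record together with the hash of the previous entry."""
    chain, prev_hash = [], GENESIS
    for record in records:
        entry_hash = sha256_hex(prev_hash + record)
        chain.append({"record": record, "prev_hash": prev_hash, "hash": entry_hash})
        prev_hash = entry_hash
    return chain

def first_broken_entry(chain):
    """Return the index of the first entry that fails the check, or None."""
    prev_hash = GENESIS
    for i, entry in enumerate(chain):
        recomputed = sha256_hex(prev_hash + entry["record"])
        if entry["prev_hash"] != prev_hash or recomputed != entry["hash"]:
            return i
        prev_hash = entry["hash"]
    return None

board = build_chain(["Lent to Max: $100", "Lent to Bob: $200", "Lent to Bob: $500"])
print(first_broken_entry(board))  # None

# Bob edits the second record but does not recompute any hashes.
board[1]["record"] = "Bob brought back $500"
print(first_broken_entry(board))  # 1
```

To get away with it, the forger has to recompute the hash of the edited entry and of every entry after it.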
One day Ivan creeps in at night, changes one of the records and updates hashes for the entire list down to the very bottom. It takes him a great deal of effort, but Bill sleeps soundly and does not know what’s going on. In the morning, Bill discovers a perfectly correct list – all hashes match. But he still feels deceived. How else on earth could he protect himself from the nightmare Ivan?
Bill decides to make Ivan’s life even more complicated. Now, before adding a new entry to his list, Bill will solve a complex mathematical equation. Then he will chain up the solution to the final hash.
Bill is a brilliant math student. But adding a record takes about ten minutes even for him. However, this time is worth it! If Ivan breaks in again, he will have to solve the equations for every transaction below the one he changed. There might be dozens of them. This will make him think twice, since each record’s equation is unique and tied to its content.
However, keeping an eye on the list is still simple. First, you compare the hashes and then check the solutions of the equations with a simple substitution. If the ends meet, the list was not altered.
In reality, equations are not such a good fit. Computers crack them quickly and easily. And where would you store tons of unique equations? Considering all that, the blockchain inventors came up with a more elegant task: find a number (a nonce) such that the hash of the entire record, nonce included, starts with 10 zeros. The nonce is difficult to dig up, but the result can always be checked at a glance.
Try this. It’s too exhausting to search manually for a hash starting with ten zeros, so let’s search for one starting with two zeros. Write anything in the Nonce field. The game stops as soon as the hash of your input starts with two zeros (00).
[Interactive widget in the original post: a Nonce input field, the resulting hash, and an attempts counter.]
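If you would rather let the computer play the game, here is a small sketch. It assumes the nonce is simply appended to the record as text and that “zeros” means leading hexadecimal digits; both are my simplifications for the example.

```python
import hashlib
from itertools import count

def find_nonce(record: str, prefix: str = "00"):
    """Try nonces 0, 1, 2, ... until SHA-256(record + nonce) starts with prefix."""
    for nonce in count():
        digest = hashlib.sha256(f"{record}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest

nonce, digest = find_nonce("Lent to Bob: $200")
print(nonce, digest)

# Checking the answer takes a single hash, however long the search was.
check = hashlib.sha256(f"Lent to Bob: $200{nonce}".encode("utf-8")).hexdigest()
print(check.startswith("00"))  # True
```

With two hex zeros the search needs about 256 attempts on average. Each extra zero makes it 16 times longer, while the check stays a single hash.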
Now smart Bill verifies all hashes and additionally(!) makes sure that each of them starts with a specified number of zeros. Nightmare Ivan, even on a powerful laptop, won’t have enough time and patience to calculate all hashes according to the rule.
This financial mechanism invented by Bill is a simple model of the blockchain. Its security is guaranteed by mathematics: there is no known way to produce such a hash other than by brute-force search for each individual record. This search is called mining. Let’s take a closer look at how it works.
Our friends liked the idea of keeping a forgery-proof list of “who owes whom”. They also do not want to bother themselves remembering who paid for whom at the bar and how much they still owe to each other: everything is written on the board. They discussed ups and downs of the idea and came to the agreement that now they need to combine one list for all.
But who can be entrusted with such important bookkeeping? When it comes to money, trust becomes the primary criterion. We would rather not trust strangers with our money. Our ancestors invented the whole banking idea for this very purpose. Later on, its credibility came to be backed by licenses, laws, and Central Bank insurance.
Friends trust each other. They choose the most responsible person to do bookkeeping. But what if we deal with strangers? A big city, country, or the whole world, as for instance in bitcoin mining? In this case, no one can trust anyone.
So they came up with an alternative approach: everyone keeps a copy of the list. The attacker would have to rewrite not just one list but sneak into each house and rewrite everyone’s list. And then it may turn out that someone kept several more copies at home that nobody knew about. This is called decentralization.
The downside of this approach is that in order to make new entries you will have to stay in touch with all other participants and to constantly bend their ears about new changes. However, if the participants aren’t humans but even-tempered calculating mechanisms, bothering them ceases to be any problem at all.
There is no single point of trust in this system, and hence no possibility of a bribery or a fraud. All network participants act in accordance with a strict rule: no one trusts anyone. Everyone trusts only personally possessed information. This is the main law of any decentralized network.
When you buy lunch you may enter your debit card PIN allowing the food chain to ask the bank if you have 5 bucks on your account. In other words, you confirm with your PIN a $5 transaction, which the bank confirms or rejects.
Our records like “Lent to Bob: $500” are also transactions. But we have no bank authorizing the person initiating them. How could we verify that Bob on the sly did not insert into the list a new entry “Max owes Bill $100,500”?
For this purpose blockchain uses a mechanism of public and private keys. IT people have long used them for authorization, for example in SSH.
Briefly, here is how this complicated yet beautiful math works: you generate a pair of large, mathematically linked numbers on your computer, a public key and a private key. The private key is considered super-secret because it can decrypt what was encrypted with the public key: if you disclose the public key to your friends, they can use it to encrypt any message addressed to you so that only you, as the private key holder, can open and read it. It works the other way around as well: you can sign data with your private key, and anyone who has your public key can verify that the signature was made with your private key, without ever learning the private key itself.
We live in the world of the decentralized Internet where no one can trust anyone. A transaction signed with your private key is sent, together with your public key, to a special place, the depository of unconfirmed transactions, so that any member of the network can verify that it was you who initiated it and not someone else attempting to steal your money.
This mechanism safeguards openness and security of the network. If in the real world the banks are usually the ones responsible for keeping the money safe, in blockchain this function is delegated to math.
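To get a feel for signing and verifying, here is a toy sketch using textbook RSA with deliberately tiny numbers. This is an illustration only: Bitcoin itself uses a different scheme (ECDSA on the secp256k1 curve), real keys are vastly larger, and the constants and function names below are assumptions chosen for the example.

```python
import hashlib

# Toy key pair built from two tiny primes, 61 and 53. Never use sizes like this for real.
N = 61 * 53   # 3233, the public modulus
E = 17        # public exponent (public key is the pair N, E)
D = 2753      # private exponent: (E * D) % ((61 - 1) * (53 - 1)) == 1

def digest_mod_n(message: str) -> int:
    """Hash the message and squeeze the digest into the range of the toy modulus."""
    h = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(h, "big") % N

def sign(message: str) -> int:
    """Only the holder of the private exponent D can produce this number."""
    return pow(digest_mod_n(message), D, N)

def verify(message: str, signature: int) -> bool:
    """Anyone who knows the public pair (N, E) can run this check."""
    return pow(signature, E, N) == digest_mod_n(message)

tx = "Lent to Bob: $500"
signature = sign(tx)
print(verify(tx, signature))                 # True
print(verify(tx, (signature + 1) % N))       # False: a forged signature fails
print(verify("Max owes Bill $100,500", signature))  # almost certainly False
```

The last check is only “almost certainly” false because the toy modulus is so small that two messages can collide; with real key sizes that chance is negligible.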
Your public key is the number of your crypto wallet. It means that you can create a wallet for any cryptocurrency without even going online.
Regular users who are not interested in learning about private keys can always get help from online wallet services. Convenient QR codes were invented for copying long public keys.
As you could see, both Bill’s chalkboard and blockchain consist of transaction history only. They do not keep track of balance in each wallet. If they did, we would have to find additional protective measures.
The wallet holder’s identity is verified by the private key alone. But how would other network members know that I have enough money to buy?
Since we do not track the balance, you must prove it. Therefore, the blockchain transaction includes not only your signature and how much you want to spend but also links to the previous transactions in which you received the corresponding amount of money. That is, if you want to spend 400 dollars, you run through all your income and expenses history. You attach the proof of income where you were given 100 + 250 + 50 dollars to your transaction thereby proving that you have 400 dollars.
Each member of the network will double-check that you have not attached the same income twice and that you haven’t already spent those $300 that Max gave you last week.
In blockchain, these linked earlier transactions are called inputs, and all the recipients of the money are called outputs. The sum of all inputs is rarely exactly the amount you want to transfer, so one of the outputs will most often be you. In other words, a blockchain transaction looks like “I received 3 and 2 BTC; I want to spend 4 BTC out of them and send the remaining 1 BTC back to myself”.
Looking ahead a bit, you can also specify a small commission for your transaction so that miners are more eager to add it to their blocks. In this case, the miner gets some petty cash, and you get a little less change. Mining is discussed in detail below.
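The arithmetic of inputs, outputs, change, and commission fits in a few lines. This is a simplified sketch: the `Output` class and `build_outputs` function are hypothetical names, and amounts are plain integers rather than real Bitcoin units.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Output:
    recipient: str
    amount: int

def build_outputs(input_amounts, recipient, amount, sender, fee=0):
    """Spend the given inputs: pay the recipient, leave the fee, return the change."""
    total_in = sum(input_amounts)
    change = total_in - amount - fee
    if change < 0:
        raise ValueError("inputs do not cover the amount plus the fee")
    outputs = [Output(recipient, amount)]
    if change > 0:
        outputs.append(Output(sender, change))  # the change goes back to the sender
    return outputs

# "I received 3 and 2 BTC, I want to spend 4 and send the remaining 1 back to myself."
print(build_outputs([3, 2], "Max", 4, "me"))
# [Output(recipient='Max', amount=4), Output(recipient='me', amount=1)]
```

The commission is never written down as a separate output: it is whatever is left when the outputs add up to less than the inputs, and the miner collects it.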
The good thing about blockchain is that the inputs do not necessarily have to come from the same wallet. Nothing is checked but the key. If you know the private keys for all the inputs, you can easily attach them to your transaction and pay with that money. As if you were paying in a supermarket with several cards at once.
However, if you lose your private key, if your hard drive dies or laptop gets stolen, your bitcoins will be locked out forever. Nobody can use them anymore as input for new transactions. This amount will be unavailable to the entire world forever as if you had burned a bundle of banknotes. There is no “bank” on the network where you could drop a complaint and get a refund for your lost crypto money. And if there were, then the “bank” would have to create a certain additional amount of new bitcoins.
I mentioned that transactions are added to a special “unconfirmed transactions depository”. Why would we need some kind of intermediate entity if we sign all our transactions? Why not write them directly into the blockchain?
Because the signal from point A to point B always travels with delay. Two transactions can choose two different paths. The transaction initiated earlier can reach the recipient later as it followed a longer way. That is how double-spending occurs. The same amount is sent to two recipients at once. And they do not even know about that! This is not how the regular paper bills work.
For a decentralized network where no one can trust anyone, this problem is particularly acute. How do you make sure that one transaction happened exactly before the other one? Ask the sender to attach the dispatch time to it, right? But remember: you cannot trust anyone, even the sender. Time on different computers will always vary, and there is no way to synchronize it. A copy of the blockchain is stored on every computer on the network, and each participant trusts only their own copy.
So how can you make sure that one transaction was earlier than the other?
The answer is simple: it is impossible. There is no way to confirm the transaction time in a decentralized network. And here comes the third important idea of blockchain invented by Satoshi and called blocks.
Each working computer on the network picks whichever transactions it likes from the common depository. Priority usually goes to those offering the highest commission. The computer collects transactions until their total size reaches the agreed limit. In Bitcoin, this limit on the block size is 1 MB (the SegWit2x proposal aimed to raise it to 2 MB), and in Bitcoin Cash it is 8 MB.
In networks like Ethereum, everything is a little more complicated: the number of transactions per block depends on the computational complexity of the included smart contracts. But the idea remains the same: there is a limit.
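Picking transactions for a block can be sketched as a simple greedy loop. The dictionary fields (`id`, `fee`, `size`) and the “fee per byte” ordering are assumptions for illustration; real miners are free to use any selection strategy they like.

```python
def pick_transactions(pool, max_block_size):
    """Greedy choice: best fee per byte first, skipping whatever no longer fits."""
    chosen, used = [], 0
    for tx in sorted(pool, key=lambda t: t["fee"] / t["size"], reverse=True):
        if used + tx["size"] <= max_block_size:
            chosen.append(tx["id"])
            used += tx["size"]
    return chosen

pool = [
    {"id": "a", "fee": 50, "size": 250},
    {"id": "b", "fee": 10, "size": 500},
    {"id": "c", "fee": 30, "size": 300},
    {"id": "d", "fee": 5, "size": 400},
]
print(pick_transactions(pool, max_block_size=1000))  # ['a', 'c', 'd']
```

Transaction `b` stays in the depository for now: it pays better than `d`, but it no longer fits once `a` and `c` are in.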
The entire blockchain, in fact, is a list of blocks, where each one depends on the previous one. It lets you trace any transaction through the entire history by unwinding the blockchain down to the very first record. This list weighs hundreds of gigabytes and has to be copied to every participating computer. It is downloaded from the nearest computers on the network, as if you were downloading a TV series from torrents. The only difference is that a new episode appears every 10 minutes. However, possessing a full copy is not necessary to simply create new transactions and transfer money.
After collecting transactions from the depository, your computer starts organizing them into the same kind of list Bill had on the board. But it is structured as a tree: the hashes of the records are combined in pairs, the results are paired again, and so on until there is only one hash left, the root of the tree, which is added to the block. I did not find an answer to why it has to be a tree, but I guess it’s just quicker. For details, see the Merkle tree article on the wiki.
The tree structure makes it possible to delete unnecessary (spent) transactions from a block. Let’s say there are two transactions joined by a hash, and one or both are no longer needed. Everything they held has been spent by other transactions, so these old ones can be deleted while their hash is kept; as a result, nothing in the structure gets wrecked. See Section 7, “Reclaiming Disk Space”, in Satoshi’s paper.
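The pairing procedure described above can be sketched like this. It follows two Bitcoin conventions as I understand them (double SHA-256, and duplicating the last hash when a level has an odd count), but it is not byte-compatible with real blocks; the function names are mine.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Double SHA-256 of raw bytes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(leaves):
    """Hash the leaves, then combine hashes in pairs until a single root remains."""
    if not leaves:
        raise ValueError("need at least one transaction")
    level = [sha256d(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()

txs = [b"Bill -> Bob: 2", b"Bob -> Max: 1", b"Max -> Bill: 3"]
print(merkle_root(txs))

# Changing any single transaction changes the root.
print(merkle_root(txs) != merkle_root([b"Bill -> Bob: 200", b"Bob -> Max: 1", b"Max -> Bill: 3"]))  # True
```

One practical benefit of the tree shape: to prove that a transaction is in a block you only need the hashes along its branch, not the whole list.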
Since the current blockchain is already downloaded, our computer knows exactly which block is the last one. All it needs to do is add a link to that block into the header of the new block, hash it all, and tell the other computers on the network “look, I made a new block, let’s add it to our blockchain.”
The others should check that the block is built according to the rules and that we did not sneak unnecessary transactions into it. After that, they add it to their chains. Now that all incoming transactions are verified and the blockchain has grown by a block, everything goes well, right?
Not really. Thousands of computers are working on the network at the same time. As soon as they assemble a new block, they rush to report almost simultaneously that their block was built first. From the previous section, we know that it is impossible to prove who really was first in a decentralized network.
Thus, to add the block to the chain, our computers must solve some complicated problem that would take some of their time.
It’s like in high school: when a class was solving a serious math problem, two answers were handed in at exactly the same moment only on very rare occasions.
For a human, a complicated task is planning a vacation getaway; for a machine, it is finding a specific number (a nonce) to add to the end of the block so that the resulting SHA-256 hash of the entire block starts with 10 zeros. This particular problem must be solved in order to add a block to the Bitcoin network. Requirements for other networks may vary.
So we come closer to the concept of mining that became so popular in recent years.
Bitcoin mining is not some kind of sacred mystery. Mining has nothing to do with digging up new bitcoins somewhere on the Internet. Mining is when thousands of computers around the world buzz away in basements, grinding through millions of numbers per second, trying to find a hash starting with 10 (or even 16) zeros. They do not even need to be on the network for this.
Video cards with their hundreds of parallel cores solve this problem faster than any CPU.
Why exactly 10 zeros? Just because. There’s no point in it. This is what Satoshi offered because this is one of those tasks where there is always a solution. But it certainly cannot be discovered faster than by a long monotonous search of options.
The complexity of mining directly depends on the size of the network, its total capacity. If you create your own blockchain and run it yourself at home on two laptops, then the task should be simpler. You can generate, for example, the hash starting with one zero or the sum of the even bits to be equal to the sum of the odd ones.
One computer will spend decades searching for a hash that starts with 10 zeros. But if you combine thousands of computers into a huge network and search simultaneously, then according to the theory of probability this task will be solved in 10 minutes on average. This is the time window in which new blocks are added to the blockchain. Every 8-12 minutes someone on earth finds the required hash and gets the privilege of announcing the discovery, which settles the question of who was first.
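The scaling behind this is easy to estimate. The sketch below assumes that “zeros” means leading hexadecimal digits of the hash, so each extra zero makes a lucky hash 16 times rarer; the hash rates in the example are made-up round numbers, not measurements.

```python
def expected_attempts(leading_hex_zeros: int) -> int:
    """Average number of hashes to try before one starts with that many hex zeros."""
    return 16 ** leading_hex_zeros

def expected_seconds(leading_hex_zeros: int, hashes_per_second: float, machines: int = 1) -> float:
    """Average search time when the work is shared evenly between machines."""
    return expected_attempts(leading_hex_zeros) / (hashes_per_second * machines)

print(expected_attempts(2))    # 256
print(expected_attempts(10))   # 1099511627776

# Hypothetical rig doing one million hashes per second, alone and in a crowd of 1000.
alone = expected_seconds(10, 1_000_000)
crowd = expected_seconds(10, 1_000_000, machines=1000)
print(alone / crowd)
```

A thousand machines find the answer about a thousand times sooner on average, which is why the network keeps raising the difficulty as it grows.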
For finding the correct answer, the computer receives 12.5 BTC (as of 2017). This is the reward generated by the bitcoin system “out of thin air”. The amount is halved every four years. Technically, it means that each miner always adds one more transaction to his block: “create 12.5 BTC and send them to my wallet.” When you hear “the number of bitcoins in the world is limited to 21 million, and 16 million have already been mined”, those are exactly the coins generated as block rewards.
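The 21 million cap falls out of the shrinking reward. The sketch below relies on two facts that are not stated in this post: the reward started at 50 BTC, and it is halved every 210,000 blocks (roughly four years at one block per ten minutes). Amounts are counted in satoshis, the smallest unit (1 BTC = 100,000,000 satoshis).

```python
SATOSHIS_PER_BTC = 100_000_000
HALVING_INTERVAL = 210_000                 # blocks between halvings
INITIAL_REWARD = 50 * SATOSHIS_PER_BTC     # reward for the very first blocks

def reward_after(halvings: int) -> int:
    """Block reward in satoshis after the given number of halvings."""
    return INITIAL_REWARD >> halvings

def total_supply() -> int:
    """Add up all block rewards until the reward rounds down to zero."""
    total, reward = 0, INITIAL_REWARD
    while reward > 0:
        total += reward * HALVING_INTERVAL
        reward //= 2
    return total

print(reward_after(2) / SATOSHIS_PER_BTC)   # 12.5
print(total_supply() / SATOSHIS_PER_BTC)    # just under 21 million
```

Two halvings from 50 BTC give exactly the 12.5 BTC mentioned above.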
Any blockchain exists only while its miners exist.
It is the miners who add the emerging transactions to the blockchain. So if someone tells you that he or she “will make a blockchain for ***”, the first question they should answer is who will be mining there and why. The most common correct answer is “everyone will be mining because we offer coins for that and the miners’ wallets will grow”. But it does not apply to all projects. For example, if the Ministry of Health creates its own closed blockchain for medical personnel, who will mine it? Therapists on the weekends?
But what benefit will miners have afterward, when the rewards run out or become too small?
According to the Creator’s idea, by that time people will believe in the reality of bitcoin, and mining will pay off through the commissions included in each transaction. In 2012, all commissions were zero, and miners worked only for the block rewards. Today, a transaction with a zero commission can hang in the pool for several hours, because there is competition there too, and many people are ready to pay for speed.
That means that the essence of mining lies in resolving meaningless tasks. Would it be so impossible to redirect all this enormous power to something more useful, e.g. search for a cure for cancer?
The essence of mining is to solve any computational problem. This task should be simple enough that the network participants have a stable probability of finding the answer. Otherwise, it will take the eternity to confirm those transactions. Imagine that at the checkout in the store you need to wait every time for half an hour until the bank confirms your transaction. Nobody would work with such a bank.
But at the same time, the task should be complex enough that the users of the network do not all find the answer immediately and at the same time. Because if they do, they will announce a lot of parallel blocks with the same transactions to the network. This, in turn, raises the chance of the “double spending” we spoke about before. Or even worse: a split of the entire blockchain into several branches, where no one can any longer tell confirmed transactions from unconfirmed ones.
If the reward of 12.5 BTC is awarded only once every 10 minutes and only to the one who found the block, does it mean I’ll need to burn my video cards for years hoping that one day I will win $40,000 (at the current rate)?
This is how bitcoin works. But it was not always like that. Previously, the networks were smaller, the difficulty was lower, and therefore, the probability of finding a hash for a new block single-handedly was higher than now. But also bitcoins themselves were not so expensive.
Today, nobody mines bitcoins individually anymore. Instead, the participants join special groups, mining pools, where every miner tries to find the right hash. If someone in the group succeeds, the entire reward is split between the participants, depending on the size of their contribution to the common work. It turns out that you work away and receive a penny every week from the total share.
On the other hand, individual mining is quite possible on some other networks. Until recently it was easy to mine Ethereum, where blocks are added every 10 seconds. The reward for a block is much lower, but the probability of earning a little money is higher.
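A proportional split of a pool’s reward could be sketched as follows. The miner names and contribution numbers are invented, and real pools use more elaborate payout schemes; this only shows the “depending on the size of their contribution” idea.

```python
def split_reward(reward: float, contributions: dict) -> dict:
    """Divide the block reward in proportion to each miner's share of the work."""
    total = sum(contributions.values())
    return {miner: reward * work / total for miner, work in contributions.items()}

# Hypothetical pool: work measured in arbitrary units.
shares = split_reward(12.5, {"alice": 60, "bob": 30, "carol": 10})
print(shares)  # {'alice': 7.5, 'bob': 3.75, 'carol': 1.25}
```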
So we just keep burning thousands of video cards, and there is no way out?
Yes, but there are some ideas. What I described here is classic mining, called Proof-of-Work: each machine proves that it worked for the benefit of the entire network by solving meaningless problems with a given probability.
Some people have started building blockchains on other principles of mining. Today, the second most popular concept is Proof-of-Stake (proof of ownership). In this kind of mining, the more coins a network participant holds, the better his chance to insert his block into the blockchain.
Anyone is free to come up with other types of mining. As suggested above, all computers on the network could collaborate on cancer research, but then you have to figure out how exactly to record their contribution. After all, I could claim that I took part while my video card was turned off. How do you measure each participant’s contribution? If you figure it out, go ahead and launch your CancerCoin; the media will be at your doorstep.
Imagine a situation when, despite all our theory of probability, two miners still found the right answer at the same time and sent two absolutely valid blocks down the network. These blocks are guaranteed to be different: even if the miners miraculously picked the same transactions from the pool, built absolutely identical trees, and guessed the same random number (nonce), their hashes would still be different, since each of them puts his own wallet number in for the reward.
Now we have two valid blocks and again there is a problem of whom to consider the winner. How will the network behave in this case?
The blockchain algorithm specifies that the network participants simply accept the first correct answer they receive. After that, they keep playing by the usual rules. Both miners will collect their rewards, and all the others will start mining on top of the block each of them personally received first, discarding all other valid alternatives. That, in turn, creates two absolutely correct blockchains on the same network. What a paradox!
This is a regular situation where probability theory comes in handy again. The network will function in this bifurcated state until one of the miners discovers the next block linked to one of these chains. As soon as this block is inserted, that chain becomes longer, and one of the blockchain network agreements comes into force: under any circumstances, the longest chain of blocks is accepted as the only true one for the entire network.
The shorter chain, despite all its correctness, is rejected by all network members. Its transactions return to the pool (unless they have been confirmed in the other chain), and their processing starts from scratch. The miner loses his reward because his block no longer exists.
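The longest-chain rule and the fate of the losing branch can be sketched like this. Blocks here are plain dictionaries with made-up ids, and “longest” is measured by block count as in the text (real Bitcoin nodes compare the total work behind each chain, which is usually the same thing).

```python
def choose_chain(chains):
    """Longest-chain rule: accept the chain with the most blocks."""
    return max(chains, key=len)

def orphaned_transactions(winner, loser):
    """Transactions from the rejected chain that are not confirmed in the winner."""
    confirmed = {tx for block in winner for tx in block["txs"]}
    return [tx for block in loser for tx in block["txs"] if tx not in confirmed]

common = [{"id": "A", "txs": ["t1"]}]
fork_x = common + [{"id": "B1", "txs": ["t2", "t3"]}]
fork_y = common + [{"id": "B2", "txs": ["t2", "t4"]}, {"id": "C2", "txs": ["t5"]}]

winner = choose_chain([fork_x, fork_y])
print([block["id"] for block in winner])      # ['A', 'B2', 'C2']
print(orphaned_transactions(winner, fork_x))  # ['t3']
```

Transaction `t3` goes back to the pool and waits for another block, while `t2` is fine because the winning chain confirmed it too.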
With the growth of the network, such coincidences from “very unlikely” go into the category of “well, sometimes it happens.” Old-timers remember the cases when a perfect chain of four blocks had been discarded with no regrets.
Three rules for the insecure tail of the blockchain were introduced to address the above problem:
1. Mining rewards can be spent only after 20 more confirmed blocks have followed the block in which they were received. For bitcoin, that is about three hours.
2. If you received bitcoins, they can be used as an input in new transactions only after 1-5 blocks.
3. Rules 1 and 2 are only written in the settings of each client. Nobody enforces them. But the rule of the longest chain will still destroy all your transactions if you try to deceive the system by ignoring them.
Now that you’ve learned everything about mining, blockchain, and the rule of the longest chain, you might have a question: would it be possible to outrun the blockchain by building the longest chain on my own, thereby legalizing all my previous fake transactions?
Suppose you have the most powerful computer on earth! The Google and Amazon data centers combined are at your disposal and collectively aim at calculating the longest blockchain on the network.
You cannot calculate several blocks of the chain in advance, because each next block depends on the previous one. So you decide to compute each block as fast as possible on your huge data centers, faster than all other participants together extend the main blockchain. Would it be possible to outrun them? Probably yes.
If your computing power exceeds 50% of the power of all network participants, then you are more likely than the rest of the network combined to find each next block, and sooner or later you can build a longer chain than all other members together. This would be (in theory) a possible way to deceive the blockchain by building a longer chain of transactions. Then all transactions of the real network would be considered invalid, you would collect your pot of gold and lay yet another cornerstone in the history of cryptocurrency, titled “the blockchain split.” It did happen once in the history of Ethereum due to a bug in the code.
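How the attacker’s odds depend on his share of the power can be estimated with the catch-up formula from Section 11 of Satoshi’s paper: an attacker with share q who is z blocks behind eventually catches up with probability (q/p)^z, where p = 1 - q, and with certainty once q reaches p. The sketch below implements only this part; the paper’s full calculation also accounts for how far the attacker may have progressed while waiting. The function name is my own.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Chance that an attacker ever catches up from a deficit of blocks_behind blocks."""
    q = attacker_share   # attacker's fraction of the total computing power
    p = 1.0 - q          # the honest network's fraction
    if q >= p:
        return 1.0       # at half the power or more, catching up is only a matter of time
    return (q / p) ** blocks_behind

for share in (0.1, 0.3, 0.45, 0.5):
    print(share, catch_up_probability(share, blocks_behind=6))
```

Below half of the power, the chance shrinks exponentially with every block added on top, which is why recipients wait for several confirmations.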
But in reality, no data center is comparable in power to all the computers in the world. One and a half billion Chinese with ASICs, another half a billion Indians: that is a HUGE computing power. No one in the world can compete with them alone, not even Google.
That would be like running out the door and convincing every person in the street that 1 dollar is now worth 1 ruble, indoctrinating the entire world before the media debunk you. If you accomplished all that, you could even cause the global economy to collapse. In theory, it is possible, right? But in practice, for some reason [wink], nobody has ever succeeded that far.
The entire blockchain concept rests on this probability. The more miners are involved in a network, the more security and trust it has. Therefore, when another large mining farm shuts down in China, the cryptocurrency exchange rate collapses. Everyone is worried that somewhere in the world there is an evil genius who has already gathered a mining pool with a whopping ~49% of the capacity.
In fact, it already happened several times back in 2014, when one of the mining pools temporarily became more powerful than the rest of the network. Luckily, no manipulations were reported.
Blockchain is not a strictly defined set of algorithms. It is a robust model for building a forgery-proof decentralized network where no one can trust anyone. I am pretty sure that while you were reading this text, you kept thinking about your own list of possible blockchain applications and how this very idea could be used better or in a more socially responsible way. It means that you understand blockchain. Congratulations.
Some folks worldwide also understood it and decided to improve or adapt it to certain specific needs. Cryptocurrencies aren’t everything the world wants even though they get very diverse too. Below is a short list of some ideas and projects gaining a certain popularity due to a rethinking of the concept of blockchain.