The Kingstons agreed to dig into the details of cryptocurrency mining that Emily had asked about. Although the issues are more technical than in any of the family’s previous conversations, the conversational style keeps the text short and pointed, because anyone can ask questions along the way. At the end of the day, however, they did not get any further than hashes, hash functions and cryptography.
Greg: Let’s consider the mining question Emily raised at the very beginning. Remember we were talking about a decreasing reward for miners? It turns out that the Bitcoin algorithm also has a changing level of difficulty for mining, which can go either up or down, depending on the number of miners in the game and how actively they are mining.
Joy: That’s right. Please don’t feel overwhelmed, Emily, because this issue is directly related to what mining is. We’ll get to your questions.
Emily: Thank you for telling me that. Meanwhile, I’ll keep asking questions along the way if you don’t mind.
Greg: Of course not! Let me begin with a concept called the “difficulty epoch,” as discussed in a CoinDesk article. Each epoch corresponds to 2,016 blocks of transactions, which in turn correspond to roughly two weeks. When the current difficulty epoch ends, the difficulty level is adjusted up or down, based on the number of miners and their activity over the last two weeks.
Joy: Yeah, the goal of the algorithm is for each block to take ten minutes to verify. If blocks in the last two weeks took less than ten minutes on average, the difficulty level goes up; otherwise it goes down.
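For readers who want to see the rule Joy describes in action, here is a minimal Python sketch. It is a simplified illustration, not Bitcoin's actual code: the function name `retarget` and the sample numbers are made up for this example, and the cap of a factor of four per adjustment is an assumption based on how the Bitcoin rule is commonly described.

```python
TARGET_BLOCK_MINUTES = 10      # goal: one block every ten minutes
BLOCKS_PER_EPOCH = 2016        # blocks between difficulty updates
EXPECTED_MINUTES = TARGET_BLOCK_MINUTES * BLOCKS_PER_EPOCH  # 20,160 minutes = two weeks


def retarget(old_difficulty: float, actual_minutes: float) -> float:
    """Return the new difficulty after an epoch that took `actual_minutes`."""
    # Assumption: a single adjustment is capped at a factor of four either way.
    clamped = min(max(actual_minutes, EXPECTED_MINUTES / 4), EXPECTED_MINUTES * 4)
    # Faster than expected -> ratio above 1 -> difficulty goes up.
    return old_difficulty * EXPECTED_MINUTES / clamped


# Blocks averaged 8 minutes: mining was too fast, so difficulty rises.
print(retarget(100.0, 8 * BLOCKS_PER_EPOCH))
# Blocks averaged 12.5 minutes: mining was too slow, so difficulty falls.
print(retarget(100.0, 12.5 * BLOCKS_PER_EPOCH))
```

The difficulty values here are arbitrary units; only the direction and the ratio of the change matter.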
Lily: So it looks like not only will the entire supply of Bitcoin reach 21 million coins in 2140, but the pace of releasing new bitcoins is also set by the algorithm at 10 minutes per block.
Joy: Yeah, the whole process is transparent. You can check it with the numbers. 10 minutes per block means 6 blocks in one hour, 144 blocks in one day of 24 hours, and 2,016 blocks in two weeks or 14 days, because 2,016 = 144 x 14.
Greg: I don’t mean to confuse you, but there is another milestone, related to the decreasing coin reward for miners. Remember we learned yesterday that every four years the reward decreases by half? Those four years correspond to 210,000 blocks created or confirmed.
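The arithmetic Joy and Greg walk through can be checked in a few lines of Python. This is only a back-of-the-envelope sketch; the variable names are made up for the example, and it assumes every block takes exactly ten minutes.

```python
MINUTES_PER_BLOCK = 10

blocks_per_hour = 60 // MINUTES_PER_BLOCK        # 6
blocks_per_day = blocks_per_hour * 24            # 144
blocks_per_epoch = blocks_per_day * 14           # 2,016 blocks in two weeks

HALVING_INTERVAL = 210_000                       # blocks between reward halvings
days_per_halving = HALVING_INTERVAL / blocks_per_day
years_per_halving = days_per_halving / 365.25

print(blocks_per_hour, blocks_per_day, blocks_per_epoch)
print(round(years_per_halving, 2), "years between halvings")
```

The result comes out just under four years, which is why the halving is usually described as happening "every four years."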
Emily: How do the algorithms control levels of difficulty?
Greg: The key is to change the number of leading zeros required in the target hash.
Emily: Wow, what’s a hash function?
Joy: We talked about hash functions before when we were discussing NFTs. A hash function is best described as a “fingerprint machine.” In math terms, a hash function is simply an algorithm that accepts inputs of any kind (data or messages) and produces uniform-sized “fingerprints,” which can be called “hashes,” “hash values,” “hash sums” or “message digests.” I can show you a static picture here, but Haseeb Qureshi does a better job in a blog post with a really neat animated picture.
Greg: Haseeb Qureshi did a good job, but this video is by far the best I’ve seen after reading, searching and watching many other pieces and sources. It explains a whole bunch of concepts like block, blockchain, hash, hash function, nonce, distributed network, tokens and Coinbase, and nicely puts them together with a little demonstration that clearly and visually gets the ideas across. It’s so intuitive that people as young as Jason can understand it, because everything is right there in front of your eyes. It’s about 18 minutes long, but I guarantee that it won’t waste a single minute of your time.
Emily: Wow! I’ve never seen you so excited before. We sure will watch it later.
Greg: Just a bit of background information for the video. At the beginning of the video you will see the term “SHA-256.” That stands for “Secure Hash Algorithm,” and the 256 means it produces hashes that are 256 bits long. Bear in mind that when it comes to computer terminology, we don’t count “words” but only “bits” and “bytes.” One byte is 8 bits, so 256 bits is just 32 bytes.
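To make the "fingerprint machine" and the bit-and-byte arithmetic concrete, here is a small sketch using Python's standard `hashlib` module. The sample messages are arbitrary choices for this example. Whatever the size of the input, the fingerprint is always 32 bytes, which is usually written as 64 hexadecimal characters.

```python
import hashlib

# Inputs of very different sizes: empty, tiny, short, and one million bytes.
messages = [b"", b"Hi", b"The quick brown fox", b"x" * 1_000_000]

digests = []
for message in messages:
    digest = hashlib.sha256(message).digest()    # the raw 32-byte fingerprint
    digests.append(digest)
    print(len(message), "bytes in ->", len(digest), "bytes out:", digest.hex()[:16] + "...")

# 32 bytes * 8 bits per byte = 256 bits, hence the name SHA-256.
print(hashlib.sha256().digest_size * 8, "bits")
```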
Emily: Why should we care about these details?
Greg: You don’t have to. We could sit here talking about bits, bytes and hexadecimal numbers all day. But the important thing to keep in mind is that SHA-256 is a hash function, or more accurately a cryptographic hash function: the machine that makes hashes or “fingerprints.” There are other hash functions, but SHA-256 is the most popular one. If you must know one hash function, SHA-256 would be it. Bitcoin uses SHA-256, for example. I learned from this article by n-able.com that “The US government requires its agencies to protect certain sensitive information using SHA-256.”
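Since Bitcoin uses SHA-256, Greg's earlier remark about leading zeros can be turned into a toy mining loop. This is a sketch under simplifying assumptions, not real Bitcoin mining: the function name `mine` and the sample transaction text are invented for the example, and the real system hashes a structured block header and compares the result against a numeric target rather than counting zero characters. The idea is the same, though: keep changing a number (the "nonce") until the hash happens to start with enough zeros.

```python
import hashlib
from typing import Tuple


def mine(block_data: str, leading_zeros: int) -> Tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hex hash starts with enough zeros."""
    prefix = "0" * leading_zeros
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Each extra zero makes a lucky hash about 16 times rarer,
# so the number of attempts needed tends to grow quickly.
for zeros in (1, 2, 3, 4):
    nonce, digest = mine("Alice pays Bob 1 coin", zeros)
    print(zeros, "zeros -> nonce", nonce, "hash", digest[:12] + "...")
```

Raising or lowering the required number of zeros is one simple way to picture how the difficulty level can be tuned up or down.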
Kimberly: What makes SHA-256 so well liked?
Greg: That article by n-able.com again offers a good summary: (1) It is a one-way function from input data to hash, so it is almost impossible to figure out the input data even if you know its hash, because “A brute-force attack would need to make 2^256 attempts to generate the initial data.” (2) Uniqueness of the hash: with “2^256 possible hash values (more than the number of atoms in the known universe), the likelihood of two being the same is infinitesimally, unimaginably small.” (3) The avalanche effect: “a minor change to the original data alters the hash value so much that it’s not apparent the new hash value is derived from similar data; this is known as the avalanche effect.” These are the major ones; I will add more later when we compare hash functions with encryption.
Joy: Now that you mention encryption, I remember reading somewhere that cryptocurrency needs both hashing and encryption.
Greg: That’s true. The nice thing, though, is that encryption and decryption, hash functions, and digital signature algorithms are all parts of cryptography. This is why Bitcoin and the others are all called “cryptocurrencies”: they all use the same foundational technologies of cryptography.
Emily: What does the word “cryptography” mean?
Greg: Good question. This article from Investopedia offers a good explanation. I have it downloaded to my phone. Here it is: “The word ‘crypto’ literally means concealed or secret. ‘Cryptography’ means ‘secret writing’ — the ability to exchange messages that can only be read by the intended recipient.”
Emily: Sounds like cryptography is all about security and confidentiality of information or data.
Greg: More accurately, it’s a decentralized way of keeping information, data and transactions secure, a way that does not require a third party or a central authority. Furthermore, the technologies provide not only security but also efficiency. If you watch the video I recommended, you will see that we can put in all the stuff from the Library of Congress and the hash function still produces one hash digest that is only 256 bits long!
Lily: What about encryption and decryption? What are they and why do we need them?
Greg: They serve different functions for cryptocurrency. So far we have been talking about cryptocurrency mining, which relies on hash functions. But when we get to the transactions between digital wallets, encryption and decryption become the key tools.
Kimberly: How do hash functions and encryption differ?
Greg: That’s a hot topic with much discussion surrounding it. The first key difference is that encryption is a two-way process: you first encrypt the plaintext data to make it secret or unreadable, producing what is called “ciphertext,” and then you decrypt it to make it readable again.
Joy: This is like sending and receiving a telegram in the old days.
Greg: Exactly. Say someone’s grandma is seriously sick. She’d rush to the post office and write down the words “Grandma sick, come home!” to be sent to her brother. The post office worker would first translate the words (encrypt them, that is) into a bunch of codes, and then send them to her brother. Once the telegram arrives at the post office near where her brother lives, someone there will translate the codes (decrypt them this time) back into plain English.
Kimberly: How is the hash function different?
Greg: A hash function always works one way, meaning there is no decryption. Once the input data, messages or transactions have been hashed, we don’t want anyone to be able to turn the hash back into something readable. For a hash function, such a reverse process, if it succeeded, would be a disaster, because it would mean the hash function had failed to do its job.
Kimberly: But why is that a good thing? It’s like sending out a telegram that no one can read and understand, right?
Greg: I thought about that for several days by asking myself the same question as you just did: What if someone needs to read and understand the original input, not just the hash? I searched online using that question and could not find any answer.
Kimberly: So do you have an answer now?
Greg: Suddenly it came to me yesterday that it’s not the right question to ask, because it fails to take the purpose of blockchain data into consideration. When people put transaction records in a blockchain, they want two things. One, to keep those records in the chain permanently and unchanged forever. Two, to make the authenticity of all the records in each block easy and quick to prove. For those purposes a one-way hash function works out perfectly.
Kimberly: Could you elaborate on that?
Greg: It is directly related to a very nice property of the hash function. Not only is it a fingerprint machine, but the machine is extremely efficient. Remember I said earlier that even if we put all the books and journals from the Library of Congress into one block (I know that’s impossible given the 1 MB per block data limit set by Satoshi), the SHA-256 hash function would still produce just one hash that is 256 bits long?
Kimberly: Yeah, why is that relevant here?
Greg: Well, let’s keep using the Library of Congress example, and still assume the entire collection fits into one block. With the hash function working properly, even if I changed one comma in one book to a period (and there were 167 million items when I checked last night), the hash would be completely different.
Joy: Yeah, even a tiny change in the input will lead to a completely different hash. This is called the “avalanche effect.”
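The comma-to-period experiment is easy to try at a much smaller scale than the Library of Congress. The following sketch uses a made-up sentence and a helper named `sha256_bits` (invented for this example) to count how many of the 256 output bits flip when a single character changes.

```python
import hashlib


def sha256_bits(text: str) -> str:
    """Return the SHA-256 hash of `text` as a string of 256 zeros and ones."""
    digest = hashlib.sha256(text.encode()).digest()
    return "".join(f"{byte:08b}" for byte in digest)


original = "It was a dark, stormy night"
altered = "It was a dark. stormy night"   # one comma changed to a period

bits_a = sha256_bits(original)
bits_b = sha256_bits(altered)
changed = sum(a != b for a, b in zip(bits_a, bits_b))

print(hashlib.sha256(original.encode()).hexdigest())
print(hashlib.sha256(altered.encode()).hexdigest())
print(changed, "of 256 bits differ")   # typically close to half
```

With a well-behaved hash function, roughly half of the bits change, so the two fingerprints look unrelated even though the inputs are nearly identical.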
Kimberly: Those are impressive, but I still don’t see their relevance to our discussion here.
Greg: Think of it: How would you make sure that none of the records in a block has been tampered with? By checking all the records one by one, or by just looking at the hash to see if it has changed?
Kimberly: Oh, I see! A hash function makes it very easy to find out any changes made to any record in a block. I can buy that, but what about encryption? Does it do the same thing as a hash function?
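Kimberly's insight can be sketched as follows. The records and the helper name `block_fingerprint` are hypothetical, and real blockchains organize their hashes in a more elaborate way, but the principle is the one Greg describes: store one hash for the whole block, then recompute and compare it later instead of checking every record by hand.

```python
import hashlib
from typing import List


def block_fingerprint(records: List[str]) -> str:
    """Hash all the records in a block together into one fingerprint."""
    joined = "\n".join(records)
    return hashlib.sha256(joined.encode()).hexdigest()


records = ["Alice pays Bob 5", "Bob pays Carol 2", "Carol pays Dave 1"]
stored_hash = block_fingerprint(records)      # saved when the block is created

tampered = list(records)
tampered[1] = "Bob pays Carol 20"             # someone quietly edits one record

print(block_fingerprint(records) == stored_hash)    # True: nothing changed
print(block_fingerprint(tampered) == stored_hash)   # False: tampering detected
```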
Greg: Smart question! If the answer were “yes,” we could use encryption and still get what we want. Unfortunately, the answer is “no.” You see, the output from encryption varies, depending on the size of the input. That means longer inputs or “plaintexts” will have longer outputs or “ciphertexts.” This is just like a telegram: the more words you send, the longer the telegram and the more money it costs you.
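The contrast Greg draws can be seen with a deliberately simple cipher. The sketch below uses a one-time pad (XOR with a random key as long as the message), chosen only because it fits in a few lines of standard-library Python; it is not the encryption that cryptocurrency wallets use, and the helper names are invented for the example. It shows both points from the conversation: encryption is two-way, and its output grows with the input, while the hash stays at 32 bytes.

```python
import hashlib
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of `data` with the matching byte of `key`."""
    return bytes(d ^ k for d, k in zip(data, key))


def compare_sizes(plaintext: bytes):
    """Return (ciphertext length, hash length, whether decryption worked)."""
    key = secrets.token_bytes(len(plaintext))     # random key, same length as the message
    ciphertext = xor_bytes(plaintext, key)        # encrypt
    recovered = xor_bytes(ciphertext, key)        # decrypt with the same key
    fingerprint = hashlib.sha256(plaintext).digest()
    return len(ciphertext), len(fingerprint), recovered == plaintext


short_message = b"Come home!"
long_message = b"Grandma sick, come home! Take the first train tomorrow morning."

print(len(short_message), compare_sizes(short_message))
print(len(long_message), compare_sizes(long_message))
```

The ciphertext is exactly as long as each message, whereas the fingerprint length never changes.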
Joy: Very interesting. The other reason we can use hash functions for blockchain is that all hash functions are deterministic, meaning that the same input always produces the same hash digest. Therefore, while it is virtually impossible for anyone else to reproduce the input from the hash, whoever owns the input record knows perfectly well what the input is. She does not have to do the “reverse engineering” that a hacker would have to.
Emily: You say “virtually impossible.” Is it still possible to find the input once you have the hash?
Joy: It is possible but highly unlikely. The way to do it is through the so-called “brute-force attack,” meaning running through all the possible combinations until hitting the target. This requires enormous resources with a very low chance of success. Remember the reason we were given earlier: a hacker would have to make 2^256 attempts to generate the initial data.
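To get a feel for how large 2^256 is, the following sketch estimates how long a brute-force search would take. The guessing speed is a purely hypothetical assumption picked for illustration (a billion billion guesses per second); the conclusion does not depend much on the exact figure.

```python
ATTEMPTS = 2 ** 256                       # number of possible SHA-256 outputs
GUESSES_PER_SECOND = 10 ** 18             # hypothetical attacker speed (an assumption)
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

years = ATTEMPTS // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)

print(ATTEMPTS)
print(len(str(ATTEMPTS)), "decimal digits")          # 78
print(f"roughly 10^{len(str(years)) - 1} years")     # roughly 10^51 years
```

Even at that imagined speed, the search would take vastly longer than any practical time span, which is what "virtually impossible" means here.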
Lily: I am thinking of passwords that we use every day. They are usually not very long, so which function is best for protecting passwords, hash function or encryption?
Greg: Good question. The answer is still the hash function, but for a different reason. Can you guess what it is? It’s because being able to decrypt stored passwords (to make the unreadable text readable again) is a bad idea that is asking for trouble. This article correctly points out that “it’s more secure to store the hash values of passwords instead. When a user enters a password, the hash value is calculated and then compared with the table. If it matches one of the saved hashes, it’s a valid password and the user can be permitted access.”
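The scheme in the quoted article can be sketched in Python as below. Two details go beyond what the article says and are assumptions of this example: each password is mixed with a random "salt," and the hash is computed with the deliberately slow PBKDF2 function from the standard library instead of a single SHA-256 pass. Both are common precautions because short passwords are easy to guess. The function names and the iteration count are illustrative only.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000   # assumed work factor for this sketch

# The "table" stores a salt and a hash per user, never the password itself.
table = {}


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte hash from the password and salt using PBKDF2 with SHA-256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


def register(user: str, password: str) -> None:
    salt = secrets.token_bytes(16)
    table[user] = (salt, hash_password(password, salt))


def login(user: str, password: str) -> bool:
    if user not in table:
        return False
    salt, stored = table[user]
    # Recompute the hash from what the user typed and compare with the stored one.
    return hmac.compare_digest(stored, hash_password(password, salt))


register("lily", "correct horse battery staple")
print(login("lily", "correct horse battery staple"))   # True
print(login("lily", "1234"))                           # False
```

Because hashing is deterministic, the right password always reproduces the stored hash, and because it is one-way, someone who steals the table still does not see the passwords themselves.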
Joy: Speaking of passwords, I’ve read an interesting story that is almost funny: Turns out if you limit the passwords to a four-digit number, the choices made by many people are highly predictable. I still remember that the top choice is “1234,” followed by “1111” and “0000.” Those made the “top three” list.
Lily: So is encryption any good at all for cryptocurrency? It sounds like hash functions are all we talk about.
Greg: That’s a question for another day.