Living in a state of accord.

Ethereum Merge Local Testnet Demo

The merge is at the point where real interop testing can begin with testnets spun up involving multiple clients. This video shows a testnet running locally with four nodes:

  • Teku & Geth
  • Teku & Besu
  • Lighthouse & Geth
  • Lighthouse & Besu

Each node has 1/4 of the validators and the network transitions from the initial phase0 to Altair and then the merge, after which the two chains merge into one. The video also shows sending transactions with MetaMask post-merge and how the user experience is largely unchanged. Ethereum keeps on being Ethereum but now with PoS instead of PoW.

What Happens If Beacon Chain Consensus Fails?

Over on Twitter, there was an interesting discussion about the importance of client diversity for the beacon chain. As part of that cyber_hokie asks:

cyber_hokie – @cyber_hokie: As the majority client, isn’t it more likely that issues with Prysm would cause more severe financial penalties for minority clients voting on an alternative chain under accountable safety given Prysm validators likely hold the 2/3 threshold?

It’s important to note up front that this isn’t a Prysm specific issue. This isn’t a criticism of Prysm at all. All the major beacon chain clients are great – the issue here is any one client getting too large a share of validators. How much is too large? In an ideal world every client would have less than 1/3rd of validators. Then a bug in a single client doesn’t prevent the chain from continuing to finalise. More than 2/3rds majority is a really big problem because of the potential for those clients to finalise on an incorrect fork.
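To make those thresholds concrete, here is a minimal sketch that classifies what a consensus bug in a single client could do, given that client's share of validators. The function name and the labels are my own illustration, and the handling of the exact 1/3 and 2/3 boundaries is deliberately conservative rather than a precise statement of the finality rules.

```python
from fractions import Fraction

ONE_THIRD = Fraction(1, 3)
TWO_THIRDS = Fraction(2, 3)


def client_share_risk(share: Fraction) -> str:
    """Describe the impact of a consensus bug in a client holding `share` of validators.

    Illustrative only: boundary cases are treated conservatively.
    """
    if not 0 <= share <= 1:
        raise ValueError("share must be between 0 and 1")
    if share < ONE_THIRD:
        # The remaining clients still hold more than 2/3rds and can finalise.
        return "safe: chain keeps finalising"
    if share < TWO_THIRDS:
        # Neither the buggy client nor the rest can reach 2/3rds alone.
        return "stall: chain stops finalising"
    # The buggy client can finalise its own, incorrect fork.
    return "danger: incorrect fork can finalise"


# Hypothetical client shares, not real network figures.
for name, share in [("A", Fraction(1, 4)), ("B", Fraction(1, 2)), ("C", Fraction(7, 10))]:
    print(f"client {name} with {float(share):.0%} of validators -> {client_share_risk(share)}")
```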

As Dankrad explains, if a client with 2/3rds of validators has a consensus-affecting bug, it will fork off to its own chain and finalise that chain. That becomes an unrecoverable situation for those validators. If they fix the bug and switch back to the correct chain they will be slashed for creating surround votes – the correct chain’s justified checkpoint is earlier than the incorrect one. The only option for them is to send a voluntary exit and watch their balance drain until it takes effect. Given 2/3rds of validators are trying to exit, the exit queue is going to be extremely long and thus the wait will be costly for majority client validators.
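The surround vote problem is easier to see with numbers. The sketch below uses a simplified vote with just a source (justified) epoch and a target epoch, and applies the surround condition from the consensus spec: one vote surrounds another if its source is earlier and its target is later. The Vote class, function names and epoch numbers are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """Simplified attestation: only the source and target epochs matter here."""
    source_epoch: int
    target_epoch: int


def surrounds(a: Vote, b: Vote) -> bool:
    """True if vote `a` surrounds vote `b`."""
    return a.source_epoch < b.source_epoch and b.target_epoch < a.target_epoch


def is_surround_slashable(first: Vote, second: Vote) -> bool:
    """A pair of votes is slashable if either one surrounds the other."""
    return surrounds(first, second) or surrounds(second, first)


# Vote made on the incorrect chain, which had already justified epoch 10.
vote_on_incorrect_chain = Vote(source_epoch=10, target_epoch=11)
# After switching back, the correct chain's justified checkpoint is still epoch 5.
vote_on_correct_chain = Vote(source_epoch=5, target_epoch=12)

print(is_surround_slashable(vote_on_incorrect_chain, vote_on_correct_chain))
```

The second vote starts from an earlier source and ends at a later target, so it surrounds the first one, which is exactly the situation a majority client validator would be in after switching back.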

Which brings us to cyber_hokie’s question. If 2/3rds of validators are on the incorrect chain and they can’t switch, why wouldn’t we just accept that chain as the canonical one? Minority client users would then take the hit for inactivity on the majority chain but could switch over. Sounds simple and obvious right? Sadly, it’s not.

Firstly, again as Dankrad points out, we’ve been very clear that it’s bad for the network for clients to have more than 1/3rd share and dangerous for validators to centralise on a client or provider that has such a large share. Most validator penalties ramp up as more validators have the same issue, so if you use a majority client or a majority cloud provider or a majority staking service you run the risk of incurring large penalties if they have an issue, due to it affecting so many validators at once. It would be extremely unfair to validators who heeded these warnings to then be penalised for doing the right thing.

Secondly, if you remember back to hard truths for ETH stakers, Ethereum doesn’t exist to make stakers rich. Stakers are service providers to Ethereum, highly interchangeable and largely quite replaceable. Stakers are not the target market for Ethereum or its reason to exist. The service they provide is necessary and they are paid for that service but they shouldn’t expect to be treated any differently to a miner and definitely shouldn’t expect the Ethereum community to bail them out when they mess up.

Which leads to the third and most important point. A client with a majority of validators doesn’t necessarily have a majority of users or “value”. Even on the beacon chain today there are already some applications following the chain and making decisions based on that state. Those users don’t necessarily show up in validator numbers and so could all follow the chain with a minority of validators. So while a chain may be a majority from a consensus point of view it may simultaneously be the minority chain from a network value point of view.

Once we get to the merge, it becomes much more clear that the weight of DApps, exchanges and other users is going to be far more important than the number of validators. Unless you’re willing to argue that we should accept whatever chain Infura follows as canonical today (and you shouldn’t be), you shouldn’t argue that we should accept whatever chain a majority consensus client follows.

Finally, accepting the incorrect chain as canonical would mean embedding whatever bug happened to occur as the expected behaviour. That creates significant technical debt and may even introduce security flaws into the specification. It may not even be fully deterministic given that many consensus bugs have been caused by incorrect caching behaviour so nodes follow different chains depending on whether they were online or offline at a particular point in time or based on when they got a particular network message.

And that’s before we start on the political/governance challenges of a bail out. The DAO fork didn’t cost normal users anything and it still caused a chain split.

The days of Ethereum bailing people out are long gone.

Exploring Eth2 – Why Open Ports Matter

One of the most commonly overlooked steps when setting up a beacon node is to set up port forwarding on your router so that other nodes can connect in to yours. It is often skipped, or not done correctly, for two reasons. First, there are so many different routers, all with different, fiddly interfaces, that providing simple instructions is essentially impossible. Second, your node will generally work even without it, by connecting outbound to other peers even though none can connect inbound.

Unfortunately, this means a lot of nodes on the beacon chain don’t accept incoming connections and that’s bad for the health of the network in two ways:

  1. Reduced efficiency
  2. Increased risk of censorship

The efficiency of the beacon chain’s gossip network is reduced because two nodes with closed ports can never connect to each other. So as the percentage of nodes with open ports drops, the network gradually morphs from a mesh topology to a star topology. Gossip messages then have to route through a relatively small number of “super nodes” (those with open ports) which then forward them on to the other nodes. It works, but messages propagate slower.
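As a rough illustration of how quickly the options shrink, the sketch below counts how many pairs of nodes could possibly connect, assuming the only restriction is that two closed-port nodes can never connect to each other. The node counts are made up and the function name is my own.

```python
from math import comb


def connectable_pairs(total_nodes: int, open_nodes: int) -> int:
    """Count node pairs that can connect when closed-closed pairs are impossible."""
    if not 0 <= open_nodes <= total_nodes:
        raise ValueError("open_nodes must be between 0 and total_nodes")
    closed_nodes = total_nodes - open_nodes
    # Every pair is possible except those where both nodes have closed ports.
    return comb(total_nodes, 2) - comb(closed_nodes, 2)


# Hypothetical network of 100 nodes with a shrinking number of open ports.
for open_nodes in (100, 50, 10):
    print(open_nodes, "open ->", connectable_pairs(100, open_nodes), "possible connections")
```

With only 10 of 100 nodes accepting incoming connections, every possible connection involves one of those 10 nodes, which is the star topology described above.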

It’s also much harder for nodes to join the network because they need to find nodes that have open ports to connect to, and most of those are already at their peer limit so are rejecting connections. That causes a lot more discovery traffic and connection attempts than necessary.

The move towards a star topology also increases the risk of censorship. With messages having to route through a smaller number of nodes, censors only need to target those nodes with open ports rather than needing to disrupt communications of all nodes on the network. The fewer nodes with open ports, the easier such attacks are.

So if you’re running a beacon node, ensure that your ports are correctly forwarded and remember that both TCP and UDP need to be forwarded to correctly allow incoming connections. While there are online tools to check TCP connectivity, I don’t know of any that will correctly check UDP connectivity, because the discovery system deliberately won’t reply to invalid packets. Most UDP tests just send a random packet and report success if it isn’t explicitly rejected, so they don’t confirm that the packet actually reached the discovery system and will pass even if the packet was silently dropped by a firewall, which is very common.

The most definitive way to check your port forwarding is working is to check that you have a mix of incoming and outgoing peer connections. For clients supporting the standard REST API (all of them now I think 🎉) you can run:

curl http://localhost:5051/eth/v1/node/peers | jq

You should see at least some peers reported with "direction": "inbound". If not, your port forwarding is not working correctly and peers can’t initiate connections to your node.
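If you’d rather get a quick count than read through the JSON, something like the following sketch could help. It assumes the response has a "data" list whose entries carry "state" and "direction" fields, as in the standard REST API, and that the API is on the same localhost:5051 address used above (adjust for your client). The function names and the sample peers are my own illustration; fetch_peers is defined but not called here so the example runs without a node.

```python
import json
from collections import Counter
from urllib.request import urlopen


def count_peer_directions(peers_response: dict) -> Counter:
    """Count connected peers by direction ("inbound" or "outbound")."""
    return Counter(
        peer.get("direction", "unknown")
        for peer in peers_response.get("data", [])
        if peer.get("state") == "connected"
    )


def fetch_peers(base_url: str = "http://localhost:5051") -> dict:
    """Fetch the peer list from a running beacon node (base_url is an assumption)."""
    with urlopen(f"{base_url}/eth/v1/node/peers") as response:
        return json.load(response)


# Made-up sample in the shape of the peers response, trimmed to the fields used.
sample_response = {
    "data": [
        {"peer_id": "peer-a", "state": "connected", "direction": "outbound"},
        {"peer_id": "peer-b", "state": "connected", "direction": "inbound"},
        {"peer_id": "peer-c", "state": "disconnected", "direction": "inbound"},
    ]
}

counts = count_peer_directions(sample_response)
print("inbound:", counts["inbound"], "outbound:", counts["outbound"])
if counts["inbound"] == 0:
    print("No inbound peers: port forwarding is probably not working")
```

Against a real node, replace sample_response with fetch_peers().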


Why Miners Can Be Simultaneously Paid Too Much and Struggling to Survive

Note: This post is deliberately high level. It doesn’t attempt to be an economic proof or go into the gory details of exactly how the difficulty adjustment works. The aim is just to see the high level effects of how the system pans out without getting lost in the nitty gritty details of what is ultimately a very complex system.

Pretty much every time the word “miner” is mentioned in an Ethereum discussion someone will claim that miners are paid too much and miners will respond saying they’re struggling to survive. Turns out, both can be simultaneously true and in fact it’s pretty much the expected case.

The reason lies ultimately in the way the difficulty automatically adjusts. Without going into too much detail, Ethereum maintains a (very) roughly consistent block time by making it easier or harder to find the next block depending on whether the latest block was found too quickly or too slowly.

The other side of that equation is the total hash power. When there is a lot of hash power being used to find the next block, it will generally be found faster and when there’s less it will be found slower. Net result, as there’s more hash power, it gets harder to find the next block.

Why Miners Are Always Struggling

So even when a miner has a consistent amount of mining power, when the total hash power increases the rate that they find blocks (i.e. the rate they get paid) will reduce. Similarly, if the total hash power decreases, the rate they find blocks and get paid will increase. When mining is profitable, more people will start mining to get a share of those sweet, sweet block rewards. That increases the total hash power and each individual miner winds up earning less.

That process will continue until there’s about the same amount of hash power being added by people investing in mining as there is hash power being turned off because it’s just not worth it. That balance appears to be just a little bit above break even, so individual miners will always wind up only just making a profit.
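A very simplified way to picture this: if the number of blocks per day is roughly fixed, a miner’s expected income is just their share of the total hash power times the daily rewards. The sketch below ignores transaction fees, uncles and luck, and all the numbers are made up for illustration.

```python
def expected_daily_income(
    miner_hash: float,
    total_hash: float,
    blocks_per_day: float,
    block_reward: float,
) -> float:
    """Expected ETH per day for a miner with `miner_hash` out of `total_hash`.

    Simplification: blocks per day is treated as constant because the
    difficulty adjusts, and only the fixed block reward is counted.
    """
    if total_hash <= 0:
        raise ValueError("total_hash must be positive")
    return miner_hash * blocks_per_day * block_reward / total_hash


# Hypothetical figures: same miner, but total hash power doubles.
before = expected_daily_income(miner_hash=1, total_hash=1000, blocks_per_day=6000, block_reward=2)
after = expected_daily_income(miner_hash=1, total_hash=2000, blocks_per_day=6000, block_reward=2)
print(before, after)
```

The miner did nothing differently, but their income halved because everyone else added hash power.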

Why Miners Are Paid Too Much

Since the difficulty adjustments aim to keep blocks, on average, a consistent distance apart, about the same number of blocks are created each day regardless of how much hash power is thrown at the problem. So roughly the same amount of new ETH is created and paid out as block rewards each day. While this ETH is created out of nowhere, it is effectively paid for by all ETH holders because increased issuance puts downward pressure on prices.

There’s only so much hash power required to keep the Ethereum chain secure. People will argue about how much that is but the exact number isn’t important here. So if the total hash power is more than that threshold we could in theory reduce the total amount paid to miners. That would result in miners earning less both individually and in aggregate, which would make some of them unprofitable and they’d stop mining. That would then reduce the hash rate (which is what we wanted because we didn’t need that much) and each individual miner would become more profitable again. Once the equilibrium is found again there’ll be less paid out in total to miners but each individual miner will wind up earning about as much as they did before the change. The opposite process can be used if more hash power is required – increasing the block reward would temporarily make individual miners more profitable, but that would incentivise the addition of more hash power until miners are individually about as profitable as they were before.
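Here is a toy model of that equilibrium, in keeping with the high level spirit of this post. It assumes every unit of hash power costs the same to run and that miners keep joining until income per unit of hash power is just a small margin above that cost. The cost, margin and payout numbers are all invented, and real mining economics are far messier.

```python
import math


def equilibrium_hash_power(daily_payout: float, cost_per_hash: float, margin: float = 0.05) -> float:
    """Total hash power at which income per unit is `margin` above its running cost."""
    if cost_per_hash <= 0:
        raise ValueError("cost_per_hash must be positive")
    return daily_payout / (cost_per_hash * (1 + margin))


def income_per_hash(daily_payout: float, total_hash: float) -> float:
    """Income earned by one unit of hash power per day."""
    return daily_payout / total_hash


# Hypothetical numbers: halve the total payout and let the hash power settle again.
cost = 0.01
for payout in (12000.0, 6000.0):
    total_hash = equilibrium_hash_power(payout, cost)
    print(
        f"payout {payout:>8.0f} -> hash power {total_hash:>10.0f}, "
        f"income per unit {income_per_hash(payout, total_hash):.4f}"
    )

full = equilibrium_hash_power(12000.0, cost)
half = equilibrium_hash_power(6000.0, cost)
same_income = math.isclose(income_per_hash(12000.0, full), income_per_hash(6000.0, half))
```

In this model halving the payout halves the hash power, while the income per unit of hash power ends up exactly where it started.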

Transaction fees and the price of ETH in various fiat currencies are other variables that affect how much miners are paid. They can add a lot of variability, but the process is still essentially the same.  Having higher transaction fees or a higher price of ETH is just like increasing the block reward, just much less controllable or predictable.

Since it’s hard to determine exactly how much hash power is required to secure the chain, and it’s better to err on the side of more hash power, the typical case is that miners will be overpaid.  We could theoretically pay less and still have a secure chain but the variability in the price of ETH and uncertainty in exactly how much hash power is enough means reducing the total payments to miners is too difficult and we just accept some amount of over payment.

Resolving The Oxymoron

Hopefully you’ve seen the key distinction here that makes it logical that miners are both paid too much and are struggling to survive.  They’re paid too much in aggregate and struggling to survive individually.

Miners quite understandably are very focussed on their individual profitability, but as we’ve seen it doesn’t matter how much is paid in total to miners, the hash rate will just adjust back to the equilibrium where they’re struggling to survive again. To change that we’d have to continuously increase block rewards, trying to stay ahead of the new hash power that would attract. But even if we could stay ahead, the massive inflation would destroy the price of ETH so the cost of electricity and mining equipment relative to ETH would spike and miners may well end up worse off overall.

ETH holders on the other hand are concerned about the amount paid to miners in aggregate and whether that’s buying us too little, the right amount or too much hash power. That doesn’t mean they don’t care about individual miners, just that as explained above, paying more in total won’t make individual miners more profitable in the long run anyway.

The Complication of Lag and Averages

In all this, there’s a lot of mentions of “on average” or “eventually”. That’s because while there are clear economic functions at play, there is also a lot of probability and lag in finding the new equilibrium. It takes time for people to decide to invest in mining and to order the components required, so if the block reward doubled tomorrow, the hash rate wouldn’t suddenly double.  For a while miners would in fact make a lot more money but it would gradually reduce as the new hash power came online until the equilibrium is found. Similarly, if the block reward is halved the hash power doesn’t suddenly drop – miners wind up earning less for a while until the equilibrium is found again.
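The lag can be added to a toy model by letting hash power close only part of the gap to its equilibrium each step. In the sketch below the payout doubles at step zero and the hash power then catches up gradually. The adjustment rate, the step count and all the figures are arbitrary assumptions, not measurements of how quickly real miners react.

```python
import math


def simulate_lag(
    daily_payout: float,
    start_hash: float,
    target_income: float,
    adjust_rate: float = 0.1,
    steps: int = 100,
) -> list[float]:
    """Return income per unit of hash power at each step as hash power catches up."""
    hash_power = start_hash
    equilibrium_hash = daily_payout / target_income
    incomes = []
    for _ in range(steps):
        incomes.append(daily_payout / hash_power)
        # Hash power closes a fixed fraction of the gap to equilibrium each step.
        hash_power += adjust_rate * (equilibrium_hash - hash_power)
    return incomes


# Hypothetical: payout was 100 with 1000 units of hash power (income 0.1 per unit),
# then the payout doubles to 200 overnight.
incomes = simulate_lag(daily_payout=200.0, start_hash=1000.0, target_income=0.1)
print(f"first step: {incomes[0]:.3f}, step 10: {incomes[10]:.3f}, last step: {incomes[-1]:.3f}")
```

Miners earn double at first, but the extra income shrinks every step as new hash power arrives, until they are back at about the same individual income as before.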

Throw in the variability from changing ETH prices and transaction fees and it can take even longer to find that equilibrium because the uncertainty causes people to put off new investment to see if high prices last or continue mining while unprofitable hoping the price or transaction fees will come back up.

While that lag does affect some details and needs to be factored into various decisions, it doesn’t change the fundamental economics. The equilibrium will wind up being found, even if it takes a long time and miners will be back to that average amount of individual income – likely just managing to be profitable.