Symphonious

Living in a state of accord.

Fun with Java Backwards Compatibility

There’s a fun little gotcha introduced in Java 10 which causes trouble for anyone wanting to support earlier JVMs.  If you take the code:

import java.nio.ByteBuffer;
public class ProblemCode {
  public static void main(String[] args) {
    final ByteBuffer buffer = ByteBuffer.allocate(10);
    buffer.position(10);
    System.out.println("Yay it worked!");
  }
}

If you compile it on Java 8 it will run on any Java from 8 and above. If you compile it on Java 10 or above it will only work on Java 10 and above, even if you specify -target 8 -source 8.

The problem is that you’re compiling against the Java 10 libraries and Java 10 changed the ByteBuffer.position return type. Previously it was inherited from Buffer and so returned a Buffer but Java 10 took advantage of co-variant return types and now returns ByteBuffer so it’s easier to use in a fluent style. Compiled against Java 10 libraries the compiler embeds a reference to the Java 10 version which results in a NoSuchMethodError on Java 8:

Exception in thread "main" java.lang.NoSuchMethodError: java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer;
at ProblemCode.main(ProblemCode.java:6)

Javac helpfully provides a warning which suggest how to solve the issue:

warning: [options] bootstrap class path not set in conjunction with -source 8

which has been best practice for many, many years anyway but it’s extremely common to skip that step because going and finding the libraries from Java 8 is a challenge and they believe they can make things work just by avoiding new APIs.  Often that’s tested by periodically checking the code compiles on Java 8 as well.

This case shows that’s not enough. It’s possible to have code that compiles correctly on Java 8 and 10 but is not backwards compatible when compiled against Java 10 libraries.

Ethereum State Rent Proof of Concept

I’ve had the opportunity to do some proof of concept development of the Ethereum state-rent proposal that Alexey Akhunov has been leading on the Pantheon code base. The proposal evolved as the work continued so the actual implementation is now a lot simpler than described in that PDF.

Note that the aim is to explore what it takes to implement the proposal, not to create production ready code.  The current work is all available on on my state-rent branch.

The easiest way to begin exploring the changes required to implement state-rent in Pantheon is to start with the hard forks required which translate to new ProtocolSpecs in Pantheon which are defined in MainnetProtocolSpecs. Ultimately there should be four hard forks:

  1. Introduce replay protection for accounts that are evicted
  2. Charge fixed state rent for “owned” accounts
  3. Charge state rent for newly allocated storage
  4. Charge state rent for pre-existing storage

Fork 1: Replay Protection

When accounts are evicted, their nonce would reset to 0, potentially enabled old transactions to be replayed. There are a few ways to prevent this, my preference was to introduce temporal production (variant 2 in the PDF) which is a simple extra validation to perform before processing transactions.  As such, I skipped implementing it in the proof of concept.

Fork 2: State Rent for Owned Accounts

In this milestone “owned” accounts are charged a fixed rent per block. An “owned” account is defined as one without code in the PoC. To avoid having to process every account on every block, the rent is only recalculated when the account is changed at the end of the block (ie a change would already have to be written to disk for it).

TODO: Formalise the definition of owned account more. Particularly, clarify if an account created with empty code is considered owned or not.

The ProtocolSpec for this fork is defined based on Constantinople with three changes:

public static ProtocolSpecBuilder<Void> stateRentOwnedAccountsDefinition(
      final int chainId, final long rentEnabledBlockNumber) {
    return constantinopleDefinition(chainId)
        .rentCost(Wei.fromGwei(2))
        .rentProcessor(
            rentCost -> new OwnedAccountsStateRentProcessor(rentCost, rentEnabledBlockNumber))
        .accountInit(StateRentOwnedAccountsAccountInit::new)
        .name("StateRentOwnedAccounts");
  }

Firstly, we set a rent cost of 2 Gwei/block. This is a pretty arbitrary figure and will likely need tweaking. This value is used by the rent processor we also set but has been split out so it’s easy for a hypothetical future fork to change the rent cost without having to implement a completely different rent processor.

Finally, we set a StateRentOwnedAccountsAccountInit which is used to setup defaults for any newly created accounts (post-fork). In this case, it sets the rent balance to zero and rent block to the current block number. These are two new fields added to account state in this fork. This is the first time the new account process has varied in a hard fork, so there was a bunch of plumbing code required to add the concept of a pluggable AccountInit but none of it difficult.

However, it’s also the first hard fork that adds fields to account state, which makes serialising and deserialising them a little more difficult.  That code is in AccountSerializer. Fortunately it’s easy in RLP to check if you are at the end of the current list or not, so when deserialising we can tell if rent balance and rent block have been added to this account state by checking if we’re at the end of the list (line 51).

Calculating the rent owed is handled by OwnedAccountsStateRentProcessor with much of the code able to be reused with later forks via AbstractRentProcessor. The rent balance is one of the very few things in Ethereum that can be negative which makes life a little frustrating as the existing Wei class used for account balance can’t be reused and there’s a few points where things need to be converted between Wei and BigInteger but it’s manageable.

Rent is only recalculated when the account has been changed in the block. In Pantheon the simplest way to achieve this was to piggy-back on the concept of “touched” accounts used in Spurious Dragon.  We simply iterate through the touched accounts and check if the root of their account state trie is dirty (which is also already tracked), if it is we recalculate rent via the RentProcessor.

At this stage, any account which is unable to pay its rent can be deleted in the same way it would have been if it were empty. Again, simple to reuse existing code for this.

For the record, actually calling the rent processor is hooked into MainnetBlockProcessor so that it happens as the last step in processing a block.

TODO: Additional clarification is required about exactly when rent should be charged. In the PoC it’s applied even after paying the miner which seems sensible but there are a number of things happening at the end of a block so it needs to be completely clear.

Note that the priority queue for eviction has been scrapped (I forget why it was originally required…).

Fork 3: State Rent for New Storage

This fork introduces quite a bit of new stuff, but most importantly adds a new storagesize field to account state which is adjusted as part of every SSTORE call. Existing storage is not included in the storagesize field at this point, only a count of the storage size either added or cleared since fork 3 became active. The field is added when new accounts are created, when rent is recalculated for an existing account or when an SSTORE call results in the account storage size changing.

TODO: Need to be clear about exactly when this field is first added (particularly if multiple SSTORE calls are made that leave the storage at the same size or even completely unchanged).

The ProtocolSpec for this fork is based on the one for fork 2 but with a few changes:

  public static ProtocolSpecBuilder<Void> stateRentNewStorageDefinition(
      final int chainId, final long rentEnabledBlockNumber) {
    return stateRentOwnedAccountsDefinition(chainId, rentEnabledBlockNumber)
        .gasCalculator(StateRentNewStorageGasCalculator::new)
        .evmBuilder(MainnetEvmRegistries::stateRentNewStorage)
        .rentCost(Wei.fromGwei(1))
        .rentProcessor(
            rentCost -> new StorageSizeStateRentProcessor(rentCost, rentEnabledBlockNumber))
        .accountInit(StateRentNewStorageAccountInit::new)
        .name("StateRentNewStorage");
  }

There’s a new AccountInit to set storagesize to 0 when a new account is created.

Rent cost is reduced to 1 Gwei since it’s now charged per byte per block. This is almost certainly still the wrong value but it definitely should be less than when rent was purely per block.

We also supply a new RentProcessor, StorageSizeStateRentProcessor, which applies to all accounts whether they have code or not. It also charges rent based on the value of the new storagesize field.  If storagesize hasn’t yet been set, it’s set to a fixed value to represent the size of an empty account (73 which is the length of an empty account RLP in bytes).

TODO: The PoC deliberately uses the value of storagesize from the start of the block, ignoring any changes applied in the current block. Otherwise, if an account is untouched for 100 blocks, then increases its storage size, it will be charged rent at the increased storage size for the whole 100 blocks it was untouched.  This rule may need to be tweaked slightly to use the new storagesize value for the current block and the old one for the untouched blocks but it seems reasonable to defer rent changes until the block after they take effect – increased storage would have cost gas in the current block and reduced storage doesn’t really give a benefit until the block is complete and committed anyway.

NOTE: storagesize may be negative during this fork, indicating that the account now uses less storage than it did when this fork first happened.  After fork 5, storagesize should only ever be positive. 

TODO: Rent calculation during this fork should probably use max(storagesize, 0) to avoid the rent due being negative.

Accounts with code are not deleted when they can’t pay their rent, they are instead “evicted” and replaced with a hash stub so they can be restored via the new RESTORETO operation (see below). This is one of the more complicated changes to work through the code – firstly just in terms of tracking evicted accounts through the transaction commit/rollback code but also in terms of detecting them in the account state trie.

Hash Stubs

Challenging bit: Determining if an address in the account state trie is an actual account state or just a hash stub.

Looking again at AccountSerializer in the serializeHashStub method, we can see that a hash stub is an RLP list containing the code hash and storage root hash. RLP doesn’t provide any typing information though so to detect a hash stub, we check if the first item is a nonce or a hash. If it is a hash stub, the size of the RLP item must be 32 bytes. Theoretically a nonce is an arbitrary unsigned scalar number, so it could wind up being 32 bytes (a 256bit number) but at least in Pantheon we already don’t support nonces that big and you’d have to spend a lot of eth on gas to get a nonce to make a nonce that big. So the item length appears to be a suitable way to distinguish between account state and hash stubs.

UPDATE: A smarter way to do this would have been to just count the number of items in the RLP list which is relatively straight-forward. If there are 2 items, it must be a hash stub.

TODO: Clarify interactions for any operation that looks up an account state if the account state has been replaced by a hash stub. Currently Pantheon treats hash stubs as if they don’t exist except in the RESTORETO operation. This is mostly right, except that contract creation should consider it an address conflict if a hash stub is at the target address (not currently implemented but pretty straight forward).

New EVM Operations

There are three EVM operations added in this fork: PAYRENT, RENTBALANCE, SSIZE and RESTORETO. Additionally SSTORE is updated to update the storagesize field. PAYRENT, RENTBALANCE and the SSTORE changes are pretty trivial but RESTORETO has quite a few complexities.

TODO: Decide on the gas cost for these new opcodes.

Tool Support: RENTBALANCE returns an Int256 (signed) which I think is the only time an EVM operation returns a signed number. Need to double check compatibility for this with things like Solidity but should be fine.

Restore To Operation

The RESTORETO operation turns out to be one of the more interesting parts of this spec to implement.  The basic workflow is slightly nuanced and involves three accounts.  Let’s say that Account A is evicted, leaving behind a stub with its code and storage root hashes.  To restore Account A:

  1. Account B is created with the same code as Account A.
  2. Account C is created with code that sets up the storage in Account C to precisely match the storage Account A had before being evicted. It may do this in it’s constructor or expose functions that can be called over time to build up the correct state from multiple sources (potentially sharing the gas cost of rebuilding that state across multiple people).
  3. Account C calls RESTORETO <address of Account A> <address of Account B>.  The EVM checks that account B’s code hash matches what Account A’s stub records, and that Account C’s storage root hash matches what Account A’s stub records.  If so, it creates a new contract at Address A, replacing the hash stub, with the code from Account B and the storage from Account C. It also transfers any balance from Account C to Account A and destroys Account C (just as SELFDESTRUCT would).

Mostly this is fairly straight-forward to implement as seen in RestoreToOperation.java. The hidden catch is that most Ethereum clients, including Pantheon, don’t update the storage trie until the transaction actually commits – the changes made mid-transaction are stored in a simple HashMap which is much faster. As such, getting the storage root hash for Account C isn’t straight-forward. In fact, in the PoC I haven’t implemented it at all. It’s certainly possible to create a copy of the unmodified Trie, update that with pending changes and calculate the storage root but it’s an expensive operation and that code would only be exercised by RESTORETO, significantly increasing the testing required to fully cover RESTORETO.

TODO: Find a way to change RESTORETO so that the storage root is more easily available when required.

One option here would be to have execution immediately halt when RESTORETO is called, and then to perform the actual restore at the end of the transaction, much like how SELFDESTRUCT only applies at the end of the transaction.  Geth and Pantheon currently update the storage trie after every transaction (they don’t necessarily calculate the root hash but that’s simple once you have the trie), but TurboGeth delays updating the trie until the end of the block, improving performance if the same account is used from multiple transactions in one block.  Implementing that improvement for Pantheon is also planned.  Delaying application of RESTORETO until the end of the block feels a bit too weird though.

More thought is required in this area to find a solution that both makes sense to users and can be implemented without significantly increasing the surface area required for testing.

Fork 4: State Rent for Existing Storage

The PoC doesn’t include any of this fork currently, but it’s fairly straight-forward. When an account’s rentblock field is first updated to a block after fork 4 is in effect, the account’s storagesize field is increased by the size of its storage immediately prior to fork 3 coming into effect. Basically, we add the historical state size in.

This is done as a separate fork because it gives times for client developers to precompute the storage size for every account on the various public networks at the fork 3 block and include that in a client update. This means the cost of calculating that existing storage size is done offline and doesn’t impact on performance of every node.

However, because the storagesize field is only updated when the account is first touched anyway, it is still feasible to calculate the existing storage size on the fly, either because the precomputed values aren’t available for that particular network or in order to verify them.

TODO: It’s not explicitly stated anywhere at the moment, but when calculating the rent due for the first time after fork 4, rent should be calculated up to fork 4 block using the new-storage-only-value, then add in the existing storage size and calculate rent from fork 4 block up to the current block. Otherwise accounts start being charge rent for existing storage at different times which is too hard to understand (and unfair).

Other Work Required

There’s some other stuff that would need to be done to make state-rent production ready:

  • Additional JSON-RPC methods such as getRentBalance and probably something to calculate the rent owing for a given account at a specific block.
  • CALLFEE opcode. The spec refers to this to provide a way for contracts to require a rent contribution when users call into it, but there weren’t enough details to actually implement it. The main question is where the extra fee is taken from (it can’t be an extra gas charge because rent is tracked in ETH not gas).
  • Currently if you deploy a contract with Remix it doesn’t get any eth balance so it is immediately evicted when you interact with it. This is quite confusing and frustrating as a user.  We’ll either need to update tools to allow sending an initial rent balance, provide some free rent on contract creation or allow a grace period after contract creation before rent is charged.

Introducing Pantheon

This week, the work I’ve been doing for the past 6 months, and that PegaSys has been working on for the past 18 months or so was released into the world. Specifically we’ve released Pantheon 0.8.1, our new MainNet compatible, Apache 2 licensed, Java-based Ethereum client. And it’s open source.

I’m pretty excited about it on a few fronts.  Firstly I think it’s a pretty important thing for the Ethereum community. To be a healthy ecosystem, Ethereum needs to have diversity in its clients to avoid a bug in one client taking out or accidentally hard forking the entire network. Currently though, Geth and Parity dominate the Ethereum client landscape.  Pantheon clearly won’t change that in the short term, but it is backed by significant engineering resources to help it keep up with the ever changing Ethereum landscape and be a dependable option.

I’m also really excited that Pantheon is released under the Apache 2.0 license. Both Parity and Geth along with most other clients are licensed under the GPL or LGPL. There are still a large number of enterprises that completely avoid the GPL and LGPL which has closed off Ethereum to them. Having Pantheon available under a permissive license and in a highly familiar language like Java will make it much easier for many enterprises to start using, building on and innovating with Ethereum.

Pantheon will also be building out functionality from the Enterprise Ethereum standard for things like privacy and permissioning to make private chains more powerful and flexible. Meanwhile we have a significant number of researches continuing to work on developing new ways to get the most out of Ethereum.

Finally I’m quite excited to be able to contribute to an open source project as my full time job. I’ve had the opportunity to do some open source for work in the past but only on a fairly small scale. It’s a bit daunting to have everything in the open and on the record, but I’m really looking forward to engaging with the community and being able to show exactly what I’ve been doing.

Debugging Ethereum Reference Tests

There’s an exceptionally valuable set of ethereum reference tests that are run by  most or all of the different Ethereum clients to ensure they actually implement the specifications in a compatible way.  They’re one of the most valuable resources for anyone developing an Ethereum client.

The Aleth project maintains the official test client called testeth but it’s a little cryptic to work out how to actually run things with it and then use that to debug failures happening in the client you’re actually developing.  So this is what I’ve found useful:

Firstly, you need to checkout Aleth and compile it following their instructions. The one catch here is that you need to enable EVM tracing by adding the -D VMTRACE=1 flag when doing the initial cmake .. Once the whole build is finished you should find testeth in aleth/build/test/testeth Copy that to somewhere on your PATH.

Then you need to check out the actual reference tests and set ETHEREUM_TEST_PATH to point to where you checked it out.

Now you should be able to start running tests.  Running a single test with loads of debug looks like:

./testeth -t GeneralStateTests/stCreate2 -- --singletest create2collisionStorage --verbosity 3 --vmtrace --statediff | less -R

The param after -t needs to point to a directory containing the group of tests you want to run.  It could be as high level as GenerateStateTests or BlockchainTests or point to one of the directories under that.  The –singletest option lets you run just one file underneath that directory. The name is the filename without the .json extension.

–verbosity 3 –vmtrace –statediff turns on pretty comprehensive debugging information so you get a full EVM trace, plus a record of the changes to state that were made (it appears to show each change as well as the final state though I’m not sure I’ve fully understood the format yet).

And then finally piping to less -R allows you to scroll up and down and search through the output easily.  The -R tells less to render terminal control characters so you still get the pretty colours.

Now you have a reference client running the test and providing a ton of information about what the test expects to happen, go run the test in the client your developing, compare outputs, beat your head against a wall for a while and eventually find and fix your bug.

And a massive thanks to everyone who’s been involved in building up these reference tests. If you ever have to pay for your own beer at DevCon the community is doing something very wrong.

Exploring Ethereum – Account and Transaction Nonce

This is the second article on things I found particularly interesting in the Ethereum yellow paper.  The first is “What’s on the Blockchain?” and the same disclaimers apply: I’m no expert and you should go verify any claims I’m making before depending on them. Comments and corrections are most welcome either via email or @ajsutton on twitter.

One of the little details in the way Ethereum works is the idea of a “nonce” attached to each account and transaction. It’s a small but important detail.

For a “normal” account (ie has no code attached), the nonce is equal to the number of transaction sent from it. In the case of contracts (accounts with code) the nonce is the number of contract-creations made by the account.

When a transaction is created, the current nonce value from the account is assigned as the transaction nonce. Part of the initial tests for intrinsic transaction validity is that the transaction nonce is equivalent to the sender account’s current nonce.

The nonce is primarily included in transactions to prevent same-chain replay attacks on transactions. The transaction sender is identified by the signature they add to the transaction (those v, r and s items from each transaction we skipped over last time). To generate those you need the account’s private key so only the account owner can create a validly signed new transaction.

However, if the transaction data and the sender are the same, the signature will also be the same. So absent the nonce, an adversary could take any existing transaction and resend it to a node with a valid signature and have it processed a second time. For example, if Alice signed a transaction to send 10ETH to Bob, Bob could take that transaction signed by Alice and repeatedly submit it for processing until all of Alice’s funds had been transferred to Bob. Bob couldn’t change anything about the transaction but that doesn’t make Alice feel any better about losing all her ETH when she only approved a single transfer.

With the account nonce however, when the transaction is first processed, Alice’s account’s nonce is incremented and then when Bob resubmits the transaction, it is rejected because the nonce doesn’t match. Bob is unable to change the nonce on the transaction without invalidating Alice’s signature so the transaction can only be applied once, exactly as Alice intended.

BUT! This doesn’t entirely eliminate replay attacks.  The transaction could still be replayed on a different chain (though it may require replaying a number of transactions so the account nonce “catches up”).  The Ethereum / Ethereum Classic split caused quite a few headaches in this regard, until EIP-155 was implemented to include an ID for the chain in the data to sign, thus making the two different chains incompatible. The same problem can also occur between test chains and MainNet, though hopefully you aren’t sharing a single private key between them.

Interestingly, most explanations for the importance of the nonce suggest it’s there to prevent double spending which is not the case. The theory goes that Alice sends transaction t1 to pay Bob for some goods but then very quickly submits another transaction t2 with a higher gas price which is then prioritised higher and mined first allowing her to spend funds twice. Even if t2 was processed before t1, it would result in Alice’s balance being reduced before t1 was applied. If there were then insufficient funds t1 would be rejected. If you had already released the goods t1 was intended to pay for that might be bad, which is why typically people wait for the transaction to be in a block at a certain depth before considering it finalised. The nonce doesn’t help prevent this double-spend issue at all – Alice could deliberately setup the same race by giving both t1 and t2 the same nonce.

Finally, the account nonce is also used as part of creating the address for a new account/contract. The address of the new account is “the rightmost 160 bits of the Keccak hash of the RLP encoding of the structure containing only the sender and the account nonce”. Which is to say, the new address is a particular way of hashing the combination of the sender’s account hash and nonce. Since the sender account nonce is incremented when sending a new transaction this is guaranteed to generate a unique address.