Symphonious

Living in a state of accord.

Exploring Eth2: Attestation Rewards and Validator Performance

Through the beacon chain testnets people have been gradually developing better ways to evaluate how well their validators are working. Jim McDonald made a big leap forward defining attestation effectiveness which focuses on the inclusion distance for attestations. That revealed a bunch of opportunities for clients to improve the attestation inclusion distance and attestation effectiveness scores improved quite dramatically until now scores are generally at or close to 100% pretty consistently.

But then beaconcha.in added a view of how much was earned for each attestation. Suddenly people were left wondering why some of their “rewards” were negative even though the attestation was included quickly. It became obvious that inclusion distance was only one part of attestation rewards.  

How Attestation Rewards Work

There are four parts in total:

  • Reward based on inclusion distance
  • Reward or penalty for getting the source correct
  • Reward or penalty for getting the target correct
  • Reward or penalty for getting the head correct

It turns out that inclusion distance is the smaller of these components.  Ben Edgington explains all the detail very well in his annotated spec. First we calculate the base reward, which is the component which factors in the validator’s effective balance (lower rewards if your effective balance is less than the maximum 32ETH) and the total staked ETH (reducing rewards as the number of validators increase).

Then for the source, target and head attestations, a reward or penalty is applied depending on whether the attestation gets them right.  A missed attestation is penalised for all three. The penalty is 1 * base reward. The reward however factors in a protection against discouragement attacks so is actually base reward * percentage of eth that attested correctly. This provides incentive for your validator to do whatever it can, like relaying gossip well, to help everyone stay in sync well.

Finally the inclusion distance reward is added. There’s no penalty associated with this, missed attestations just don’t get the reward. 1/8th of this reward is actually given to the block proposer as reward for including the attestation, so the maximum available reward for the attester is 7/8th of the base reward. So it winds up being (7/8 * base reward) / inclusion distance.

So inclusion distance is actually the smallest component of the reward – not only is it only 7/8th of base reward, there’s no penalty attached. The most important is actually getting source right as attestations with incorrect source can’t be included at all, resulting in a 3*base reward penalty. Fortunately getting source right is also the easiest because it’s back at the justified checkpoint so quite hard to get wrong.

Evaluating Validators

If we’re trying to evaluate how well our validator is doing, these calculations add quite a lot of noise that makes it difficult. Especially if we want to know if our validator is doing better this week than it was last week.  If you just look at the change in validator balance, how many block proposals you were randomly assigned dominates and even just looking at rewards for attestations is affected by the number of validators and how well other people are doing.

I’d propose that we really want to measure what percentage of available awards were earned. That gives us a nice simple percentage and is highly comparable across different validators and time periods.

The first step is to think in terms of the number of base rewards earned which eliminates variance from the validator’s balance and total eth staked.  Then we can ignore the discouragement attack factor and say that each of source, target and head results in either plus or minus 1 base reward. Inclusion distance is up to 7/8th of a base reward, scaling with distance like normal. Like for attestation effectiveness, we’d ideally also use the optimal inclusion distance – only counting the distance from the first block after the attestation slot to avoid penalising the validator for empty slots they couldn’t control. In practice this doesn’t make a big difference on MainNet as there aren’t too many missed slots.

So each attestation duty can wind up scoring between -3 and +3.875 (3 and 7/8ths). For any missed attestation, the score is -3. For any included attestation, we calculate the 4 components of the score with:

Inclusion distance score: (0.875 / optimal_inclusion_distance)

Source score: 1 (must be correct)

Target and head scores: 1 if correct, -1 if incorrect

And to get a combined score, we need to add them together.

We can actually do this with the metrics available in Teku today:

(
validator_performance_correct_head_block_count - (validator_performance_expected_attestations - validator_performance_correct_head_block_count) +

validator_performance_correct_target_count - (validator_performance_expected_attestations - validator_performance_correct_target_count) +

validator_performance_included_attestations - (validator_performance_expected_attestations - validator_performance_included_attestations) +

(0.875 * validator_performance_included_attestations / validator_performance_inclusion_distance_average)
) / validator_performance_expected_attestations / 3.875

Which is essentially :

(
(correct_head_count - incorrect_head_count +
correct_target_count - incorrect_target_count +
included_attestation_count - missed_attestation_count
) / expected_attestation_count +
0.875 / inclusion_distance_average
) / 3.875

To give a score between 0 and 1 pretty closely approximating the percentage of possible attestation rewards that were earned. With the discouragement attack factor ignored, we are slightly overvaluing the source, target and head rewards but it’s very minimal. On MainNet currently they should be +0.99 when correct instead of +1 so doesn’t seem worth worrying about.

You can also calculate this fairly well for any validator using a chaind database with a very long and slow SQL query.

Evaluating the Results

Looking at the numbers for a bunch of validators on MainNet, generally validators are scoring in the high 90% range with scores over 99% being common but over anything more than a day it’s hard to find a validator with 100% which generally matches what we’d expect given the high participation rates of MainNet but knowing that some blocks do turn up late which will likely result in at least the head being wrong.

One key piece of randomness that’s still creeping into these scores though are that its significantly more likely to get the target wrong if you’re attesting to the first slot of the epoch – because then target and head are the same and those first blocks are tending to come quite late.  There are spec changes coming which should solve this but it is still affecting results at the moment.

What About Block Rewards?

For now I’m just looking at what percentage of blocks are successfully proposed when scheduled (should be 100%).  There are a lot of factors that affect the maximum reward available for blocks – while blocks aren’t reaching the limit on number of attestations that can be included, I’d expect that all the variation in rewards comes down to luck.  Definitely an area that could use some further research though.

Exploring Eth2: Attestation Inclusion Rates with chaind

For the beacon chain MainNet, I’ve setup an instance of chaind – a handle little utility that connects to a beacon node (like Teku) and dumps out the chain data to a database so its easy to run various queries against. While you can do this with various REST API queries and custom code, having it all accessible by SQL makes ad-hoc queries a lot easier and lets you add it as a datasource for Grafana to display lots of powerful metrics.

Herewith, a few useful queries and snippets that make life easier. I’ve setup my dashboard with two variables – genesisTime (1606824023 for MainNet) and perfValidators – a list of validator indices that I often want to monitor (e.g. 1,1234,5533,2233).

Filter an epoch based table to the current Grafana time range:

WHERE f_epoch >= ($__unixEpochFrom() - $genesisTime) / 12 / 32 AND f_epoch <= ($__unixEpochTo() - $genesisTime) / 12 / 32

Filter a slot based table to the current Grafana time range:

WHERE f_slot >= ($__unixEpochFrom() - $genesisTime) / 12 AND f_slot <= ($__unixEpochTo() - $genesisTime) / 12

Average balance, in ETH, of a set of validators for the latest epoch:

SELECT
f_epoch * 32 * 12 + $genesisTime AS "time",
AVG(f_balance) / 1000000000.0 as "balance"
FROM t_validator_balances
WHERE
f_epoch >= ($__unixEpochFrom() - $genesisTime) / 12 / 32 AND f_epoch <= ($__unixEpochTo() - $genesisTime) / 12 / 32 AND
f_validator_index IN ($perfValidators)
GROUP BY f_epoch
ORDER BY f_epoch DESC
LIMIT 1

Balances by validator, by epoch suitable for graphing:

SELECT
CONCAT(f_validator_index, ' ') AS metric,
f_epoch * 32 * 12 + $genesisTime AS "time",
MAX(f_balance) / 1000000000.0 as "balance"
FROM t_validator_balances
WHERE
f_epoch >= ($__unixEpochFrom() - $genesisTime) / 12 / 32 AND f_epoch <= ($__unixEpochTo() - $genesisTime) / 12 / 32 AND
f_validator_index IN ($perfValidators)
GROUP BY f_epoch, f_validator_index
ORDER BY f_epoch

Calculate the percentage of blocks successfully produced and included on the canonical chain:

SELECT
SUM(1) FILTER (WHERE b.f_root IS NOT NULL) / COUNT(*) as "value"
FROM t_proposer_duties AS d
LEFT JOIN t_blocks AS b ON d.f_slot = d.f_slot
WHERE
d.f_slot >= ($__unixEpochFrom() - $genesisTime) / 12 AND d.f_slot <= ($__unixEpochTo() - $genesisTime) / 12 AND
d.f_validator_index IN ($perfValidators)

And then we come to attestations… They’re a special kind of fun because they can be aggregated and the validator is identified by its position in a committee rather than by its validator index directly. We’re going to build up to having a query that calculates the inclusion rate for attestations but we’ll take it a step at a time with a bunch of useful queries and snippets along the way.

Firstly, we need to deal with the fact that attestations for the committee may have been spread across multiple aggregates when included in blocks and recombine them into one set of aggregation bits that represents every validator from that slot and committee which was included in any attestation:

SELECT x.f_slot, x.f_committee_index, bit_or(RIGHT(x.f_aggregation_bits::text, -1)::varbit) AS f_aggregation_bits 
FROM t_attestations x
GROUP BY x.f_slot, x.f_committee_index

Note: you may want to add a WHERE clause to constrain this to the relevant time rage to improve performance.

The aggregation bits are stored as a bytea, so we first have to convert it to a varbit with a weird RIGHT(x.f_aggregation_bits::text, -1)::varbit because the internet said to and then we can do a bitwise or to combine them all.

Now we have slot, committee index and a long binary string indicating which validators in the committee attested. So we can identify the validators we’re interested in, we want to convert that binary string to an array of validator indices. I don’t know of a function that can do that in Postgres, but we can write one:

create or replace function get_indices(varbit, bigint[]) returns bigint[] as $$
declare i int;
l int;
s int;
c bigint[];
begin
c:='{}';
l:=0;
loop
exit when get_bit($1, bit_length($1) - 8 + l)=1;
l:=l+1;
end loop;
s:=(7 - l) + (octet_length($1) - 1) * 8;
c:=c||l::bigint||s::bigint;
for i in reverse s-1..0 loop
if get_bit($1, (i / 8) * 8 + 7-(i%8))=1 then
c:=c||$2[i + 1];
end if;
end loop;
return c;
end;
$$ language plpgsql;

UPDATE: The initial version of this function was incorrect. Endianness strikes again… The updated version fully parses the SSZ bitlist format, using the marker bit to calculate the exact list size.

This takes the binary string we had as the first argument (varbit) and the f_committee field from the t_beacon_committees table which is an array of validator indices. It iterates through each bit in the binary string and if it’s set adds the validator index from the same position in the beacon committee to the result.

Combining the two, we can get the list of validator indices that were included by slot and committee index:

SELECT c.f_slot, c.f_index, get_indices(a.f_aggregation_bits, c.f_committee)
FROM t_beacon_committees c
JOIN (SELECT x.f_slot, x.f_committee_index, bit_or(RIGHT(x.f_aggregation_bits::text, -1)::varbit) AS f_aggregation_bits
FROM t_attestations x
GROUP BY x.f_slot, x.f_committee_index) a ON a.f_slot = c.f_slot AND a.f_committee_index = c.f_index

If we just want to know how many validators attested from that committee and slot, we’d change the get_indices call to:

cardinality(get_indices(a.f_aggregation_bits, c.f_committee))

But we really only want to know how many of our validators attested so we need to find the intersection of our validators and the validators that attested.  That part of the select then becomes:

icount(get_indices(a.f_aggregation_bits, c.f_committee)::int[] & '{$perfValidators}'::int[])

Since we’re ultimately wanting just the total number of attestations included from our validators we can aggregate all those rows with the SUM function and our query becomes:

SELECT SUM(icount(get_indices(a.f_aggregation_bits, c.f_committee)::int[] & '{$perfValidators}'::int[]))
FROM t_beacon_committees c
JOIN (SELECT x.f_slot, x.f_committee_index, bit_or(RIGHT(x.f_aggregation_bits::text, -1)::varbit) AS f_aggregation_bits
FROM t_attestations x
GROUP BY x.f_slot, x.f_committee_index) a ON a.f_slot = c.f_slot AND a.f_committee_index = c.f_index

But that’s only have the story (fortunately its the more complicated half) – we also want to know how many attestations we should have produced. That means counting the number of times our validators appear in committees:

SELECT SUM(icount(c.f_committee::int[] & '{$perfValidators}'))
FROM t_beacon_committees c

It’s tempting to just use a COUNT(*) but that will find the number of committees any validator is in, but undercount the number of expected attestations if two of our validators are in the same committee. So we have to apply the same trick as we did for counting the number of attestations – find the intersection and then count.

We can filter the number of rows we have to inspect by only including committees containing our validators with:

WHERE c.f_committee && '{$perfValidators}'

So putting it all together we can get the percentage of included attestations in the current time range with:

SELECT SUM(icount(get_indices(a.f_aggregation_bits, c.f_committee)::int[] & '{$perfValidators}'::int[]))::double precision / SUM(icount(c.f_committee::int[] & '{$perfValidators}'))
FROM t_beacon_committees c
JOIN (SELECT x.f_slot, x.f_committee_index, bit_or(RIGHT(x.f_aggregation_bits::text, -1)::varbit) AS f_aggregation_bits
FROM t_attestations x
GROUP BY x.f_slot, x.f_committee_index) a ON a.f_slot = c.f_slot AND a.f_committee_index = c.f_index
WHERE c.f_slot >= ($__unixEpochFrom() - $genesisTime) / 12 AND c.f_slot <= ($__unixEpochTo() - $genesisTime) / 12
AND c.f_committee && '{$perfValidators}';

Which is up there with the least comprehensible SQL I’ve ever written.  But it works and it says Teku is awesome.

But why stop there? What if we wanted to know the number of correct attestations?

Remember way back up when we were first aggregating attestations? If we just filtered the attestations to the ones which have the right beacon block root, our aggregate will only count validators that attested to the correct head. A straight join to t_blocks will do the trick:

SELECT SUM(icount(get_indices(a.f_aggregation_bits, c.f_committee)::int[] & '{$perfValidators}'::int[]))::double precision / SUM(icount(c.f_committee::int[] & '{$perfValidators}'))
FROM t_beacon_committees c
JOIN (SELECT x.f_slot, x.f_committee_index, bit_or(RIGHT(x.f_aggregation_bits::text, -1)::varbit) AS f_aggregation_bits
FROM t_attestations x
JOIN t_blocks b ON x.f_beacon_block_root = b.f_root AND b.f_slot = (SELECT MAX(b2.f_slot) FROM t_blocks b2 WHERE b2.f_slot <= x.f_slot)
WHERE x.f_slot >= ($__unixEpochFrom() - $genesisTime) / 12 AND x.f_slot <= ($__unixEpochTo() - $genesisTime) / 12
GROUP BY x.f_slot, x.f_committee_index) a ON a.f_slot = c.f_slot AND a.f_committee_index = c.f_index
WHERE c.f_slot >= ($__unixEpochFrom() - $genesisTime) / 12 AND c.f_slot <= ($__unixEpochTo() - $genesisTime) / 12
AND c.f_committee && '{$perfValidators}';

It’s tempting to just test that the block root and slot matches, but then attestations that correctly pointed to empty slots wouldn’t be included. So we have to check that the block root is the last one at or before the attestation slot.

Or would could change that join to check if the target root (it points to the block root at the first slot of the epoch):

JOIN t_blocks b ON x.f_target_root = b.f_root AND b.f_slot = (SELECT MAX(b2.f_slot) FROM t_blocks b2 WHERE b2.f_slot <= x.f_target_epoch * 32)

And hey look, we made that incomprehensible SQL even worse!

This may not be the best use of chaind but it’s fun and the snippets of SQL here can be remixed and combined to do quite a lot of things as part of what’s investigating what happened with a chain. I suspect it would be a lot easier to have a denormalised table that recorded the first time a validator’s attestation was included to save all the bit twiddling and array intersections. It may come at a fairly large cost of disk space though.

Creating validator password files

Teku requires a password file for each individual keystore, with the same relative path as the keystore but a .txtextension.  If all the keystores have the same password, you can copy a single password.txtfile for all the keystores with the command below:

for keyFile in keys/*; do cp password.txt "secrets/$(echo "$keyFile" | sed -e 's/keys\/\(.*\)\.json/\1/').txt"; done

Note that the keys directory is specified twice in that command so if the keys are in a different directory name, make sure to update both places.

Easy SSH Access to EC2 Hosts

We run a bunch of hosts on EC2 and while most just have the default DNS name, they all have a Name tag to identify their purpose. While there’s lots of automation setup via ansible, it is often still necessary to SSH directly into a particular box, especially while debugging issues.

I find it annoying to have to go log into the AWS console just to find the DNS name of the particular box I want so I’ve written a little script that searches for hosts based on the name tag and can then SSH into them.

It requires having the aws  utility setup and able to login, plus having jq installed.

#!/bin/bash
set -euo pipefail
HOST_NAME=${1:-}
RESPONSE=`aws ec2 describe-instances`
INSTANCES=`jq "[.Reservations[].Instances[] 
    | select(.State.Code == 16) 
    | select(.PublicDnsName != \"\") 
    | select((.Tags[] | select(.Key == \"Name\")).Value | test(\"$HOST_NAME\"))] 
    | sort_by((.Tags[] | select(.Key == \"Name\")).Value)" <<< "$RESPONSE"`
HOSTS=(`jq -r '.[].PublicDnsName' <<< "$INSTANCES"`)
HOST_COUNT=${#HOSTS[@]}
if (( $HOST_COUNT == 0 ))
then
 echo "No matching hosts"
 exit 1
elif (( $HOST_COUNT > 1 ))
then
 NAMES=(`jq -r '.[].Tags[] | select(.Key == "Name").Value' <<< "$INSTANCES"`)
 echo "Multiple matching hosts found: "
 NUM=1
 for NAME in ${NAMES[@]}
 do
    HOSTNAME=${HOSTS[$((NUM - 1))]}
    echo "$NUM: $NAME  ($HOSTNAME)"
    NUM=$((NUM + 1))
 done
 read -p "Enter the number of the host you meant: " -r SELECTION
 SELECTED_HOST=${HOSTS[$((SELECTION - 1))]}
 ssh $SELECTED_HOST
else 
 ssh ${HOSTS[0]}
fi

Run with awssh <name> where <name> is a regex that matches any Name tag.  Typically a substring match is simplest.  So to SSH to the Medalla testnet bootnode we run I’d just use awssh medalla-bootnode.

If more than one node matches it provides a list of the matching nodes, so to select one of the Medalla nodes I’d run awssh medalla then select from the resulting list.