Taraxa Consensus (5/5): True Finality for Block DAG via an Asynchronous & Lightweight PBFT

Steven Pu • 8 min read • Jul 7, 2019

Taraxa Consensus (5/5): True Finality for Block DAG via an Asynchronous & Lightweight PBFT

In one of our previous articles, we defined how our Block DAG is structured and ordered, but we also raised the question that ordering can still shift over time. In this article, we present Taraxa’s method of introducing true finality into Block DAG ordering and all the associated consequences.

Importance of Finality

Most blockchain topologies today have probabilistic finality, in that you are never 100% sure whether or not the transactions are truly set in stone. For example, in BTC, you get an exponential falloff in the probability that an attacker will be able to catch up to the rest of the network and reorganize the blocks as time (or in the world of BTC, # of blocks) goes forward. This exponential risk falloff gave rise to the "6 block" rule of thumb whereby once you’ve observed 5 blocks built on top of the block containing your transaction, making your transaction "6 blocks deep", then it is statistically highly unlikely that your transaction would be part of some reordering attack.

In many cases, having a probabilistic finality is fine. But in cases where you’re executing a large number of transactions that are dependent on previous outcomes, or if you’re executing one extremely valuable transaction where you need to be absolutely sure that the transaction won’t be reversed, then finality becomes critically important.

In its native state, the ordering mechanism in Taraxa’s Block DAG is also one that provides probabilistic finality. To achieve true finality, we’ll need something extra.

What does Finality Mean in Taraxa?

Recall that reordering risk from our earlier ordering mechanism is primarily incurred when the anchor chain changes.

Re-organization happens when the Anchor Chain changes
Re-organization happens when the Anchor Chain changes

So as long as we can make sure that, periodically, an anchor chain can be nailed down, then the order for the blocks along that anchor chain will have finalized ordering no matter what happens.

How do we do that? The network will periodically take a vote, in parallel with the Block DAG’s construction, to place infinite weight on a specific block near the frontier of the DAG.

When a block has been given infinite weight, that means all the blocks it points to directly and indirectly via the ghost pointer now all have infinite weight, meaning it’s now impossible to override the ordering with an attack.

True Finality means anchor blocks with infinite weight
True Finality means anchor blocks with infinite weight

In the figure above, we selected the orange block to have infinite weight, and we can see that infinity has carried through backwards into the Block DAG to give infinite weights to all the blocks (marked in dark gray) along the anchor chain. Now we have effectively finalized the ordering of this anchor chain and all the blocks in its associated epochs.

So how do we select which block to give infinite weight to in the first place?

Period Block Selection via PBFT-like Algorithm

To select a block inside the DAG to finalize, we leverage a process that’s similar to the Practical Byzantine Fault Tolerance (PBFT) algorithm. Because PBFT is a well-studied and widely-deployed algorithm, we won’t spent too much time describing every detail and potential faults that could occur in this article, just the broad strokes.

There are roughly four (4) steps in our PBFT period block finalization process, as described in the highly simplified diagram.

Simplified illustration of Taraxa's PBFT finalization process
Simplified illustration of Taraxa's PBFT finalization process

Let’s go through this one by one. In each phase, a node …

1. Proposes New Block

  • Computes his eligibility via VRF (SK, previous_PBFT_block_hash, current_vote_type, current_round_number, current_step_number) = (e, π), where e is the eligibility value, and π is the proof of correct VRF calculation
  • Decides if e < threshold then he is eligible to propose a PBFT block this round
  • Picks a DAG block candidate to finalize, somewhere near the frontier but not at the frontier, this is the current Period block candidate Pt
  • Constructs a Period (more on this later in this article) between the Pt and P(t-1), find all the blocks included within the Period
  • Constructs a Concurrent Schedule, CS
  • Constructs a PBFT block candidate (Pc) that includes (Pt, CS), among other information (e.g., timestamp, signature, beneficiary)
  • Computes the hash of Pc
  • Broadcasts hash(Pc), Pc, along with his eligibility proof (e, π) to his peers

2. Votes on Leader

  • Computes his eligibility via VRF again, with current_round_number incremented by 1 and the current vote type, producing another (e, π)
  • Decides if e < threshold, then he is eligible to participate in this round
  • Waits for 2λ, where λ is the network diameter — the shortest distance (in time) between the two most distant nodes in the network
  • Computes the lowest e observed for which π is also correct, the creator if the lowest e is the "leader" — this node is the candidate to propose the next PBFT block
  • Broadcasts his vote for the hash(Pc) that corresponds to the lowest e to be the leader, along with his eligibility proof (e, π) to his peers

3. Votes on Block

  • Computes his eligibility via VRF again, with current_round_number incremented by 1 and the current vote type, producing another (e, π)
  • Decides if e < threshold, then he is eligible to participate in this round
  • Waits for 2λ
  • Computes if he has received 2T+1 votes for any given e_min, where T is 1/3 of the eligible voting nodes for the last round
  • Polls peers for the Pc that corresponds to the e_min and the associated hash(Pc) if he does not yet have the PBFT block
  • Validates Pc to be correctly constructed
  • Broadcasts his vote for Pc, along with his eligibility proof (e, π) to his peers

4. Votes to Move Forward

  • Computes his eligibility via VRF again, with current_round_number incremented by 1 and the current vote type, producing another (e, π)
  • Decides if e < threshold, then he is eligible to participate in this round
  • Waits for 2λ
  • Computes if he has received 2T+1 votes for any given Pc
  • Validates the winning Pc to be correctly constructed
  • Computes the newly validated Pc and commits results to permanent storage
  • Broadcasts his vote to move forward to proposing the next PBFT block, along with his eligibility proof (e, π) to his peers

A Few More Words on Taraxa’s PBFT

That was an extremely simplified description

That was an extremely simplified description of what happens in our PBFT process. It is simplified because we haven’t described all the ways things could go wrong, e.g., no node computed an e that was below the threshold, votes do not reach 2T+1 quorum, large numbers of nodes crashing during the middle of a round, etc.

This PBFT process is highly secure and scalable

Note that every time a node tries to speak, it computes a VRF eligibility value to make sure it is allowed to speak during that round. The eligibility threshold is set and dynamically adjusted to ensure that 2 things,

  • The nodes that participate in each round is random and likely different, meaning that once an attacker observes a specific node is a participant and marks him for attack, it’s likely that during the next round he is no longer eligible. This is in contrast to many other algorithms that keep the participant (or committee member) around for an extended period of time, making them prime targets for targeted attack.
  • Only a subset of eligible nodes participates in any given round, making this PBFT process highly scalable. This means that even as the network size scales up and the number of eligible participants increase, the number of actual participants (committee size) of these PBFT rounds can easily be set up to scale sub-linearly vs. the network size. Smaller number of participants means faster voting processes.

Combine randomly selected participants and sub-linearly increasing committee size, we get a highly secure and scalable PBFT process.

The Parallel PBFT Chain

Taraxa’s PBFT process creates a linear chain of PBFT blocks alongside the existing Block DAG.

Asynchronous & Parallel PBFT Chain
Asynchronous & Parallel PBFT Chain

Each PBFT block’s primary purpose is to finalize a DAG block into a Period block. The PBFT process confirms a single block inside the Block DAG. Hence, it is not used, like in most other networks that leverage the PBFT process, as the primary consensus algorithm which gates the progress of the entire blockchain. This is why in Taraxa the PBFT process happens in parallel and largely asynchronously vs. the Block DAG construction process.

Each time a new DAG block is finalized, we create a finalized anchor chain and an associated set of blocks along that anchor chain (via epochs along each anchor block). This entire set of blocks is called a Period, which can be thought of a snapshot of a cluster of blocks that have achieved finalized ordering.

Each Period contains many DAG blocks, which leads us to the other task of the PBFT block, to define the order in which transactions are to be computed, via the concurrent schedule.

This concludes our five-article arc describing Taraxa’s core consensus, but we are working on many more interesting technologies beyond just consensus, and we’ll share them as our research and implementation progresses.

Stay tuned!