Intro

In Distributed Systems, achieving consensus among nodes is important to make the system reliable. The goal of such an algorithm is to, typically, manage a replicated log across servers such that the system is consistent with itself. For example, servers should not disagree on the value of X, because they learned the value of X through the replicated log.

Examples

  1. Raft
  2. Paxos

How is this achieved?

A big credit to distributed consensus algorithms come from the various Reliable Broadcast like Total Order Broadcast

Important factors in Distributed Consensus

Consistency & Reliability

Distributed systems should ensure that all nodes within a given system agree on a common state or decision. This should prevent situations in which one variable has different values across machines.

Fault Tolerance

No matter what machine fails, distributed consensus algorithms should be able to maintain fault tolerance. This includes if a leader crashes.