1. Failures


<aside> 💡 Fail-Stop failures: falure ⇒ stop computer

</aside>

Replication can’t solve problems like:

And may solve problems like:

2. Challenge


  1. Has primary actually failed?
  2. How do we keep primary / backup in sync
    1. Apply all changes in the right order
    2. Deal with non-determinism
  3. Fail over

3. Two Approaches


  1. State transfer ⇒ Send snapshots to the backup
  2. Replicated State Machine ⇒ Only send operations to the backup

<aside> 💡 Level of operations to replicate