In theoretical computer science, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees: Consistency: Every read receives the most recent write or an error. Here is a list of coding interview questions on CAP Theorem to help you get ready for your next data structures interview in 2021.
What Is CAP Theorem?
The CAP Theorem for distributed computing was published by Eric Brewer. This states that it is not possible for a distributed computer system to simultaneously provide all three of the following guarantees:
- Consistency (all nodes see the same data even at the same time with concurrent updates )
- Availability (a guarantee that every request receives a response about whether it was successful or failed)
- Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)
The CAP acronym corresponds to these three guarantees. This theorem has created the base for modern distributed computing approaches. Worlds most high volume traffic companies (e.g. Amazon, Google, Facebook) use this as basis for deciding their application architecture. It's important to understand that only two of these three conditions can be guaranteed to be met by a system.
Why is CAP Theorem true?
It's proofed by construction. Basically we demonstrate a single situation where a system cannot be consistent and available in the same time:
If a client writes to one side of a partition, any reads that go to the other side of that partition can't possibly know about the most recent write. Now you're faced with a choice: do you respond to the reads with potentially stale information, or do you wait (potentially forever) to hear from the other side of the partition and compromise availability?
What does the CAP Theorem actually say?
The CAP Theorem says that it is impossible to build an implementation of read-write storage/system in an asynchronous network that satisfies all of the following three properties:
- Availability - will a request made to the data store always eventually complete?
- Consistency - will all executions of reads and writes seen by all nodes be atomic or linearizably consistent?
- Partition tolerance - the network is allowed to drop any messages.
More informally, the CAP theorem tells us that we can't build a database/system that both responds to every request and returns the results that you would expect every time.
Can you 'got around' or 'beat' the CAP Theorem?
No. You might have designed a system that is not heavily affected by it. That's good.
What is a Partition in CAP Theorem?
One such fallacy of distributed computing is that networks are reliable. They aren't. Networks and parts of networks go down frequently and unexpectedly.
A partition is when the network fails to deliver some messages to one or more nodes by losing them (not by delaying them - eventual delivery is not a partition).
The basic idea of CAP proof is that if a client writes to one side of a partition (namely, network fails), any reads that go to the other side of that partition can't possibly know about the most recent write. The proof of CAP relied on a total partition. In practice, these are arguably the most likely since all messages may flow through one component; if that fails then message loss is usually total between two nodes.