"Dealing with failures...is our standard mode of operation"
Nick Carr discusses Amazon's Dynamo system, "used to support many of the most critical elements of Amazon's operation including shopping-cart processing". His focus is a paper by Amazon's CTO, Werner Vogels, and a number of coauthors, titled Dynamo: Amazon’s Highly Available Key-value Store.
I mention it here because of this wonderful quote, which appears in the introduction on page 1:
Dealing with failures in an infrastructure comprised of millions of components is our standard mode of operation; there are always a small but significant number of server and network components that are failing at any given time. As such Amazon’s software systems need to be constructed in a manner that treats failure handling as the normal case without impacting availability or performance.
Dealing with failures as a standard mode of operation.
That's something worth thinking about.