Redundancy and Reliability Are Not One and the Same

It has long been ingrained in engineering, particularly in networking, that redundancy is the key to high availability and reliability. In today’s world, however, increasingly complex systems can have the opposite effect. Over-engineered networks with multiple redundant layers, intended to reduce instances of network failure, have been shown to reduce overall reliability by becoming more prone to faults of their own. As systems grow, there are more network elements that can break or fail, and the extra maintenance invites human error. So what is the right level?

The notion of redundancy decreasing reliability isn’t new. In his book Normal Accidents, an examination of technology in society, Charles Perrow suggested that high levels of redundancy can backfire and impact system integrity. With this in mind, the equation is not as simple as more redundancy equals higher reliability.

Adding more levels of redundancy to a system will undoubtedly offer continued availability in the case of component or even system failure, but as the system becomes more sophisticated, complexity rises dramatically. As more elements are added to increase redundancy, multiple protocols need to be employed so that components can recognize a failure and command the backup to take over. It must then be decided whether a drop in performance is severe enough for redundancy to come into play or whether a component needs to fail completely. With every such decision, the networking protocols become considerably more sophisticated.
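To make the failover decision concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the LinkHealth fields, the should_fail_over function and the threshold values are hypothetical and do not come from any particular protocol or product. It simply shows how even a basic rule needs one path for hard failure and another for degraded performance.

```python
from dataclasses import dataclass


@dataclass
class LinkHealth:
    """One health sample for the primary link (hypothetical fields)."""
    is_up: bool
    latency_ms: float
    packet_loss_pct: float


def should_fail_over(samples, max_latency_ms=200.0, max_loss_pct=5.0,
                     bad_samples_needed=3):
    """Return True if the backup should take over from the primary.

    Assumed policy, for illustration only:
    - a hard failure (link down) in the latest sample triggers failover at once;
    - degraded performance triggers failover only after
      `bad_samples_needed` consecutive bad samples, so that a brief
      spike does not cause the links to flap.
    """
    if not samples:
        return False  # no data yet, stay on the primary

    if not samples[-1].is_up:
        return True  # complete failure: switch immediately

    recent = samples[-bad_samples_needed:]
    if len(recent) < bad_samples_needed:
        return False  # not enough history to judge degradation

    return all(
        s.latency_ms > max_latency_ms or s.packet_loss_pct > max_loss_pct
        for s in recent
    )
```

Even this toy rule has three tunable “break points” (latency, loss and the number of bad samples). Each extra layer of redundancy brings more of them to choose, test and maintain.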

So the cost incurred by increasing the layers of redundancy is not limited to adding more components. Adding redundancy protocols and deciding on “break points” mean more complex network engineering and implementations, and at the end of the day, these levels of sophistication and complexity can’t always guarantee reliability.

100% redundancy will, if implemented correctly, make a system highly reliable, but the cost, as network complexity and sophistication rise steeply, can be prohibitive for many businesses. Equally, too much complexity can mean more opportunity for faults, so keeping the design simple may have the same positive effect on reliability. To figure out the best balance for your business, examining multiple factors like cost, capacity resourcing and business needs is imperative. If you’re unsure of the optimal redundancy strategy, work with the experts to decide on the point at which the law of diminishing returns kicks in: when do extra components become more effort than the availability they deliver is worth?
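The diminishing-returns point can be illustrated with a toy availability model in Python. The model is an assumption made for this sketch, not a measured result: components are taken to fail independently, each with availability 0.99, and every added redundant component is assumed to bring failover logic that itself works only 99.9% of the time. The function names and figures are hypothetical.

```python
def system_availability(n, component_avail=0.99, failover_avail=0.999):
    """Estimated availability of n redundant components (toy model).

    Assumptions, for illustration only:
    - components fail independently, so the ideal availability is
      1 - (1 - component_avail) ** n;
    - each component beyond the first adds one failover mechanism
      that must also work, contributing a factor of failover_avail.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ideal = 1.0 - (1.0 - component_avail) ** n
    return ideal * failover_avail ** (n - 1)


def best_redundancy_level(max_n, component_avail=0.99, failover_avail=0.999):
    """Return the n in 1..max_n with the highest estimated availability."""
    return max(
        range(1, max_n + 1),
        key=lambda n: system_availability(n, component_avail, failover_avail),
    )


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, round(system_availability(n), 6))
    print("best level:", best_redundancy_level(5))
```

Under these assumed figures, a second component improves availability, while a third makes it slightly worse, because the added failover logic costs more than the extra component gains. Real numbers will differ, but the shape of the trade-off is the one described above.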