by Michael Baum

Higher Availability Future: Autonomic Computing or Recovery Oriented Computing?

analysis
Feb 18, 2006
2 mins

It is fascinating to me that so many smart people can disagree on the best future approach to higher availability infrastructure. The autonomic computing crowd, led by IBM, is touting self-healing and self-regulating computing systems. On the other hand, the recovery oriented computing (ROC) folks, led by researchers at Berkeley and Stanford, declare that failures are inevitable. ROC proposes that the key to higher availability is helping humans recover infrastructure from failures faster.

I have written here previously about ROC, but it's time to start a dialog comparing and contrasting these two radically different views on the future of infrastructure availability.

You will notice I am talking about infrastructure availability, not individual system availability. As an industry we have focused for decades on building more reliable individual components and systems. But now the reliability problem has moved to a different level. Take all these highly reliable components and systems, put them together with software developed by multiple vendors or adopted from different open source projects, and the reality of complex systems sets in.

Can we build autonomic computing infrastructure that is self-healing and self-regulating beyond simple problems and single systems? Or will humans always be an important part of repairing and recovering IT infrastructure?

Our friends from Berkeley and Stanford offer an interesting perspective dubbed the Ironies of Automation. Their argument goes something like this:

Automation does not remove human influence; instead it reduces IT personnel's understanding and can actually make their jobs harder. Automation increases complexity, reduces visibility and provides no day-to-day interaction and learning. ROC argues for better tools to help people, not replace them.

So what do you think? Autonomic Computing or Recovery Oriented Computing? Which will lead us to higher availability infrastructure? Send me your vote at thebaum@splunk.com.