Decades ago, when we were all computing on mainframes, the application stack was pretty simple. Programs all ran in core memory on the same machine as the operating system and the data store. There was one place to look for problems: the system transaction log.

Fast forward to the year 2005 and our environments are distributed architectures based on a multitude of open source and proprietary components. Software is hard. Distributed software is hard to the power of n. Each application requires many tiers of technology. It’s not uncommon for a mission-critical application to require a hundred or more physical machines and devices. Analyst firms like IDC have sized the total spend on managing all this datacenter complexity at upwards of $100B a year.

I was reading through another InfoWorld blog today about the Cultural roadblocks to SOA development, and I was surprised the survey respondents didn’t call out troubleshooting concerns as a roadblock to SOA adoption. Everyone is really excited about the flexibility and reusability afforded by loosely coupling software components and reusing the best stuff rather than building the same functionality over and over again. But does anyone really have a handle on what fixing problems in SOA environments will look like, or any sense of how service-oriented architectures will affect the mean time to repair in a typical organization? I’m not suggesting that we throw our hands up in the air and give up on SOA because we’re scared of how hard it will be to find and fix problems in a more complex SOA world.
But think about all the extra places we’ll need to look – WS-* transaction logs, UDDI repositories and, yes, all those JMS events – to try to figure out where our orchestration is busted.

Do you have a SOA or Web Services troubleshooting story to share? Become part of the discussion on this important topic and write me at thebaum@splunk.com.