Tuesday, February 09, 2010

Problem Management within IT operations

IT operations is under considerable pressure to drive down costs, innovate and provide better services to its customers. Problem management is a key cog in meeting these objectives. Here are my thoughts on how organizations can adopt problem management within operations.

Problem management can be seen as a way to improve the characteristics of services in the long term for IT operations. Taking the view of ITIL v3, and the 360 view of operations, problem management can help improve services along 2 dimensions:

  • service warranty >> how do capacity, BCP and security impact failures, and how can these risks be mitigated
  • service utility >> how are architectural decisions in the infrastructure, application and functional design of applications creating issues or problems for sustainable, reliable IT operations per service

With this in view, problem management can initiate root cause analysis using:
  1. Reactive failure mode and effects analysis (FMEA) of incidents
  2. Incident and request trends
  3. Proactive performance analytics of key services using end user monitoring and system component performance monitoring
Root cause analysis is then followed by actions through either:
  1. Change requests (CRs) that modify system utility and warranty characteristics
  2. Service improvement plan (SIP) initiatives that help improve the non-delivery-based capabilities of operations
From this perspective we also see that problem management is closely linked to the service design processes and continual service improvement (CSI).
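
As an illustration of the incident trend input to root cause analysis, here is a minimal sketch of how recurring incidents could be flagged as problem candidates. The `Incident` record, its `service` and `category` fields, and the threshold of 3 are all assumptions made for the example; a real ticketing tool will expose different data.

```python
from collections import Counter
from typing import NamedTuple


class Incident(NamedTuple):
    # Hypothetical record shape; real ticketing tools expose different fields.
    service: str
    category: str


def problem_candidates(incidents, min_count=3):
    """Return ((service, category), count) pairs that recur at least min_count times."""
    counts = Counter((i.service, i.category) for i in incidents)
    recurring = [(key, n) for key, n in counts.items() if n >= min_count]
    # Most frequent first; ties are ordered alphabetically for stable output.
    return sorted(recurring, key=lambda item: (-item[1], item[0]))
```

Each pair returned is only a candidate for a problem record. Whether it deserves a full root cause analysis still depends on the impact of the service concerned.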

What is critical to the success of problem management is that operations maintains high visibility and transparency into its work, ensures proper chunking and packaging of solutions to initiate continuous improvement and change within IT operations, provides the necessary tooling infrastructure to help diagnose infrastructure problems on an ongoing basis and finally, and most importantly, ensures that problem management is a common shared interest that is driven bottom-up by IT operations personnel across the board.

The management aspects of problem management that are indicators of success are in-process knowledge management tools, lean-based throughput accounting of problem reporting to management and, finally, a single queue of requests through which all problems, errors and changes are managed. The single queue also helps provide higher visibility to the entire people base.
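
To make the single queue and throughput reporting concrete, here is a rough sketch. It reads "throughput accounting" loosely as counting the items closed in a reporting period, together with work in progress and average lead time. The `WorkItem` structure and the `queue_report` function are hypothetical names introduced for this example, not part of ITIL or any particular tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class WorkItem:
    # Hypothetical single-queue entry covering problems, errors and changes.
    kind: str  # "problem", "error" or "change"
    opened: date
    closed: Optional[date] = None  # None while the item is still open


def queue_report(items, period_start, period_end):
    """Summarise one shared queue for a reporting period (dates inclusive)."""
    closed = [
        i for i in items
        if i.closed is not None and period_start <= i.closed <= period_end
    ]
    # Items opened by the end of the period and not yet closed at that point.
    work_in_progress = sum(
        1 for i in items
        if i.opened <= period_end and (i.closed is None or i.closed > period_end)
    )
    lead_times = [(i.closed - i.opened).days for i in closed]
    avg_lead_time = sum(lead_times) / len(lead_times) if lead_times else None
    return {
        "throughput": len(closed),
        "work_in_progress": work_in_progress,
        "avg_lead_time_days": avg_lead_time,
    }
```

Because every problem, error and change sits in the same list, a single report like this gives management and operations staff the same picture of the work.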

As has been niftily described in the following article, shit happens: problem management cannot eliminate the need for incident management, but good problem management can remove the unnecessary overhead and help IT resources focus on the bigger and more exciting issues facing IT today.