Thursday, March 13, 2008

Updated - ITM 6.1 Success Story - 99.998% availability

One of my customers, who shall remain nameless for now, has a support contract with Gulf Breeze Software to maintain and monitor their Tivoli Infrastructure. This encompasses everything from User Admin, TDW reporting, remote monitoring of the ITM 6.1 and TEC infrastructure, custom situations and patch deployment.

We use a site-to-site VPN from our data center to maintain a constant network connection that allows remote monitoring and administration of the ITM and TEC infrastructure.

I am excited because over the last 3 months we have achieved 99.998% uptime/availability of the monitoring infrastructure. That works out to roughly 51 seconds of unplanned outage per 30-day month, or a little over two and a half minutes across the full 3 months. People always want to know, what is availability?

Here is my definition - ITM 6.1 is available IF:

1) I can login to the TEP
2) I can view real-time data from an agent on a remote TEMS
3) I can view historical data from the TDW
4) TEC has processed a heartbeat event in a specified time frame
5) I can successfully access the Universal Message Console on the HUB TEMS

If I can do all of these - then ITM is ready for business.
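The definition above is an all-or-nothing test, which can be sketched in a few lines of Python. This is only an illustration, not the tooling we actually run: the probe names and the lambda placeholders are hypothetical, and each one would need to be replaced with a real check against your own TEP, TEMS, TDW and TEC.

```python
from typing import Callable, Dict, List, Tuple

# A probe is any zero-argument callable that returns True when its check passes.
Check = Callable[[], bool]


def evaluate_availability(checks: Dict[str, Check]) -> Tuple[bool, List[str]]:
    """Return (available, failed_check_names).

    ITM counts as available only if every probe passes. A probe that
    raises an exception is treated as a failed check.
    """
    failed: List[str] = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return (not failed, failed)


# Hypothetical placeholder probes, one per item in the definition.
# Here the TEC heartbeat probe is made to fail for demonstration.
example_checks: Dict[str, Check] = {
    "tep_login": lambda: True,
    "realtime_data_remote_tems": lambda: True,
    "historical_data_tdw": lambda: True,
    "tec_heartbeat_recent": lambda: False,
    "universal_message_console": lambda: True,
}

available, failed = evaluate_availability(example_checks)
print("available:", available, "failed:", failed)
```

Returning the names of the failed checks along with the overall result makes it easier to see which part of the infrastructure caused an outage to be counted.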

While ITM 6.1 has some issues to overcome, the overall code is proving to be stable. Most of the procedures we use are documented on this web site in the blog; however, every situation (no pun intended) has its own issues and requires experienced people to implement a solution correctly and quickly.

Items such as the TEC Heartbeat, TDW Last Write UA and the SOAP server are all actively being used to achieve our high availability numbers.
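To put an availability percentage in concrete terms, the downtime it allows can be worked out from the length of the measurement window. The short sketch below assumes 30-day months and a hypothetical helper name, `downtime_seconds`; it is plain arithmetic, not anything provided by ITM.

```python
def downtime_seconds(availability_pct: float, days: float) -> float:
    """Seconds of downtime implied by an availability percentage over a window."""
    total_seconds = days * 24 * 60 * 60
    return total_seconds * (1 - availability_pct / 100)


# Assumption: a month is taken as 30 days.
one_month = downtime_seconds(99.998, 30)     # about 51.8 seconds
three_months = downtime_seconds(99.998, 90)  # about 155.5 seconds

print(round(one_month, 1), round(three_months, 1))
```

At 99.998%, the budget is roughly 52 seconds for a single 30-day month and roughly 156 seconds for a 90-day quarter, so the window length matters when quoting a downtime figure.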

