Friday, March 14, 2008

Netcool overview for Tivoli folks

The Netcool products are definitely upon us, and I just wanted to write a short description of some of the different products and how they fit in, from a traditional Tivoli perspective.

Tivoli has stated that the future of event management will be Netcool/Omnibus, and TBSM 4.1 *IS* Netcool/RAD, along with some additional cool integration pieces, so I'm going to focus on those products and their prerequisites.

Netcool/Omnibus is used for event management. It consists primarily of an ObjectServer and a database (Postgres by default). The ObjectServer receives events and performs functions similar to those provided by tec_server, tec_reception, tec_task, tec_rule and tec_dispatch. Specifically, it receives events, processes them according to configurable/definable rules, and writes them to the database. The rules you define can also perform automation - sending email, instant messages, running programs, etc.
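To give a feel for what those rules look like, here is a minimal sketch of an ObjectServer automation (trigger), loosely modeled on the stock deduplication trigger that ships with Omnibus. Treat the exact syntax as approximate and compare it against the triggers installed with your Omnibus version:

create or replace trigger dedup_example
group default_triggers
priority 1
comment 'Example: fold a duplicate event into the existing row instead of inserting a second one'
before reinsert on alerts.status
for each row
begin
    set old.Tally = old.Tally + 1;
    set old.LastOccurrence = new.LastOccurrence;
end;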

The Omnibus database itself is quite a bit nicer (in my opinion) than the TEC database. There is essentially ONE table that contains your event information: alerts.status (there are a couple of others, alerts.details and alerts.journal, that may contain information about events, but alerts.status is the primary one). All of an event's slots map to columns in this table, and if you define an event that needs more slots/attributes, you need to modify this table. That makes it a little less flexible than TEC's TEC_T_SLOTS table, but it's a good tradeoff in my mind (to this day I haven't been able to find a single SQL query that will show me all of the slots for all of the events with a particular value in the 'hostname' slot, for example).
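For example, since the standard Node column in alerts.status holds the originating host, a single query like the one below returns every slot of every event for that host - exactly the kind of query I could never write against TEC_T_SLOTS (the host name here is made up):

select * from alerts.status where Node = 'myserver01';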

The user interface for Omnibus itself is about as basic as the TEC event viewer. But because you normally will use other products along with Omnibus, most users won't actually see the Omnibus interface - they will use something nicer (like TBSM or Impact) as a front-end.

Defining the automation for Omnibus (using the Netcool/Omnibus Administrator tool, 'nco_config') should be familiar to all TEC administrators out there: you have to write code to perform any automated actions you want. The product doesn't have any wizard interfaces, but Netcool/Impact DOES (more on this in a bit).

TBSM (Tivoli Business Service Manager) 4.1 sits primarily on top of the Omnibus database, though it can take feeds from a large number of different sources. Whereas previous versions of TBSM required that you send specially-formatted events to TBSM, this version accomplishes the same thing in a much more straightforward manner: by reading the database. As an administrator, you define your business service hierarchy, and in doing so you define which events affect the status of each service. You define these filters through the web-based interface, which is based on Netcool/Webtop, which in turn is based on the Netcool/GUI Foundation.

TBSM 4.1 also includes some functionality that was not in RAD 3.x. Specifically, there is a new TEC EIF Probe, which allows the included Omnibus ObjectServer to look EXACTLY like a TEC server. This means that you can point existing TEC adapters to your Omnibus host as the EventServer. This piece also allows you to perform attribute mapping so that your events come in correctly.

Another new feature in TBSM 4.1 is that it can import Discovery Library Adapter (DLA) books that are created by other products. Most notably, it accepts the output from the ITM 6.x DLA, and even has rules built-in to handle events from ITM. Here's what makes this so cool:

- You can generate the book in your ITM environment. This book (a file containing IDML [an XML language] data) contains information about all of the agents you've deployed in your environment.
- You then have all of your agents visible within TBSM, and they can be included in any services you define.
- If you point your TEMS to send events to the Omnibus ObjectServer that TBSM is monitoring, the systems you imported from ITM 6.x will turn yellow or red as situation events arrive for them.

TBSM 4.1 ALSO has tight integration with the CDT portion of CCMDB (aka TADDM). You can pull information from TADDM using DLA books OR using direct API access. This type of integration allows you to view CI status changes directly in TBSM. Additionally, you can launch the TEP or the TADDM GUI in-context directly from TBSM.

This level of out-of-the-box integration is what a lot of us have been hoping for for a long time. Additionally, TEC event synchronization capabilities are easily configured.

If you can't tell, I REALLY like this newest version, TBSM 4.1. It doesn't have nearly the complexity of earlier versions, AND it leverages your event repository (the Omnibus database) directly. Additionally, it ships with robust integration with ITM and TEC, which will make the transition off of TEC very easy for the vast majority of customers. Even if you're using TEC for complex processing, it shouldn't take too much effort to fit Omnibus into your event management structure.

TBSM 4.1 also has an SLA (Service Level Agreement) component that can be used to track and manage your SLAs. Tivoli is still selling the TSLA product separately, and I believe they will keep offering that product, so hopefully they will soon come out with a statement of direction in this area.

With TBSM, you also get Netcool/Impact for no additional fee. *This* is the product that many of you have seen demonstrated defining event management rules and automation just by clicking and dragging. That is accomplished through the included Wizards, which will guide you through many common automation tasks (running an external command, sending email, etc.), though, as with any wizards, you'll still need to write code directly for complex operations.

The main interface for Netcool/Impact is web based, and therefore, like TBSM 4.1, requires Webtop and the GUI Foundation.

Netcool also has a very granular security structure, where you can define exactly which users can access which resources depending on which tool they use for that access.

Notice in all of the above that Netcool has no interface that competes with the ITM 6.1 TEP interface - all of the Netcool interfaces above are driven by events. That's a good thing, as it (to me) clearly indicates that the TEP is *the* operational view for real-time metrics moving forward.

That's all for now. There are definitely other Netcool products that I didn't touch on (Precision IP, Reporter, Proviso and Visionary, among a few others), but we will hopefully address those in a similar article soon. I know in particular that there is lots of interest in Tivoli reporting capabilities, and Netcool/Reporter sounds like a good product to address that. However, Tivoli has announced that their future reporting solution will be based on the open-source BIRT (Business Intelligence and Reporting Tools) project, so I don't really want to touch on reporting until Tivoli announces a more concrete direction.

Stopping situations on Remote TEMS

While it is advisable to manage your situations from the hub, sometimes you might want to disable a situation on just one remote TEMS (RTEMS). To disable a situation on an RTEMS, you need to use an undocumented/unsupported SOAP call. Read on to learn more.

The following SOAP call seems to work for me.


<CT_Deactivate>
   <userid>sysadmin</userid>
   <password>something</password>
   <TEMSNAME>REMOTE_TEMS1</TEMSNAME>
   <object>NT_Service_Error</object>
   <type>situation</type>
</CT_Deactivate>


The key is the TEMSNAME tag, which is undocumented for obvious reasons.
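If you want to script this, you can post the XML to the hub's SOAP endpoint. The sketch below assumes the SOAP server is enabled on the default port 1920, that the request above is saved in a file called stop_sit.xml, and that a bare method document is accepted (both the file name and host name are hypothetical; some configurations may require the request to be wrapped in a full SOAP envelope):

# Hypothetical invocation - adjust the host, port and file name for your environment
curl -H "Content-Type: text/xml" --data-binary @stop_sit.xml http://hubtems.example.com:1920///cms/soap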

ITM Events with wrong severity - A fix

If you are running ITM with fix pack 02 or later, you might have noticed that ITM events forwarded to TEC arrive with a severity (usually UNKNOWN) different from the one specified in the situation. This article explains a fix for this problem.

There are two issues here. First, there is a new parameter that should be added to KFWENV/cq.ini as of FP02. Without this parameter, the TEPS will incorrectly store the severity field (SITINFO) inside the situation definition. Second, after setting this parameter, we still need to manually modify the existing situations so they have the correct severity.

Adding the new parameter to KFWENV/cq.ini

Edit your KFWENV (Windows) or cq.ini (Unix/Linux) file and add KFW_CMW_SET_TEC_SEV=Y to it. This parameter takes care of setting the correct severity within the situation definitions.

Modifying severity in existing situations

Using a simple tacmd viewsit, you can export the situation definitions, modify the SITINFO field to the right severity, and import them back using the tacmd createsit command. But this approach is somewhat destructive, as it involves deleting the existing situation and creating a new one.
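For reference, that export/edit/import cycle would look roughly like the sketch below. The situation and file names are just examples, and the flag spellings should be checked against tacmd's help for your ITM level:

# Export the situation definition to a file
tacmd viewsit -s NT_Service_Error -e NT_Service_Error.sit

# Edit the SITINFO line in the exported file, then delete and re-create the situation:
tacmd deletesit -s NT_Service_Error -f
tacmd createsit -i NT_Service_Error.sit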

One relatively easy method is to use gbscmd. If you would like to learn more about gbscmd, please read this article. For example, the following gbscmd invocation changes the situation severity on the fly.

gbscmd executesql --auth --sql "UPDATE O4SRV.TSITDESC SET SITINFO='<new_sitinfo>' WHERE SITNAME='<situation_name>'" --table O4SRV.UTCTIME

The new SITINFO should be exactly like the previous SITINFO value but with the severity changed to the correct one.
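As a purely hypothetical example, if a situation named NT_Service_Error currently carried a SITINFO of TFWD=Y;SEV=Warning and you wanted it forwarded as Critical, the call would look something like this (the SITINFO contents shown here are made up for illustration - copy your real value and change only the SEV portion):

gbscmd executesql --auth --sql "UPDATE O4SRV.TSITDESC SET SITINFO='TFWD=Y;SEV=Critical' WHERE SITNAME='NT_Service_Error'" --table O4SRV.UTCTIME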

New SOAP Methods for ITM 6.1 - It's a summer blockbuster

One of the big requests in ITM 6.1 is "task"-like functionality; so far we have been limited to situation actions - effective, but not something we can program around.

Enter Remote_System_Command and Local_System_Command. Just like they sound, they allow system commands to be run on remote or local systems.

Read the attached for usage samples and details.

Get Your KICK butt SOAP Methods here...

ITM 6.1 - Netcool Omnibus Integration Steps

As IBM moves to Netcool Omnibus as its primary event handling mechanism, it is imperative to understand how Omnibus integrates with its other famous cousin, ITM 6.1. This article gives you a high-level overview of how the integration works and the necessary instructions to get it working.

Terminology

ObjectServer - The Omnibus ObjectServer is the in-memory database server at the core of Netcool Omnibus. It receives events from various sources and processes/displays them according to your specifications. This is analogous to the TEC event server.

Probes - A probe is an Omnibus component that connects to an event source, receives events from that source, and forwards them to the ObjectServer. For ITM integration, we need to use the Tivoli Event Adapter probe, a.k.a. the TME10tecad probe.

EventList - A GUI application that shows the events received by Netcool Omnibus. You can bring up the EventList by typing nco_event in your Unix terminal. Make sure you have an X-Windows server such as Exceed running.
How does the integration work?
ITM uses the OTEA (Omegamon TEC Event Adapter) to send events to TEC. The OTEA is very similar to an NT event adapter or any other TEC event adapter. To send events to Omnibus, you just need to install a TEC Event Adapter probe on a system and modify the ITM om_tec.config file to point to the server running the probe. To ITM, the probe appears to be a TEC server. The probe, on receiving events from ITM, forwards them to the real Omnibus ObjectServer specified in its rules file. The following diagram shows how these components fit together.
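The om_tec.config change itself is small. A minimal sketch, assuming a hypothetical probe host and the port 5529 that the probe is configured to listen on later in this article:

# om_tec.config - point the ITM event forwarder at the EIF probe instead of a TEC server
ServerLocation=probehost.example.com
ServerPort=5529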
Downloads
Setting up the software from scratch requires quite a few downloads from the Netcool/IBM download site. The list of products needed is given below.
  • Netcool/OMNIbus, v7.1, AIX 5L 5X, HP-UX 11, HP-UX 10.x, Solaris (Sun Microsystems), Windows 2000/XP/Version 7.1 (Download Omnibus V7.1 for your platform, License Server and License file)
  • NetCool/Omnibus User, V7.1, AIX 5L 5X, HP-UX 11, HP-UX 10.x, Solaris (Sun Microsystems), Windows 2000 Version 7.1 (just the .lic file only)
  • Netcool/Omnibus Probes for Nonnative-base - eAssembly
  • Netcool/Omnibus Probes for Tivoli EIF eAssembly.
  • Download Tivoli & Netcool Event Flow Integration solution from OPAL. (TEC_Omnibus_IntegrationFlows.tar). The latest version as of this writing is V3.0.
Integration Steps
Integrating ITM 6.1 with Omnibus involves the following major activities:
1. Install Omnibus and create an ObjectServer (if needed).
2. Install the Tivoli Event Adapter probe and point it to the ObjectServer created above.
3. Modify om_tec.config in the ITM environment, pointing it to the Event Adapter probe.
4. Reconfigure the hub TEMS server and specify the probe server as your TEC server.
    Installation Steps - A Quick Overview
The following steps are needed to get the ITM-Omnibus integration working from scratch.
    1. Install Netcool Omnibus V7.1
    2. Install Netcool License server.
    3. Install your licenses in $NCHOME/license/etc folder.
4. Create a Netcool ObjectServer to which events will be sent (a command sketch for this and step 5 appears after this list).
    5. Start the license server and ObjectServer.
    6. Install Omnibus Probe support on your probe server.
    7. Install Netcool/Omnibus Probes library for Non-native base on the probe server.
    8. Install Netcool/Omnibus Probes binary for Tivoli EIF on the probe server.
9. On the probe server, extract the ITM Omnibus Integration solution that you downloaded from OPAL and copy the tme10tecad.rules file to the %OMNIHOME%\probes\ directory.
10. On the probe server, create a file called %OMNIHOME%\probes\win32\tivoli_eif.props and add the following lines to it:
    PortNumber : 5529
    Inactivity : 3600
    11. Bring up Server Editor and add the entries (ObjectServer name, hostname and port number) for your Netcool server.
    12. Start the Probe Service by starting the "NCO Nonnative Probe" Service.
13. Change om_tec.config on the ITM hub server to reflect the connectivity information of the probe server (the probe host name and the port you set in step 10).
    14. Fire a test situation and ensure that the event is received by Netcool. You can verify this by bringing up Omnibus EventList (nco_event).
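For the ObjectServer portion of steps 4 and 5, assuming a Unix Omnibus server with $NCHOME set and the default ObjectServer name NCOMS, the commands look roughly like this (a sketch - check the Omnibus installation guide for your platform):

# Step 4: initialize a new ObjectServer database (default name NCOMS)
$NCHOME/omnibus/bin/nco_dbinit -server NCOMS

# Step 5: start the ObjectServer
$NCHOME/omnibus/bin/nco_objserv -name NCOMS &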
A screenshot of events appearing in the Omnibus EventList is shown below.

    Running the State Based Correlation Engine from ITM

The ITM TEC integration provides the ability to forward situation events and ITM status events to TEC or Omnibus. Enabling the integration is fairly straightforward, but what is lacking is the ability to manipulate events as they are emitted from ITM. Some control over events can be achieved using the XML map files located in the TECLIB directory, but this level of control does not allow events to be manipulated programmatically. Any enrichment or correlation of events that could not be accomplished in a map file had to be done in TEC.
    Until now.
The State Based Correlation Engine (SCE) can be run from any of the recent TEC EIF adapters, and in reality the ITM TEC integration is simply a C-based event adapter. Using the SCE allows ITM events to be manipulated and correlated before they are sent to TEC.
    Running the SCE from ITM requires a little work. In this example I will use a Linux TEMS and implement the Gulfsoft SCEJavascript custom action to manipulate ITM events using Javascript programs.

    First, acquire the JAR files required to run the State based Correlation Engine from a TEC installation. The files needed are:
    zce.jar
    log.jar
    xerces-3.2.1.jar
    evd.jar

    Also required is the DTD file for your XML rules file. In this case I will use and modify the default XML rules file.
    tecroot.xml
    tecsce.dtd

    Create a directory such as /opt/IBM/ITM/sce and copy the files listed above to this directory.

Since we will be implementing the SCEJavascript custom action, we will also need scejavascript.jar and js.jar (included in the Gulfsoft package); both files should also be copied to this directory.

    Next we will have to modify the TEMS configuration file to successfully run the SCE. The file is named (on Linux) $CANDLEHOME/config/${HOSTNAME}_ms_${HUB_NAME}.config and contains environment variable settings for the TEMS.

    Find the entry for LD_LIBRARY_PATH and add
    /opt/IBM/ITM/JRE/li6243/bin:/opt/IBM/ITM/JRE/li6243/bin/classic
to the existing entry. Depending on where ITM is installed and the version of Linux, the path may be different. As you can guess, I will be using the ITM-provided Java for this example, so there is no need to download and install another JRE unless you really want to. Also in this file, we will set up the initial CLASSPATH environment variable and point it to the minimum required JAR files:
    CLASSPATH='/opt/IBM/ITM/sce/zce.jar:/opt/IBM/ITM/sce/log.jar:/opt/IBM/ITM/sce/xerces-3.2.1.jar:/opt/IBM/ITM/sce/evd.jar'

    Be sure to add CLASSPATH to the export list.
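Putting that together, the relevant lines in the TEMS config file end up looking something like the sketch below. The paths follow the examples above and the li6243 JRE directory; match the quoting style and export list already used in your file:

LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/IBM/ITM/JRE/li6243/bin:/opt/IBM/ITM/JRE/li6243/bin/classic"
CLASSPATH='/opt/IBM/ITM/sce/zce.jar:/opt/IBM/ITM/sce/log.jar:/opt/IBM/ITM/sce/xerces-3.2.1.jar:/opt/IBM/ITM/sce/evd.jar'
export LD_LIBRARY_PATH CLASSPATH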

    The next step is to modify the $CANDLEHOME/tables/$HTEMS_NAME/TECLIB/om_tec.config file to enable the SCE:
    UseStateCorrelation=YES
    StateCorrelationConfigURL=file:///opt/IBM/ITM/sce/tecroot.xml
    PREPEND_JVMPATH=/opt/IBM/ITM/JRE/li6243/bin
    APPEND_CLASSPATH=/opt/IBM/ITM/sce/js.jar:/opt/IBM/ITM/sce/scejavascript.jar
    #TraceFileName=/tmp/eif_sc.trace
    #TraceLevel=ALL

    Note that we are indicating the location of the ITM provided Java and we are adding the JARs needed to run our custom action.

    The next steps are to configure our Javascript code and modify the tecroot.xml file to run the custom action. The Javascript we will use will be a simple change to the msg attribute:
// Prepend "FOO:" to the msg slot of every event passing through the SCE
function processEvents(events)
{
    for (var i = 0; i < events.length; i++)
    {
        var foo = "FOO:";
        events[i].putItem("msg", foo.concat(events[i].getString("msg")));
    }
}

    We will call this file test.js and save it in /opt/IBM/ITM/sce.

    Finally we will modify the tecroot.xml file to run the custom action:
<?xml version="1.0"?>
<!DOCTYPE rules SYSTEM "tecsce.dtd">

<rules predicateLib="ZCE">

    <predicateLib name="ZCE"
        class="com.tivoli.zce.predicates.zce.parser.ZCEPredicateBuilder">
        <parameter>
            <field>defaultType</field>
            <value>String</value>
        </parameter>
    </predicateLib>

    <rule id="itm61.test">
        <match>
            <predicate>true</predicate>
        </match>
        <action function="SCEJavascript" singleInstance="false">
            <parameters><![CDATA[/opt/IBM/ITM/sce/test.js]]></parameters>
        </action>
    </rule>

</rules>

    Once all of the changes have been implemented, stop and start the TEMS.

All of the events that come out of ITM will now have messages starting with "FOO:". Check back for more useful examples...

    Using TPM for patching Windows

    TPM (and TPMfSW) provides the ability to patch Windows computers through a couple different methods. In this blog, I will summarize the various methods.

    There are 2 ways of doing Windows patching in TPM
    1. Using the Deployment Engine
    2. Using Scalable Distribution (SOA)

    So the first thing is to determine the method you are using.

The Deployment Engine is better suited to a data center environment where the network is not a concern, because the DE does not provide any bandwidth control or checkpoint restart. It does not use the depots for fan-out distributions; it is a straight file copy. With the DE there are actually two methods that can be used. The first (and best) is to have the Windows Update Agents (WUA) talk to an internal WSUS server. The second (which I would not recommend) is to have the WUA talk directly to Microsoft.

SOA is used for the distributed environment. If you have many computers to distribute to, or targets on the other end of a slow link, you will want to use this method. This requires that the TCA (Tivoli Common Agent) be installed on all target computers and that the SOA-SAP be enabled. You will also need at least one depot server (CDS).

    If you are using SOA, the TPM server will have to discover and download the patches directly from Microsoft (there is a proxy config you can set too).

    Ok so now you have the method you want to use. How to implement it?

    DE
In order to use the DE method, the following tasks need to be completed (I am going to assume that you are using the WSUS server method):
1. Install and configure the WSUS server (approve and download the desired patches).
2. Set the WSUS server global variable.
After this, the steps for DE and SOA are the same, so I will list them after the SOA tasks.

    SOA
    1. Configure the Windows Updates Discovery discovery configuration.
2. Execute the Windows Updates Discovery. This will populate the DCM with all of the patches available according to the filters you set (much like WSUS). Remember, this is only the definitions for the patches, not the binaries required to install them.
    3. Approve patches
    4. Execute the MS_SOA_DownloadWindowsUpdates to download the files from Microsoft.

    Common Steps
Now that the desired repository is set up, you need to complete the following steps.
    1. Install the WUA on all targets
2. Create a patching group. Under the Compliance tab, add a security compliance check called Operating System Patches and Updates.
    3. Execute the Microsoft WUA Scan discovery configuration
    4. In the Compliance tab, select Run -> Run Compliance Check. Once the task is complete, the Compliance tab will show if there are computers out of compliance.
    5. Click on the number under the Compliant header (something like 0/1)
    6. Select the desired patches and computers and press the Approve button.
    7. Select the desired patches and computers and press the Run/Schedule button (Note: the Run button does not work for SOA distributions)
    8. Once the distributions are complete, run the Microsoft WUA Scan again and then the Run Compliance Check.

    Done!

Let me know if you have any comments/questions. Complaints > /dev/null ;)

    Thanks to Venkat for all his help!

    Martin Carnegie
    martin dot carnegie at gulfsoft dot com