[solved, but found bug] Automatic alarm termination won't work in just one case

Started by tarnmensch, August 28, 2015, 03:52:53 PM

tarnmensch

Hi there!

Sorry if I made some stupid mistake, but I just can't find it... My newly created DCI generates an alarm but doesn't terminate it automatically, though the other alarms that seem to be set up the same way work fine. Here's what I've got:


  • DCI: System.CPU.Usage15, triggering the event DC_HIGH_CPU_UTIL after 3 poll values greater than 70; the deactivation event is set to DC_HIGH_CPU_UTIL_OK.
  • Activation event: DC_HIGH_CPU_UTIL, Minor, Write to event log, Message: CPU-Last dauerhaft über %3 (Derzeitiger Wert: %4 für %2) [in English: CPU load persistently above %3 (current value: %4 for %2)]
  • Deactivation event: DC_HIGH_CPU_UTIL_OK, Normal, Write to event log, Message: CPU-Last wieder unter %3 (Derzeitiger Wert: %4 für %2) [in English: CPU load back below %3 (current value: %4 for %2)]
  • Generation policy: if DC_HIGH_CPU_UTIL generate alarm %m with key HIGH_CPU_%i and execute action UHD-Mail
  • Termination policy: if DC_HIGH_CPU_UTIL_OK terminate alarms with key HIGH_CPU_%i and execute action UHD-Mail
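
For readers following along, the pairing of the two policies can be modeled roughly as below. This is a minimal Python sketch, not NetXMS code: the `active_alarms` dictionary, `expand_key` and `process_event` are hypothetical names, and it assumes that %i expands to an identifier of the event's source object, so that both policies resolve to the same key for the same node.

```python
# Hypothetical model of key-based alarm handling; this is not NetXMS code.
active_alarms = {}  # alarm key -> alarm message


def expand_key(template, object_id):
    # Assumption: %i expands to an identifier of the event's source object.
    return template.replace("%i", object_id)


def process_event(event_name, object_id, message):
    """Apply the generation or termination policy and return the alarm key."""
    key = expand_key("HIGH_CPU_%i", object_id)
    if event_name == "DC_HIGH_CPU_UTIL":
        active_alarms[key] = message   # generation policy: create the alarm
    elif event_name == "DC_HIGH_CPU_UTIL_OK":
        active_alarms.pop(key, None)   # termination policy: remove by key
    return key
```

In this model, termination only happens if the deactivation event actually arrives; if DC_HIGH_CPU_UTIL_OK is never generated, the alarm stays in `active_alarms` even though both keys match.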

To test the alarm, I set the transformation script to add 71 to the value ("$1+71") and set the polling interval to 9 seconds. After the third poll, the alarm is triggered and an e-mail is sent. But when I reset the transformation script to "$1+0", the alarm doesn't get terminated, no e-mail is sent, and there's not even a log entry. If I observed correctly, the "Last Values" tab even shows the threshold as OK right after the first poll, instead of waiting for 3 polls.
The fact that no log entry is created tells me that the event processing policy isn't even involved; it seems the event just isn't being generated.
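
The test procedure can be modeled with a small sketch. It is written in Python rather than in the server's own scripting language, and the `Threshold` class is hypothetical. It assumes one possible reading of "after 3 poll values greater than 70": the threshold is active only while all of the last 3 values exceed 70. Whether the server evaluates the threshold exactly this way is not confirmed here.

```python
from collections import deque


class Threshold:
    """Hypothetical threshold: active while the last `samples` values all exceed `limit`."""

    def __init__(self, limit=70, samples=3):
        self.limit = limit
        self.window = deque(maxlen=samples)  # keeps only the most recent values
        self.active = False

    def poll(self, value):
        """Feed one transformed value; return an event name, or None if no state change."""
        self.window.append(value)
        full = len(self.window) == self.window.maxlen
        breached = full and all(v > self.limit for v in self.window)
        if breached and not self.active:
            self.active = True
            return "DC_HIGH_CPU_UTIL"
        if not breached and self.active:
            self.active = False
            return "DC_HIGH_CPU_UTIL_OK"
        return None


threshold = Threshold()
raw = 3  # assumed raw CPU reading, chosen for illustration
events = [threshold.poll(raw + 71) for _ in range(3)]  # transformation "$1+71"
events.append(threshold.poll(raw + 0))                 # transformation "$1+0"
print(events)
```

Under this assumption, the threshold returning to OK on the first normal poll would be expected, and that same poll should also produce DC_HIGH_CPU_UTIL_OK. The missing log entry is the part the model does not explain.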

Maybe I'm just blind, but please, can someone show me what I'm doing wrong here?

Thanks a lot!


Edit:

  • I think part of the problem was my transformation script. Instead of deleting it, I changed it from "$1+71" to "$1+0", but from that point on the graph shows the value 0 almost all the time, with some short peaks to +1 as well as -1 :o. As a CPU load of -1% is rather unlikely, I deactivated the transformation script, and now the values stay around 2 to 4.
  • Testing the same settings with System.CPU.Usage works just fine... Is it simply wrong to use a transformation script or new thresholds to trigger and terminate alarms? (For some scenarios it's rather hard to simulate failures for testing purposes, so setting normal values as critical ones is a very easy way to test the alarm processing.)


Whoops, sorry - but maybe this topic should stay up, just in case someone tries the same thing... Here's my conclusion:

  • To test your alarms, don't change your transformation scripts just to get your values into and out of critical ranges - in my case terminating the alarm didn't work.
  • If you don't need your transformation script, delete it. Setting it to "$1+0" may cause errors.

Victor Kirhenshtein

Hi,

From your description it actually looks like a bug. Using transformation scripts to simulate different values is fine, and scripts like $1+0 should work as well. I'll try to reproduce this on my test system.

Best regards,
Victor