If you are a Windows guy and you haven’t touched Linux so far, implementing SCOM will force you to get hands-on with *nix systems. Here I would like to show a neat little way to overcome a limitation of the Unix/Linux Shell Command Two (or Three) State Monitor.
This two-state monitor allows you to call a shell script or a one-line command sequence (using pipeline operators). That means you can call just one “one-liner”, chaining commands with the pipe symbol “|”, e.g. ls -l /tmp | wc -l . This example counts the files/directories in the /tmp directory. The “ls -l” command is similar to the Windows “dir” command; its output is sent to the “wc -l” command, which counts lines (wc = word count, and the -l option switches it to counting lines). Note that “ls -l” also prints a leading “total” line, so the result is one higher than the actual number of entries. In the real world, though, most scripts on the Linux side are not just one-liners. A Linux guy might create a script, or he might ask you to execute a script which calls another script. Sounds complicated? No, let me show you…
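As a quick illustration of such a one-liner, here is a minimal sketch that runs against a scratch directory instead of /tmp so the result is predictable. The directory and file names are made up for the example, and the “ls -A” variant is just one possible way to avoid counting the “total” line, not something the monitor requires.

```shell
#!/bin/bash
# Scratch directory with three files (hypothetical names, for illustration only).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"

# The one-liner from the text: ls -l also prints a "total" line,
# so this yields the number of entries plus one.
with_total=$(ls -l "$demo_dir" | wc -l)

# Alternative: ls -A prints one name per line when piped, without a "total" line.
entries_only=$(ls -A "$demo_dir" | wc -l)

echo "ls -l | wc -l : $with_total"
echo "ls -A | wc -l : $entries_only"

rm -rf "$demo_dir"
```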
In the /tmp directory I created two text files, countfile.sh and runcount.sh (don’t get confused by the .sh ending; these are just plain text files).
The countfile.sh has two lines…
- #!/bin/bash => Which shell executes this script
- ls -l /tmp | wc -l | bc => Counts the files in the /tmp directory. I just added the “bc” command, which is used here to convert the result to an integer value, but for this example you would not need it.
The runcount.sh file also has two lines…
Notice here the line…
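. /tmp/countfile.sh => This line calls the countfile.sh file AND runs it in the same shell. The “.” (dot) command makes this possible: it reads the file and executes its commands in the current shell, so both its output and any variables it sets are available to runcount.sh. Without the dot, countfile.sh would run in a separate child shell; its output would still be printed, but any variables set inside it would be lost when it exits.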
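To make the “script in script” idea concrete, here is a minimal sketch that builds both files in a scratch directory and runs them. Assumptions: the scratch paths replace /tmp, the first line of runcount.sh is taken to be the same shebang as in countfile.sh, and “ls -A” is swapped in for “ls -l … | bc” so the count matches the number of files exactly.

```shell
#!/bin/bash
# Build the two scripts in a scratch directory (the post uses /tmp directly).
work_dir=$(mktemp -d)
mkdir "$work_dir/data"
touch "$work_dir/data/one" "$work_dir/data/two"

# countfile.sh: counts the entries in the data directory.
cat > "$work_dir/countfile.sh" <<EOF
#!/bin/bash
ls -A "$work_dir/data" | wc -l
EOF

# runcount.sh: sources countfile.sh with the dot command, so it runs
# in the same shell and its output becomes the output of runcount.sh.
cat > "$work_dir/runcount.sh" <<EOF
#!/bin/bash
. "$work_dir/countfile.sh"
EOF

# Both files need read and execute permission before they can be run.
chmod u+rx "$work_dir/countfile.sh" "$work_dir/runcount.sh"

result=$("$work_dir/runcount.sh")
echo "runcount.sh returned: $result"

rm -rf "$work_dir"
```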
Next we need to make these scripts executable and readable. How do we do that? We set the permission of the files to read and execute using the “chmod” command. You must set these permissions or you won’t be able to run the scripts.
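The exact permission mode is a matter of choice; as a sketch, one common option is “chmod a+rx”, which grants read and execute to everyone. The example below works on a scratch file standing in for the real scripts, so the path is hypothetical.

```shell
#!/bin/bash
# Scratch file standing in for /tmp/countfile.sh (hypothetical path).
script=$(mktemp)
printf '#!/bin/bash\necho hello\n' > "$script"

# Grant read and execute permission to user, group, and others.
chmod a+rx "$script"

# Verify: -r and -x test read and execute permission for the current user.
if [ -r "$script" ] && [ -x "$script" ]; then
    perm_ok=yes
else
    perm_ok=no
fi
echo "readable and executable: $perm_ok"
ls -l "$script"

rm -f "$script"
```

On the real files this would be chmod a+rx /tmp/countfile.sh /tmp/runcount.sh .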
You can now check if it works by executing the /tmp/runcount.sh script…
We have 20 files and directories in the /tmp directory! Cool!
Now let’s build the monitor…
Give it a name and choose a class, in my case “SUSE Linux Enterprise Computer”…
For this test I chose to run it every 30 seconds (choose a higher interval for production, e.g. 30 minutes)…
Next you need to provide the shell command…
If there are more than 10 files in the directory, throw an error alert…
And if there are 10 or fewer files, the monitor will be healthy…
We leave it the way it is…
Adjust the alert settings to your needs…
Notice here I added the line
$Data/Context///*[local-name()="StdOut"]$
in the description field. This contains the output from the script, in our case the file/folder count.
After a short time you receive an error if the threshold is reached…
and the alert properties…
In this example I showed you how to
- create simple shell scripts
- call a shell script from a shell script
- use the “script in script” approach in SCOM
This will help you to overcome the “one-liner” limitation and the limitation to just execute one script.
If you need to monitor Linux systems, I recommend what I always do: push the Linux guys to build all the logic into their scripts and just return the value of the monitored state. On the SCOM side I just call the script and map its output to the monitor states. E.g. if the script output “0” means unhealthy and “1” means healthy, I map these to the two-state monitor. You could also use words like “NOK” or “OK” for the unhealthy and healthy states.
I hope you find this useful.
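As a sketch of that approach, the script below keeps the threshold logic on the Linux side and prints only “OK” or “NOK”. The function name check_entry_count, its arguments, and the scratch directory are assumptions made for illustration; the limit of 10 is borrowed from the monitor example above.

```shell
#!/bin/bash
# check_entry_count DIR MAX
# Prints OK when DIR holds MAX entries or fewer, NOK otherwise.
# (Hypothetical helper; name and arguments are illustrative.)
check_entry_count() {
    local dir=$1 max=$2 count
    count=$(ls -A "$dir" | wc -l)
    if [ "$count" -le "$max" ]; then
        echo "OK"
    else
        echo "NOK"
    fi
}

# Demo against a scratch directory with three files.
demo_dir=$(mktemp -d)
touch "$demo_dir/f1" "$demo_dir/f2" "$demo_dir/f3"

state_high_limit=$(check_entry_count "$demo_dir" 10)
state_low_limit=$(check_entry_count "$demo_dir" 2)
echo "limit 10: $state_high_limit"
echo "limit 2:  $state_low_limit"

rm -rf "$demo_dir"
```

With a script like this, the monitor’s healthy expression would match “OK” in StdOut and the error expression would match “NOK”, so no threshold has to be maintained on the SCOM side.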
…



Great blog you have here – very informative. Glad I stumbled onto it.
Hi Jonathan
Thank you very much! It’s an honor to have you reading my blog!
Regards,
Stefan
Hi , can i detect a log file and run command, not use timer ? thx.
Hi
Sorry, I am not quite sure what you mean. Can you give me some more details?
Regards,
Stefan