GUARDIAN ANGEL
Homeseer Support for Monitoring Processes and Taking
Recovery Actions
By Michael McSharry
September 11, 2003
Guardian Angel is an executable that runs silently as a real-time (highest) priority process and monitors the behavior of other processes or of the computer overall. It currently observes the CPU utilization over a period of time, either for a particular process or for the computer overall, or the absence of a process that is intended to be running. When the threshold is reached, or the process disappears, the specified recovery actions are performed. The recovery actions are typically to terminate the offending process and restart it, or to restart the computer.
Guardian Angel is a watchdog that monitors a process without requiring any interaction with that process and without any external hardware. It detects a process that uses an excessive amount of CPU time and allows recovery from this situation. Recovery is graceful: it first asks the offending process to close itself down. If that is not successful, it terminates the process in which the application is running. If that is also not successful, it asks all applications to shut themselves down, terminates each one if needed, and then restarts Windows.
If Homeseer is one of the processes that Guardian Angel observes, then it will also expect a heartbeat, which is the continued recreation of the file "\Data\GuardianAngel.pulse". If this file becomes stale, Guardian Angel will take the same recovery action as if the process had gone into a mode of high CPU utilization.
Guardian Angel will not detect a lockup condition in which CPU utilization is not an attribute of the lockup. It has the advantage over a hardware watchdog circuit in that it will not abruptly stop a process without warning, as is the case with external watchdogs. It also has the advantage that it can selectively terminate a single offending process without restarting Windows and/or the computer. Its downside is that it depends upon the integrity of the Windows OS and is limited in the nature of the failures that it will detect.
Table 1 Revision History
Rev | Date | Description
1.1 | 8-24-02 | Allow application 15 seconds to close before terminating process. Previously it was 1 second.
1.2 | 9-11-03 | Retain PID for a window after initially determined
    |         | Add ability to detect when a process has stopped running
    | 10-9-04 | Add Pulse Monitoring
GuardianAngel.exe is started like any other application that takes input parameters. From Homeseer it can be launched from an event or from a script. It can also be launched from Windows by including it in the Startup group or in another folder where it can be started with a mouse double-click.
Guardian Angel optionally accepts a parameter string. Each parameter in the string is separated by a “|”. The parameters and their defaults are itemized as follows:
Table 2 Guardian Angel Parameters
Description | Values | Default
1. Process to be Monitored | Process name as defined by Windows performance monitoring. This will usually be the common name for the application. An empty string ("") indicates the total of all processes. If the name contains a period (".") then the presence of this process is monitored rather than its CPU utilization. | “”
2. Threshold of CPU utilization | Integer measurement as a percentage of total CPU use. If application presence is being monitored then this value can be anything since it is not used. | 95 (percent)
3. Duration for the threshold | For CPU monitoring, the number of seconds that the average utilization must be above the threshold. If application presence is being monitored, then this is the number of consecutive seconds that the monitored process has been missing from the process list. It is also used for Pulse monitoring: the file …Homeseer\Data\GuardianAngel.Pulse must be updated within this period to avoid an action. | 300 (seconds)
4. Window Title of process to be terminated | A text string containing either part of the application’s window title or the filename of the application. A period (“.”) in the string determines whether it is a filename or a window title. If it is a title, then all applications with this title segment will be terminated. Use care with short strings because they may close down more windows than expected. “restart” is a special-case reserved word indicating that the computer is to be restarted. If monitoring is for a missing process, then this field can be anything since it is not used. | restart
5. Application to Launch | The application that will be launched 10 seconds after the termination action is taken. An empty string indicates no action. | “”
6. Priority in which to run the Launched application | REALTIME, HIGH, ABOVE_NORMAL, NORMAL, BELOW_NORMAL, LOW | normal
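As an illustration of how the parameter string in Table 2 breaks down, the following Python sketch splits a string on "|" and fills in the table's defaults. It is not Guardian Angel's own code: the names `Params` and `parse_params` are hypothetical, and treating an empty or missing field as "use the default" is an assumption inferred from the examples in this document.

```python
from dataclasses import dataclass

# Defaults taken from Table 2, in parameter order.
DEFAULTS = ["", "95", "300", "restart", "", "normal"]


@dataclass
class Params:
    process: str     # 1. process to be monitored
    threshold: int   # 2. CPU threshold in percent
    duration: int    # 3. duration in seconds
    target: str      # 4. window title, filename, or "restart"
    launch: str      # 5. application to launch afterwards
    priority: str    # 6. priority for the launched application

    @property
    def presence_mode(self) -> bool:
        # A period in the process name selects presence monitoring.
        return "." in self.process


def parse_params(parm: str) -> Params:
    fields = [f.strip() for f in parm.split("|")]
    # Pad missing trailing fields (assumption: they take the defaults).
    fields += [""] * (len(DEFAULTS) - len(fields))
    merged = [f if f else d for f, d in zip(fields, DEFAULTS)]
    return Params(merged[0], int(merged[1]), int(merged[2]),
                  merged[3], merged[4], merged[5].lower())


if __name__ == "__main__":
    print(parse_params("homeseer|90|500|restart"))
    print(parse_params("notepad.exe||500||notepad.exe"))
```

Under these assumptions, a string such as "notepad.exe||500||notepad.exe" is read as presence monitoring of notepad.exe with a 500 second duration, and the empty second field falls back to the default threshold, which is unused in that mode.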
The following are examples of ways to start Guardian Angel and of how the parameters can be formed:

' Restart the computer if homeseer reaches 90% CPU averaged over 500 seconds
parm = "homeseer|90|500|restart"
q = """"
CreateObject("Wscript.Shell").Run q & "C:\Program Files\Homeseer\GuardianAngel.exe" & q & " " & parm,0,0

' Relaunch Homeseer if homeseer.exe has been missing for 500 seconds
parm = "homeseer.exe||500||C:\Program Files\Homeseer 2\Homeseer.exe"
q = """"
CreateObject("Wscript.Shell").Run q & "C:\Program Files\Homeseer\GuardianAngel.exe" & q & " " & parm,0,0

' Launch from a Homeseer script; terminate and relaunch Homeseer at above_normal priority
hs.launch hs.GetAppPath & "\GuardianAngel.exe", _
    "homeseer|90|120|homeseer.exe|C:\Program Files\Homeseer\Homeseer.exe|above_normal"

' Launch from Windows with no parameters (all defaults)
C:\Program Files\Homeseer\GuardianAngel.exe

' Relaunch notepad if notepad.exe has been missing for 500 seconds
parm = "notepad.exe||500||notepad.exe"
q = """"
CreateObject("Wscript.Shell").Run q & "C:\Program Files\Homeseer\GuardianAngel.exe" & q & " " & parm,0,0
Guardian Angel provides feedback in the Homeseer log when it is started. This acknowledges what is being monitored and what will be done should the monitored situation arise. It is a formatted echo of the input parameters, so it will be obvious from the log whether Guardian Angel has recognized the correct monitoring conditions.
Guardian Angel also maintains its own log “GuardianAngel.log” in the App path \Data folder. This will normally be C:\Program Files\Homeseer\Data when GuardianAngel.exe is located in the Homeseer root directory. It will contain the same invocation feedback, as well as feedback on each event and action taken.
Care should be taken to understand the nature of the process that is being monitored. Iterative scripts are either CPU hungry in their own right, or they make use of waitsec/waitevent, which has the effect of charging Homeseer for the time spent in the wait. I have some CPU-intensive applications, and these all run in processes separate from Homeseer. While the computer's CPU utilization is high, the Homeseer utilization numbers remain low.
The capability is provided with Guardian Angel to terminate selective processes. What one does not know, however, is what bad things the offending process has done to not only itself, but to other resources running on the computer. For mission critical applications it is probably best to restart the computer when the condition is detected.
I do not have a platform that generates high CPU utilization lockups, so I really do not know what the threshold / duration profile should look like. Likewise, I could not evaluate it on a computer that actually did go into this undesired state.
Guardian Angel makes extensive use of the Windows API to perform its stated objectives. When loaded it changes its own priority to REALTIME to ensure that it will not be locked out by the process that it is monitoring.
Guardian Angel has two modes of operation. In the “Process Presence” mode it checks for the existence of the specified process 20 times within the duration specified. If none of the 20 observations locates the process, then action is taken.
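The presence check can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the executable's actual logic: the function names are hypothetical, the 20 misses are treated as consecutive, and counting begins only once the process has been seen running at least once, as described in the testing notes of this document.

```python
def check_interval(duration_seconds, checks=20):
    """Seconds between presence checks: 20 checks fit in one duration."""
    return duration_seconds / checks


def presence_tripped(observations, required_misses=20):
    """Return True once the process is missing on 20 consecutive checks.

    `observations` is an iterable of booleans, True meaning the process
    was found in the process list on that check.
    """
    armed = False   # becomes True the first time the process is seen
    misses = 0
    for found in observations:
        if found:
            armed = True
            misses = 0
        elif armed:
            misses += 1
            if misses >= required_misses:
                return True
    return False


if __name__ == "__main__":
    # Seen once, then missing for 20 checks in a row: action is taken.
    print(presence_tripped([True] + [False] * 20))
    # Never seen running: monitoring has not started, so no action.
    print(presence_tripped([False] * 40))
```

With a duration of 10 seconds, as in the notepad test later in this document, `check_interval(10)` gives one check every half second.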
In the “CPU Utilization” mode, it creates an event counter for the process specified in the input parameter. This counter is supported by Windows performance interface.
It creates a timer with an interval of 1/20th of the duration specified in the input parameter. At each timer interval it samples the current value of the counter and maintains a history of the last 20 samples. This history is averaged and compared with the input threshold.
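The sampling scheme amounts to a rolling average over a fixed window of 20 samples. The Python sketch below shows one way to express it. The class name `UtilizationMonitor` is hypothetical, the samples are supplied by the caller rather than read from the Windows performance counter, and waiting for a full history before tripping is an assumption.

```python
from collections import deque


class UtilizationMonitor:
    """Rolling average of the last 20 CPU utilization samples."""

    def __init__(self, threshold, duration, samples=20):
        self.threshold = threshold          # percent
        self.interval = duration / samples  # seconds between samples
        self.history = deque(maxlen=samples)

    def add_sample(self, cpu_percent):
        """Record one sample; return True when the average trips."""
        self.history.append(cpu_percent)
        if len(self.history) < self.history.maxlen:
            # Assumption: no decision until a full history exists.
            return False
        average = sum(self.history) / len(self.history)
        return average >= self.threshold


if __name__ == "__main__":
    monitor = UtilizationMonitor(threshold=90, duration=300)
    print(monitor.interval)  # sampling interval in seconds
    results = [monitor.add_sample(100) for _ in range(20)]
    print(results[-1])       # trips once 20 high samples are held
```

Because the oldest sample drops out as each new one arrives, a short burst of CPU activity is diluted by the other 19 samples and does not by itself trigger a recovery action.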
When the threshold is reached it will send a message to the specified window requesting that the application that owns the window close itself down. It will do this for all windows that have the text string input parameter contained in their title bar. It allows 2 seconds for each window to be signaled and 15 seconds for the application to perform its closedown actions. Any window that remains open will have its process terminated. A closedown request may pop up a message box requesting user action; this box will disappear when the process is terminated, so no interactive actions are needed. If after another 15 seconds the process in which the application is running could not be terminated, then Windows will be commanded to restart. Before this restart occurs, the above-described sequence of close messages followed by process termination, if required, is performed for every window currently open on the computer.
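The escalation order described above can be summarized in a small Python sketch. It is a simplified model and not the Windows API calls that Guardian Angel uses: the function `recover` and its callable arguments are hypothetical stand-ins, the 2 second per-window signaling allowance is omitted, and the final close-everything pass before the restart is folded into the `restart_windows` callable.

```python
def recover(windows, request_close, is_open, terminate, restart_windows,
            sleep, close_wait=15, terminate_wait=15):
    """Escalate from polite close, to termination, to a Windows restart."""
    # Step 1: ask every matching window to close itself down.
    for window in windows:
        request_close(window)
    sleep(close_wait)

    # Step 2: terminate the process behind any window still open.
    survivors = [w for w in windows if is_open(w)]
    if not survivors:
        return "closed"
    for window in survivors:
        terminate(window)
    sleep(terminate_wait)

    # Step 3: if termination failed, fall back to restarting Windows.
    if any(is_open(w) for w in survivors):
        restart_windows()
        return "restart"
    return "terminated"


if __name__ == "__main__":
    still_open = {"Homeseer"}
    outcome = recover(
        ["Homeseer"],
        request_close=lambda w: None,          # application ignores the request
        is_open=lambda w: w in still_open,
        terminate=still_open.discard,          # termination succeeds
        restart_windows=lambda: print("restarting Windows"),
        sleep=lambda seconds: None,            # no real waiting in the demo
    )
    print(outcome)
```

Passing the actions in as callables keeps the ordering visible and lets the sequence be exercised without closing any real windows.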
Should a restart not be needed or requested, Guardian Angel waits another 10 seconds and launches the application provided as an input parameter. No special provisions were made to handle input parameters for this application, so if some are required, the application should be run from a vbs script and the script called from Guardian Angel.
After launching the application it will set the application’s process priority. If this was not done, then the application would run in REALTIME. After another 100 second wait Guardian Angel will start monitoring again.
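For reference, the priority names accepted in the sixth parameter line up with the Windows priority classes. The Python sketch below maps each name to the corresponding Win32 constant value. The function name `priority_class` is hypothetical, whether Guardian Angel uses exactly these constants is not stated in this document, and mapping LOW to the idle priority class is an assumption.

```python
# Win32 priority class values, as documented for SetPriorityClass.
PRIORITY_CLASSES = {
    "REALTIME": 0x00000100,      # REALTIME_PRIORITY_CLASS
    "HIGH": 0x00000080,          # HIGH_PRIORITY_CLASS
    "ABOVE_NORMAL": 0x00008000,  # ABOVE_NORMAL_PRIORITY_CLASS
    "NORMAL": 0x00000020,        # NORMAL_PRIORITY_CLASS
    "BELOW_NORMAL": 0x00004000,  # BELOW_NORMAL_PRIORITY_CLASS
    "LOW": 0x00000040,           # assumption: IDLE_PRIORITY_CLASS
}


def priority_class(name=""):
    """Translate a parameter value such as 'above_normal' to its constant."""
    key = name.strip().upper() or "NORMAL"  # the documented default
    if key not in PRIORITY_CLASSES:
        raise ValueError("unknown priority: " + name)
    return PRIORITY_CLASSES[key]


if __name__ == "__main__":
    print(hex(priority_class("above_normal")))
    print(hex(priority_class()))
```

The lookup is case-insensitive because the examples in this document pass the priority in lower case while the table lists the names in upper case.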
1) Place GuardianAngel.exe in the Homeseer root folder or at any other desired location on the computer. The examples used in this document assume that it is located in the Homeseer root directory.
2) Add an event to Homeseer that runs periodically and creates the file \Data\GuardianAngel.pulse. The period must be shorter than the monitoring duration set up for Guardian Angel. Every 2 minutes should be fine. The following entry on the Script event tab will accomplish this objective:
&CreateObject("Scripting.FileSystemObject").CreateTextFile(hs.GetAppPath & "\Data\GuardianAngel.pulse")
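The other side of the heartbeat, deciding whether the pulse file has gone stale, can be sketched in Python as a comparison of the file's modification time with the monitoring duration. This is an illustration only: the function name `pulse_is_stale` is hypothetical, and treating a missing pulse file as stale is an assumption that this document does not confirm.

```python
import os
import time


def pulse_is_stale(path, max_age_seconds, now=None):
    """Return True if the pulse file is older than the allowed duration."""
    now = time.time() if now is None else now
    try:
        modified = os.path.getmtime(path)
    except OSError:
        # Assumption: a pulse file that does not exist counts as stale.
        return True
    return now - modified > max_age_seconds


if __name__ == "__main__":
    # Example path and duration; adjust to the actual Homeseer folder.
    pulse = r"C:\Program Files\Homeseer\Data\GuardianAngel.pulse"
    print(pulse_is_stale(pulse, 300))
```

With the default duration of 300 seconds and a pulse event every 2 minutes, the file is at most about 120 seconds old while Homeseer is healthy, which leaves room for one missed pulse before the file is judged stale.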
I used the following script file to test GuardianAngel operation. It sets a very low threshold and duration to assure that the trip condition will occur. The script loop will cause the trip. This script was called out of a manual homeseer event.
The Homeseer log and the Guardian Angel log can be observed. The Homeseer or computer restart will also be quite noticeable.
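sub main()
    ' Alternative: use the default action, which restarts the computer
    'hs.launch hs.GetAppPath & "\GuardianAngel.exe","homeseer|20|10"
    ' Trip at 20% CPU averaged over 10 seconds, then terminate and relaunch Homeseer
    hs.launch hs.GetAppPath & "\GuardianAngel.exe", _
        "homeseer|20|10|homeseer|c:\Program Files\Homeseer\Homeseer.exe|below_normal"
    ' Loop for 20 seconds so that the trip condition occurs
    for i = 1 to 20
        hs.writelog "wait",i
        hs.waitsecs 1
    next
end sub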
To monitor a process's presence, the following syntax may be appropriate. In this case notepad is monitored, so it is restarted after it has been stopped for 10 seconds.
Test.vbs
parm = "notepad.exe|0|10||notepad.exe|below_normal"
q = """"
CreateObject("Wscript.Shell").Run q & "C:\Program Files\Homeseer\GuardianAngel.exe" & q & " " & parm,0,0
Now open notepad and close it a few seconds later. Guardian Angel will reopen it 10 seconds later. Monitoring starts only after Guardian Angel has detected that the application has started running or is already running. After action is taken, the process must start again before it will be monitored.
http://homeseer.infopop.net/3/OpenTopic?a=tpc&s=697298074&f=5082900573&m=4482912735
Contains installUtilities & iniTool
Contains Header.asp, footer.asp, styles.asp
Contains DatabaseUtilities.inc
Contains Xlgraph.vbs, Xlgraph.ini
Contains Logit.txt
Contains spawn.txt
Contains ProcessPriority.exe
Contains homeseer.mdb
Contains hidden frame include
http://homeseer.infopop.net/3/OpenTopic?a=tpc&s=697298074&f=5082900573&m=2192993514
Contains motion.asp
Contains FileFromWeb to capture binary images/files from a web site
Contains modifications required to Audrey to support AudreyUtilities
Contains MessageScript.txt
Contains AudreyUtilities.inc