GUARDIAN ANGEL

 

 

Homeseer Support for Monitoring Processes and Taking Recovery Actions

By Michael McSharry

September 11, 2003

1      Application / Script Interface
1.1       Starting Guardian Angel
1.2       User Feedback
1.3       Use Considerations
1.4       Program Design
1.5       Guardian Angel Install Script
2      WEB INTERFACE
3      Install Instructions
4      INI Definitions
5      Usage / Test Instructions
6      Resources Required
6.1       Other Offerings


 


Guardian Angel is an executable that runs silently in a real-time (highest) priority process and monitors the behavior of other processes or of the computer overall.  It currently observes three conditions: the CPU utilization over a period of time for a particular process, the CPU utilization for the computer overall, and the absence of a process that is intended to be running.  When the threshold is reached, or the process disappears, the specified recovery actions are performed.  The recovery actions are typically to terminate the offending process and restart it, or to restart the computer.

 

Guardian Angel is a watchdog that monitors a process without any interaction with that process and without any external hardware additions.  It will detect a process that utilizes an excessive amount of CPU time and allow recovery from this situation.  It performs the recovery in a graceful manner by first asking the offending process to close itself down.  If that is not successful, it terminates the process in which the application is running.  If that is also not successful, it asks all applications to shut themselves down, terminates each one if needed, and then restarts Windows.

 

If Homeseer is one of the processes that Guardian Angel observes, then it will expect a heartbeat, which is the continued recreation of the file "\Data\GuardianAngel.pulse".  If this file becomes stale, then Guardian Angel will take the same recovery action as if the process had gone into a mode of high CPU utilization.
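The staleness test described above can be sketched in a few lines of Python.  This is only an illustration of the idea, not Guardian Angel's actual code: the function name `pulse_is_stale` is hypothetical, and treating a missing pulse file as stale is an assumption.

```python
import os
import time


def pulse_is_stale(pulse_path, max_age_seconds, now=None):
    """Return True when the pulse file is missing or older than max_age_seconds."""
    now = time.time() if now is None else now
    try:
        modified = os.path.getmtime(pulse_path)
    except OSError:
        # Assumption: a pulse file that cannot be read counts as a missed heartbeat.
        return True
    return (now - modified) > max_age_seconds
```

A monitor would call this periodically with the duration parameter as `max_age_seconds` and start the recovery actions when it returns True.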

 

Guardian Angel will not be successful at detecting a lockup condition in which CPU utilization is not an attribute of the lockup.  Guardian Angel has the advantage over a hardware watchdog circuit in that it will not abruptly stop a process without warning, as is the case with external watchdogs.  It also has the advantage that it can selectively terminate a single offending process without restarting Windows and/or the computer.  Its downside is that it depends upon the integrity of the Windows OS and is limited in the nature of failures that it will detect.

 

 

Table 1 Revision History

Rev   Date      Description
1.1   8-24-02   Allow application 15 seconds to close before terminating process.  Previously it was 1 second.
1.2   9-11-03   Retain PID for a window after initially determined
                Add ability to detect when a process has stopped running
      10-9-04   Add Pulse Monitoring

1         Application / Script Interface

1.1      Starting Guardian Angel

GuardianAngel.exe is started like any other application and accepts input parameters.  From Homeseer it can be launched from an event or from a script.  It can also be launched from Windows by including it in the Startup group or in another folder from which it can be started with a mouse double-click.

 

Guardian Angel optionally accepts a parameter string.  Each parameter in the string is separated by a “|”.  The parameters and their defaults are itemized below:

 

Table 2 Guardian Angel Parameters

1. Process to be Monitored
   Values:  Process name as defined by Windows performance monitoring.  This will usually be the common name for the application.
            An empty string ("") indicates the total of all processes.
            If the name contains a period (".") then the presence of this process is monitored rather than its CPU utilization.
   Default: ""

2. Threshold of CPU utilization
   Values:  Integer measurement as a percentage of total CPU use.
            If application presence is being monitored then this value can be anything since it is not used.
   Default: 95 (percent)

3. Duration for the threshold
   Values:  Number of seconds that the average utilization must be above the threshold for CPU monitoring.
            If application presence is being monitored then this is the number of consecutive seconds that the monitored process has been missing from the process list.
            It is also used for Pulse monitoring.  The file …Homeseer\Data\GuardianAngel.Pulse must be updated within this period to avoid an action.
   Default: 300 (seconds)

4. Window Title of process to be terminated
   Values:  A text string containing either part of the application's window title or the filename of the application.  A period (".") contained in the string indicates a filename; otherwise the string is treated as a window title.
            If a title, then all applications with this title segment will be terminated.  Use care if short strings are used because it may close down more windows than expected.
            "restart" is a special-case reserved word that indicates the computer is to be restarted.
            If monitoring is for a missing process then this field can be anything since it is not used.
   Default: restart

5. Application to Launch
   Values:  The application that will be launched 10 seconds after the termination action is taken.  An empty string indicates no action.
   Default: ""

6. Priority in which to run the Launched application
   Values:  REALTIME, HIGH, ABOVE_NORMAL, NORMAL, BELOW_NORMAL, LOW
   Default: normal
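As an illustration of how a parameter string maps onto the six fields in Table 2, here is a minimal Python sketch.  It is not Guardian Angel's parser: the function and field names are hypothetical, and it assumes that an omitted or empty field falls back to the documented default.

```python
FIELDS = ["process", "threshold", "duration", "window", "launch", "priority"]
DEFAULTS = ["", "95", "300", "restart", "", "normal"]


def parse_parameters(param_string=""):
    """Split a '|' separated parameter string and fill in the Table 2 defaults."""
    parts = param_string.split("|")
    values = {}
    for index, name in enumerate(FIELDS):
        raw = parts[index].strip() if index < len(parts) else ""
        # Assumption: an empty field means "use the default".
        values[name] = raw if raw else DEFAULTS[index]
    values["threshold"] = int(values["threshold"])
    values["duration"] = int(values["duration"])
    values["priority"] = values["priority"].upper()
    # Per Table 2, a period in the process name selects presence monitoring.
    values["presence_mode"] = "." in values["process"]
    return values
```

For example, `parse_parameters("notepad.exe||500||notepad.exe")` selects presence monitoring with a 500 second duration and notepad.exe as the application to launch.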

 

The following are examples of methods in which to start Guardian Angel and how parameters can be formed:

 

  1. From a vbs script file that can be included in the Startup group of Windows or double-clicked from Windows Explorer.  In this case Homeseer is monitored for 90% utilization over 500 seconds and the computer is restarted when it happens:

parm = "homeseer|90|500|restart"

q = """"

CreateObject("WScript.Shell").Run q & "C:\Program Files\Homeseer\GuardianAngel.exe" & q & " " & parm, 0, 0

 

  2. From a vbs script file that can be included in the Startup group of Windows or double-clicked from Windows Explorer.  In this case Homeseer is monitored for disappearing from the process list for 500 seconds and Homeseer.exe is started when it happens:

parm = "homeseer.exe||500||C:\Program Files\Homeseer 2\Homeseer.exe"

q = """"

CreateObject("WScript.Shell").Run q & "C:\Program Files\Homeseer\GuardianAngel.exe" & q & " " & parm, 0, 0

 

  3. From a script within Homeseer to monitor Homeseer for 90% utilization over a 2 minute period and to restart Homeseer when it occurs:

hs.launch hs.GetAppPath & "\GuardianAngel.exe", "homeseer|90|120|homeseer.exe|C:\Program Files\Homeseer\Homeseer.exe|above_normal"

 

  4. From a command line or Windows Explorer without any parameters to monitor the computer for 95% utilization over a 5 minute period and to restart the computer when this occurs:

"C:\Program Files\Homeseer\GuardianAngel.exe"

 

  5. Same as item 2, but to restart notepad.exe if it has been closed down for 500 seconds:

parm = "notepad.exe||500||notepad.exe"

q = """"

CreateObject("WScript.Shell").Run q & "C:\Program Files\Homeseer\GuardianAngel.exe" & q & " " & parm, 0, 0

 

1.2      User Feedback

Guardian Angel provides feedback in the Homeseer log when it is started.  This acknowledges what is being monitored and what will be done should the monitored situation arise.  It is a formatted echo of the input parameters, so the log shows at a glance whether Guardian Angel recognized the correct monitoring conditions.

 

Guardian Angel also maintains its own log “GuardianAngel.log” in the App path \Data folder.  This will normally be C:\Program Files\Homeseer\Data when GuardianAngel.exe is located in the Homeseer root directory.  It will contain the same invocation feedback, as well as feedback on each event and action taken.

1.3      Use Considerations

Care should be taken to understand the nature of the process that is being monitored.  Iterative scripts are either CPU hungry in their own right, or they make use of waitsec/waitevent, which has the effect of charging Homeseer for the time during the wait.  I have some CPU intensive applications and these are all run in processes separate from Homeseer.  While the overall CPU utilization is high, the Homeseer utilization numbers remain low.

 

The capability is provided with Guardian Angel to terminate selective processes.  What one does not know, however, is what bad things the offending process has done to not only itself, but to other resources running on the computer.  For mission critical applications it is probably best to restart the computer when the condition is detected.

 

I do not have a platform that generates high cpu utilization lockups so I really do not know what the threshold / duration profile should look like.  Likewise I could not actually evaluate it on a computer that actually did go into this undesired state.

1.4      Program Design

 

Guardian Angel makes extensive use of the Windows API to perform its stated objectives.  When loaded it changes its own priority to REALTIME to assure that it will not be locked out by the process that it is monitoring.

 

Guardian Angel has two modes of operation.  In the “Process Presence” mode it checks for the existence of the specified process 20 times within the duration specified.  If none of the 20 observations could locate the process then action is taken.
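The presence rule can be sketched as follows in Python.  This is only an illustration under stated assumptions: the names are hypothetical, the 20 checks are assumed to be evenly spaced across the duration, and how a process is actually looked up in the Windows process list is left out.

```python
CHECKS_PER_WINDOW = 20


def check_interval(duration_seconds):
    """Seconds between presence checks, assuming evenly spaced checks."""
    return duration_seconds / CHECKS_PER_WINDOW


def presence_action_needed(observations):
    """observations: sequence of booleans, True when the process was found.

    Action is needed only when the last 20 checks all failed to find it.
    """
    recent = list(observations)[-CHECKS_PER_WINDOW:]
    return len(recent) == CHECKS_PER_WINDOW and not any(recent)
```

With a duration of 500 seconds this gives one check every 25 seconds, and a single successful sighting among the last 20 checks is enough to hold off the recovery action.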

 

In the “CPU Utilization” mode, it creates an event counter for the process specified in the input parameter.  This counter is supported by the Windows performance interface.

It creates a timer with an interval of 1/20th of the duration specified in the input parameter.  At each timer interval it samples the current value of the event counter and maintains a history of the last 20 samples.  This history is averaged and compared with the input threshold.
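The sampling and averaging can be sketched in Python as a rolling window of 20 samples.  This is a sketch rather than the program's real code: the class name is hypothetical, reading the Windows performance counter is left to the caller, and two details are assumptions (the monitor waits for a full history before tripping, and an average equal to the threshold counts as a trip).

```python
from collections import deque


class UtilizationMonitor:
    """Rolling average of the last `samples` CPU readings against a threshold."""

    def __init__(self, threshold_percent, duration_seconds, samples=20):
        self.threshold = threshold_percent
        # The timer interval is 1/20th of the duration parameter.
        self.interval = duration_seconds / samples
        self.history = deque(maxlen=samples)

    def add_sample(self, percent):
        """Record one reading; return True when the averaged history trips."""
        self.history.append(percent)
        if len(self.history) < self.history.maxlen:
            return False  # assumption: no trip until the history is full
        average = sum(self.history) / len(self.history)
        return average >= self.threshold
```

Because the whole window is averaged, a short burst of 100% utilization does not trip the monitor, and a few idle samples do not immediately clear a sustained high load.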

 

When the threshold is reached it will send a message to the specified window requesting that the application that owns the window close itself down.  It will do this for all windows that have the text string input parameter contained in their title bar.  It allows 2 seconds for each window to be signaled and 15 seconds for the application to perform its closedown actions.  Any window that remains open will have its process terminated.  It may be the case that a closedown request will pop up a message box requesting user action.  This box will disappear when the process is terminated, so no interactive actions are needed.  If after another 15 seconds the process in which the application is running could not be terminated, then Windows will be commanded to restart.  Before this restart occurs, the above-described sequence of close messages followed by process termination, if required, is done for every window currently open on the computer.
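The escalation order can be sketched in Python with the Windows-specific steps passed in as functions.  This is a simplified illustration, not the actual implementation: all names are hypothetical, the 2 second per-window signaling timeout is omitted, and so is the shutdown of every other window before the restart.

```python
def recover(windows, request_close, is_open, terminate, restart_windows, sleep):
    """Close politely, then terminate, then restart Windows as a last resort."""
    for window in windows:
        request_close(window)          # ask each matching window to close
    sleep(15)                          # time allowed for closedown actions
    survivors = [w for w in windows if is_open(w)]
    if not survivors:
        return "closed"
    for window in survivors:
        terminate(window)              # terminate the owning process
    sleep(15)                          # time allowed for termination
    if any(is_open(w) for w in survivors):
        restart_windows()
        return "restart"
    return "terminated"
```

Passing the actions in as callables keeps the ordering and the two 15 second waits visible, and lets the sequence be exercised without touching real processes.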

 

Should a restart not be needed or requested, then Guardian Angel waits another 10 seconds and launches the application provided as an input parameter.  No special provisions were made to handle input parameters for this application, so if some are required, the application should be run from a vbs script and that script called from Guardian Angel.

 

After launching the application it will set the application’s process priority.  If this was not done, then the application would run in REALTIME.  After another 100 second wait Guardian Angel will start monitoring again.
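For reference, the priority names accepted as parameter 6 correspond to the Windows priority class values shown in this Python sketch.  The numeric values are the standard Windows API constants; the function name is hypothetical, and two points are assumptions: that LOW maps to the Windows idle priority class, and that an unrecognized name falls back to NORMAL.

```python
# Windows priority class values as defined by the Windows API.
PRIORITY_CLASSES = {
    "REALTIME": 0x00000100,
    "HIGH": 0x00000080,
    "ABOVE_NORMAL": 0x00008000,
    "NORMAL": 0x00000020,
    "BELOW_NORMAL": 0x00004000,
    "LOW": 0x00000040,  # assumption: LOW means IDLE_PRIORITY_CLASS
}


def priority_class(name="normal"):
    """Map a parameter 6 name (any letter case) to a priority class value."""
    key = name.strip().upper()
    # Assumption: unknown names fall back to NORMAL.
    return PRIORITY_CLASSES.get(key, PRIORITY_CLASSES["NORMAL"])
```

On Windows, a value like this can be supplied as `creationflags` to Python's `subprocess.Popen` so that a launched child does not inherit the launcher's priority.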

1.5      Guardian Angel Install Script

None

2         WEB INTERFACE

None

3         Install Instructions

 

1)      Place GuardianAngel.exe in the Homeseer root folder or at any desired location on the computer.  The examples used in this document assume that it is located in the Homeseer root directory.

2)       Add an event to Homeseer that runs periodically and creates the file \Data\GuardianAngel.pulse.  The event must run more often than the monitoring duration set up for Guardian Angel.  Every 2 minutes should be fine.  The following entry on the Script event tab will accomplish this objective: &CreateObject("Scripting.FileSystemObject").CreateTextFile(hs.GetAppPath & "\Data\GuardianAngel.pulse")

4         INI Definitions

None

5         Usage / Test Instructions

 

I used the following script file to test GuardianAngel operation.  It sets a very low threshold and duration to assure that the trip condition will occur.  The script loop will cause the trip.  This script was called out of a manual homeseer event.

 

The Homeseer log and the Guardian Angel log can be observed.  The Homeseer or computer restart will also be quite noticeable.

 

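sub main()

' Alternative: with only three parameters the default action restarts the computer
'hs.launch hs.GetAppPath & "\GuardianAngel.exe","homeseer|20|10"

' Monitor homeseer for 20% utilization over 10 seconds, then close and relaunch it
hs.launch hs.GetAppPath & "\GuardianAngel.exe","homeseer|20|10|homeseer|c:\Program Files\Homeseer\Homeseer.exe|below_normal"

' Loop for 20 seconds so that the trip condition occurs
for i = 1 to 20
  hs.writelog "wait",i
  hs.waitsecs 1
next

end sub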

 

 

To use it to monitor a process's presence, the following syntax may be appropriate.  In this case notepad is monitored and is restarted after it has been stopped for 10 seconds.

 

Test.vbs

 

parm = "notepad.exe|0|10||notepad.exe|below_normal"

q = """"

CreateObject("Wscript.Shell").Run q & "C:\Program Files\Homeseer\GuardianAngel.exe" & q & " " & parm,0,0

 

Now open notepad and a few seconds later close it.  Guardian Angel will reopen it 10 seconds later.  Monitoring will start only after it has detected the application has started running or is already running.  After action is taken the process must again start before it will be monitored.
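The arming behavior described above (no monitoring until the process has been seen running, and again after each action) can be sketched as a small Python state holder.  The class and method names are hypothetical and this is only one way to model the behavior.

```python
class PresenceWatch:
    """Count a missing process only after it has been seen running."""

    def __init__(self):
        self.armed = False

    def observe(self, running):
        """Return True when this observation counts as a missed check."""
        if running:
            self.armed = True   # the process was seen, so start monitoring
            return False
        return self.armed       # absences before the first sighting are ignored

    def action_taken(self):
        """Disarm after recovery; the process must be seen again first."""
        self.armed = False
```

This is why, in the test above, Guardian Angel does nothing if notepad was never opened, and why it waits for the relaunched notepad to appear before it starts counting missed checks again.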

6         Resources Required

None

 

 

6.1      Other Offerings

 

http://homeseer.infopop.net/3/OpenTopic?a=tpc&s=697298074&f=5082900573&m=4482912735

  Contains installUtilities & iniTool

  Contains Header.asp, footer.asp, styles.asp

  Contains DatabaseUtilities.inc

  Contains Xlgraph.vbs, Xlgraph.ini

  Contains Logit.txt

  Contains spawn.txt

http://homeseer.infopop.net/3/OpenTopic?a=tpc&s=697298074&f=5082900573&m=4642914705&r=4642914705#4642914705

  Contains ProcessPriority.exe

  Contains homeseer.mdb

  Contains hidden frame include

http://homeseer.infopop.net/3/OpenTopic?a=tpc&s=697298074&f=5082900573&m=2192993514

  Contains motion.asp

  Contains FileFromWeb to capture binary images/files from a web site

  Contains modifications required to Audrey to support AudreyUtilities

  Contains MessageScript.txt

http://homeseer.infopop.net/3/OpenTopic?a=tpc&s=697298074&f=726296174&m=3812973606&r=2092926416#2092926416

  Contains AudreyUtilities.inc