Opened 9 years ago
Last modified 9 years ago
#1317 new defect
Event for "unheard": send email
Reported by: | Reuti | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | sge | Version: | 6.2u5 |
Severity: | minor | Keywords: | monitoring |
Cc: |
Description
When an exechost goes into the "unheard" state, an email should be sent to the administrator who is configured in SGE. This could certainly be done with a cron job checking qhost or qstat -f, but since the event has already taken place in SGE, why not also send an email, as is already done for crashed jobs?
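For reference, the cron-job workaround mentioned above could look roughly like the following. This is only a sketch: it assumes the usual `qstat -f` column layout (queue instance, qtype, slots, load_avg, arch, optional states) and that the `u` state letter marks an unheard host. The function name `unheard_hosts` and the sample output are made up for illustration.

```python
# Illustrative qstat -f output (hypothetical hosts, assumed column layout).
SAMPLE = """\
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q@node01                   BIP   0/2/8          0.51     lx24-amd64
---------------------------------------------------------------------------------
all.q@node02                   BIP   0/0/8          -NA-     lx24-amd64    au
"""


def unheard_hosts(qstat_output):
    """Return the sorted host names whose queue instances carry the 'u' state."""
    hosts = set()
    for line in qstat_output.splitlines():
        fields = line.split()
        # Queue instance lines look like "queue@host ..." and only have a
        # sixth column when at least one state letter is set.
        if len(fields) != 6 or "@" not in fields[0]:
            continue
        if "u" in fields[5]:
            hosts.add(fields[0].split("@", 1)[1])
    return sorted(hosts)


print(unheard_hosts(SAMPLE))
```

In a real cron job the text would come from running `qstat -f` (for example via `subprocess.run(["qstat", "-f"], capture_output=True, text=True).stdout`) rather than from a fixed string.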
This should batch the checks across all hosts to avoid mail storms in cases such as
the file server of a stateless cluster going down.
#1322 is related.
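To illustrate the batching idea: instead of one mail per host, a checker could collect every unheard host found in one pass and send a single digest. The sketch below is not how SGE does (or would do) it; the function name `build_digest` and the addresses are hypothetical, and it only builds the message without sending it.

```python
from email.message import EmailMessage


def build_digest(hosts, admin, sender="sge@localhost"):
    """Build one mail covering all unheard hosts, or None if there are none."""
    if not hosts:
        return None
    msg = EmailMessage()
    msg["From"] = sender  # hypothetical sender address
    msg["To"] = admin     # hypothetical administrator address
    msg["Subject"] = f"SGE: {len(hosts)} exechost(s) unheard"
    # One body listing every affected host, sorted for readability.
    msg.set_content("Unheard exechosts:\n" + "\n".join(sorted(hosts)))
    return msg


digest = build_digest(["node02", "node01"], "admin@example.org")
print(digest["Subject"])
```

The resulting message could then be handed to `smtplib.SMTP(...).send_message(digest)`, so that a whole cluster going unheard at once produces one mail rather than one per host.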