[GE users] Reason for job abort in SGE emails

Reuti reuti at staff.uni-marburg.de
Tue Jul 8 18:59:57 BST 2008


Hi,

as an update to the mail wrapper I posted a few days ago: it seems
that some of the jobs don't have the TASK_ID attached. Hopefully this
wrapper catches more of the error messages:


#!/bin/sh

# $2 is the mail subject and $3 the recipient, as passed on to mail below.
JOB_ID=`echo "$2" | cut -d " " -f 2`
CONDITION=`echo "$2" | cut -d " " -f 4`

# Replace with the location of the messages files of your nodes
appendix=`grep -E "[|]job $JOB_ID([.][[:digit:]]+)? exceed" /var/spool/sge/$HOSTNAME/messages | head -n 1`
if [ -z "$appendix" ]; then
     appendix="Unknown, no entry found in messages file on the master node of the job."
fi

if [ "$CONDITION" = "Aborted" ]; then
     # Pass the original mail body through and append the reason
     (cat; echo; echo "Reason for job abort:"; echo "$appendix") | mail -s "$2" "$3"
else
     mail -s "$2" "$3"
fi
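
To see what the wrapper extracts without sending any mail, here is a
small dry run. The subject line and the three messages-file entries are
made up for illustration; the subject format is only what the field
positions in the script imply, and the variable names subject,
sample_messages and reason are not part of the wrapper. Real entries on
your nodes may look different.

```shell
#!/bin/sh

# Hypothetical subject line, shaped so that field 2 is the job id and
# field 4 the condition, as the wrapper assumes.
subject='Job 4711 (myjob) Aborted'

# Hypothetical messages-file entries, for illustration only.
sample_messages='07/08/2008 18:40:01|execd|node01|W|job 47110 exceeds some limit
07/08/2008 18:41:02|execd|node01|W|job 4711.1 exceeds some limit
07/08/2008 18:42:03|execd|node01|W|job 4711.2 exceeds some limit'

JOB_ID=`echo "$subject" | cut -d " " -f 2`
CONDITION=`echo "$subject" | cut -d " " -f 4`

# Same pattern as in the wrapper: the task id part is optional, and the
# space before "exceed" keeps job 47110 from matching job 4711.
reason=`printf '%s\n' "$sample_messages" | grep -E "[|]job $JOB_ID([.][[:digit:]]+)? exceed" | head -n 1`

echo "job id:    $JOB_ID"
echo "condition: $CONDITION"
echo "reason:    $reason"
```

Because of head -n 1, only the first matching entry is reported, so for
an array job the reason may belong to a different task than the one the
mail is about.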


-- Reuti
