[GE users] shepherd exited with exit status = 8

Vladimir Vuksan vlists at veus.hr
Wed Apr 13 21:57:45 BST 2005



I am submitting a simple sleep job via qsub, i.e.

qsub ./sleep.sh

where sleep.sh contains

#!/bin/sh
sleep 1000

The job runs with no problems on an Opteron system (we have only one),
but on the i386 boxes it fails with:

04/13/2005 14:47:22|qmaster|monsterdell|W|job 27.1 failed on host titan.domain general in prolog because: 04/13/2005 14:47:21 [0:27642]: exit_status of prolog = 127
04/13/2005 14:47:22|qmaster|monsterdell|W|rescheduling job 27.1
04/13/2005 14:47:22|qmaster|monsterdell|E|queue titan.q marked QERROR as result of job 27's failure at host titan.domain

The execd log on that host says:

04/13/2005 14:47:21|execd|titan|E|shepherd of job 27.1 exited with exit status = 8
04/13/2005 14:47:21|execd|titan|W|reaping job "27" ptf complains: Job does not exist

Sge_prolog contains the following:

#!/bin/sh

if [ ! -k /scratch ]; then
        exit 20
fi

if [ ! -e /usr/bin/ruby ]; then
        exit 30
fi

if [ `ls -l /etc/alternatives/awk | grep -c gawk` -ne 1 ]; then
        exit 40
fi

echo ""
echo "SGE ##########################################################################"
echo "SGE                       Job prolog inserted by SGE"
echo "SGE Job started on `/bin/hostname -f` at `date "+%Y-%m-%d %H:%M:%S"`"
echo "SGE ##########################################################################"
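For reference, here is a minimal sketch of how the prolog's exit statuses could be told apart when running it by hand on a failing host. The function name explain_prolog_status is made up for illustration. Codes 20, 30 and 40 are the ones used in the prolog above. The meanings given for 126 and 127 are the usual shell conventions, and it is only an assumption that the shepherd passes them through unchanged.

```shell
#!/bin/sh
# Hypothetical helper: map a prolog exit status to a likely cause.
# 20/30/40 come from the prolog script itself; 126 and 127 are the
# usual shell conventions (an assumption as far as SGE is concerned).
explain_prolog_status() {
    case "$1" in
        0)   echo "prolog succeeded" ;;
        20)  echo "/scratch is missing or has no sticky bit" ;;
        30)  echo "/usr/bin/ruby does not exist" ;;
        40)  echo "/etc/alternatives/awk does not point to gawk" ;;
        126) echo "prolog was found but could not be executed" ;;
        127) echo "prolog (or a command it needs) was not found" ;;
        *)   echo "unexpected exit status $1" ;;
    esac
}

# Example use on the failing host (the path is a placeholder):
#   /path/to/Sge_prolog; explain_prolog_status $?
```

Since 127 is not one of the codes the prolog sets itself, this would suggest the script is not being reached at all on the i386 boxes, rather than one of its checks failing.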

Any clues?

Thanks,

Vladimir
