[GE users] Jobs in "hold" state disappear. Debugging help?

gutnik gutnik at gmail.com
Wed Mar 24 20:18:18 GMT 2010


I'm using a CAD tool (Cadence) that is somewhat integrated with SGE.
In one situation, it launches a number of simulations and one
"cleanup" job. I see the simulation jobs submitted, and I see the
cleanup job submitted with a hold that depends on the simulation
jobs. Great.
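
For readers unfamiliar with this kind of dependency: in SGE it is normally expressed with qsub's -hold_jid option, which takes a comma-separated list of job IDs. The sketch below only builds such a command line. How Cadence actually submits the cleanup job is not known here, and the script name cleanup.sh and the job IDs are hypothetical.

```python
def build_cleanup_command(sim_job_ids, script="cleanup.sh"):
    """Build a qsub argv that holds `script` until all sim_job_ids finish.

    `script` is a hypothetical name; the real tool may submit differently.
    """
    if not sim_job_ids:
        raise ValueError("at least one simulation job ID is required")
    # -hold_jid expects a comma-separated list of job IDs (or job names).
    hold_list = ",".join(str(job_id) for job_id in sim_job_ids)
    return ["qsub", "-hold_jid", hold_list, script]
```

Running qstat -j on the cleanup job's ID should then list the jobs it is waiting on.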

What I see is that the simulations finish and, about 30 seconds
later, the cleanup job is removed from the queue without ever being
run. So I have two questions:

1) The scheduling interval is 5 seconds, as the top two lines of
qconf -ssconf output show:

algorithm                         default
schedule_interval                 0:0:05

Why does it take 25-40 seconds (I've timed it approximately) for the
job to disappear, rather than 5 (or 10, say, if there's some
fencepost issue)? qstat shows the job in hold state for the entire time.
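
To make the comparison concrete, here is a minimal sketch that reads the two-column qconf -ssconf output and converts schedule_interval to seconds. It assumes the value is always in H:M:S form as shown above; the function names are illustrative only.

```python
def parse_ssconf(text):
    """Parse 'name value' lines from qconf -ssconf output into a dict."""
    settings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def interval_to_seconds(value):
    """Convert an H:M:S string such as '0:0:05' to seconds."""
    fields = value.strip().split(":")
    if len(fields) != 3:
        raise ValueError("expected H:M:S, got %r" % value)
    hours, minutes, seconds = (int(f) for f in fields)
    return hours * 3600 + minutes * 60 + seconds
```

With the two lines quoted above, interval_to_seconds(parse_ssconf(text)["schedule_interval"]) gives 5, so the observed 25-40 seconds corresponds to roughly five to eight scheduling intervals.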

2) Why is the job being removed? One possibility (from the manual) is
that the simulations are exiting with code 100. Is that the only
possibility? Is there more logging somewhere that shows which job
returned which exit code, or why the cleanup job was removed from
the queue?
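
One place per-job exit codes are normally recorded is the accounting data, readable with qacct -j once a job has finished. The sketch below assumes accounting is enabled and that qacct prints "name value" lines with records separated by lines of "=" characters. The helper names are illustrative. The qmaster messages file in the qmaster spool directory may also say why a job was deleted, depending on the configured log level.

```python
import subprocess


def parse_qacct(text):
    """Split qacct -j output into one dict per accounting record."""
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("====="):  # record separator
            if current:
                records.append(current)
                current = {}
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            current[parts[0]] = parts[1].strip()
    if current:
        records.append(current)
    return records


def exit_statuses(job_id):
    """Return (taskid, exit_status, failed) for each record of a job.

    Requires qacct on PATH; raises CalledProcessError if qacct fails,
    for example when the job has no accounting record yet.
    """
    result = subprocess.run(
        ["qacct", "-j", str(job_id)],
        capture_output=True, text=True, check=True,
    )
    return [
        (rec.get("taskid"), rec.get("exit_status"), rec.get("failed"))
        for rec in parse_qacct(result.stdout)
    ]
```

Calling exit_statuses() for each simulation job ID would show whether any of them returned 100 or was recorded as failed.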

Thanks.

  Vadim

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=251199
