[GE users] Jobs in "hold" state disappear. Debugging help?

reuti reuti at staff.uni-marburg.de
Wed Mar 24 20:24:56 GMT 2010


On 24.03.2010 at 21:18, gutnik wrote:

> I'm using a CAD tool (Cadence) that is somewhat integrated with SGE.
> In one situation, it launches a number of simulations, and one
> "cleanup" job. I see the simulation jobs submitted, and I see the
> cleanup job submitted with a hold that depends on the simulation
> jobs. Great.
>
> What I see is that the simulations finish, and about 30 seconds
> later, the cleanup job is removed from the queue without ever being
> run. So, I have two questions:

Removed from `qstat` or within Cadence?

>
> 1) The scheduling interval is 5 seconds, as the top two lines from
> qconf -ssconf show.
> algorithm                         default
> schedule_interval                 0:0:05
>
> Why does it take 25-40 seconds (I've timed it approximately) for the
> job to disappear, rather than 5 (or 10, say, if there's some
> fencepost issue)? qstat shows the job "in hold state" for the entire
> time.
>
> 2) Why is the job being removed? One possibility (from the manual) is
> that the simulations are exiting with code 100. Is that

Then it's not removed; instead the job is put into error state.

To investigate this you can use:

$ qstat -s z

$ qacct -j <job_id_of_cleaner>
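
To see which job returned which code, the accounting record can be trimmed to the fields that matter here. The following is only a sketch: the helper name `job_exit_summary` is made up for illustration, and it assumes the usual "key value" line layout of `qacct -j` output, in which `failed` and `exit_status` appear as separate lines.

```shell
# Hypothetical helper: keep only the job number, failure field and exit
# status from `qacct -j` output supplied on stdin.
# Assumes one "key value" pair per line, as qacct normally prints.
job_exit_summary() {
    awk '$1 == "jobnumber" || $1 == "failed" || $1 == "exit_status" { print }'
}

# Usage (the job ID is a placeholder):
#   qacct -j <job_id> | job_exit_summary
```

Running this for each simulation job as well as for the cleanup job should show whether any of them exited with code 100 or another non-zero status.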

You can even define in a ~/.sge_request file that you get an email for
each started/ended/aborted job (the option will be attached to each job
even when you have no chance to enter such an option in your application).
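
As a rough sketch of that setup: the default request file takes the same options you would pass to `qsub`, so `-m bea` (mail at begin, end and abort) together with `-M <address>` should do it. The function name `add_mail_defaults` and the address are placeholders, not anything SGE provides.

```shell
# Hypothetical helper: append mail-notification defaults to an SGE
# request file. $1 is the file (normally ~/.sge_request), $2 the
# recipient address.
add_mail_defaults() {
    request_file=$1
    address=$2
    # -m bea: send mail when the job begins, ends, or is aborted
    # -M:     where the mail goes
    printf '%s\n' '-m bea' "-M $address" >> "$request_file"
}

# Usage (placeholder address):
#   add_mail_defaults "$HOME/.sge_request" you@example.com
```

Jobs submitted afterwards from that account, including those launched by the application, should pick up these defaults.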

-- Reuti


> the only possibility? Is there more logging I can see somewhere for
> which job returned with which code, or why the cleanup
> job was removed from the queue?
>
> Thanks.
>
> Vadim
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=251199
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=251200
