[GE users] sched_job_info problems

reuti reuti at staff.uni-marburg.de
Thu Aug 6 11:06:32 BST 2009


Hi,

On 06.08.2009 at 00:48, skylar2 wrote:

> We're running into problems with GE 6.1u6 where having schedd_job_info
> enabled makes sge_schedd eat up all the RAM on the system when we have
> users submit jobs with PE slot range requests. This appears to be an
> issue that is reported here and supposedly fixed:
>
> http://gridengine.sunsource.net/issues/show_bug.cgi?id=2187
>
> Disabling schedd_job_info makes the problem go away, but our users
> depend on schedd_job_info output to debug their jobs. Does anyone know
> of a workaround for this problem?

What do you mean by "debug"? Usually it is users new to a cluster who
request too many resources and then want to find out why their jobs
aren't starting. One test, instead of relying on schedd_job_info,
could be:

$ qalter -w v <jobid>

which reports whether there are any suitable queues at all, or "-w p"
to get a reply under the current load. In special cases I turn
schedd_job_info on for a few minutes for investigation and then turn
it off again.
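
The on-for-a-few-minutes approach could be scripted roughly as below. This is a minimal sketch, not a tested recipe: it assumes a manager account, that "qconf -ssconf" prints a line starting with "schedd_job_info", and that "qconf -Msconf <file>" accepts the edited output. The function names are hypothetical, invented for this illustration.

```shell
#!/bin/sh
# Sketch: enable schedd_job_info briefly, inspect one job, then disable it.

# Rewrite the schedd_job_info line of a scheduler configuration read on stdin.
# $1 is the new value, e.g. "true" or "false". All other lines pass through.
set_schedd_job_info() {
    sed "s/^schedd_job_info[[:space:]].*/schedd_job_info $1/"
}

# Hypothetical helper: push the changed setting through qconf.
# Assumes manager rights and that qconf is on the PATH.
apply_schedd_job_info() {
    tmp=$(mktemp) || return 1
    qconf -ssconf | set_schedd_job_info "$1" > "$tmp" && qconf -Msconf "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}

# Hypothetical helper: $1 is the job id, $2 the number of seconds to wait
# so that at least one scheduler run happens with the setting enabled.
inspect_job() {
    apply_schedd_job_info true || return 1
    sleep "${2:-60}"
    qstat -j "$1"
    apply_schedd_job_info false
}
```

Calling "inspect_job <jobid> 60" would then leave the setting enabled only for about a minute, which should limit how long sge_schedd collects the extra information.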

-- Reuti
