[GE users] Queue on Error state

Reuti reuti at staff.uni-marburg.de
Tue Jul 29 12:30:02 BST 2008


Hi,

On 29.07.2008 at 13:18, Fco. Javier Modrego wrote:

> I frequently find my queues in error state, and new jobs cannot
> start. An example of the error messages is below. The problem
> seems to be (I think...) that a preceding job has erased all the
> content (files and subdirectories) of the local spooling directory
> on the compute nodes (/tmp/sge in my installation), so the
> spooling files for new jobs cannot be created. I think I
> understand what is happening, but I have no clue how to solve it.
> My main suspects are parallel Turbomole jobs, but I cannot find
> anything in their scripts that would explain this behaviour. I would
> be grateful if anybody with experience integrating Turbomole with SGE
> could give me a hand... Maybe a symbol in conflict with SGE and an
> assassin "rm"... I have no clue.
> Also, clearing the error state is not just a matter of running
> qmod -cq, as it doesn't work straight away. The daemons on the node
> are still running and must be killed, and then the queue stopped
> and started...
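
To confirm that the spool tree really is being wiped, a small check run on the node before and after a suspect job may help. The following is only a sketch: /tmp/sge and active_jobs come from the report above, while the per-host subdirectory (here "nodo01") and the other two directory names (jobs, job_scripts) are assumptions about the execd spool layout and should be adjusted to what the installation actually contains.

```python
import os

# Only "active_jobs" appears in the quoted error message; "jobs" and
# "job_scripts" are assumed names, adjust them to the real spool layout.
EXPECTED_SUBDIRS = ("active_jobs", "jobs", "job_scripts")


def missing_spool_dirs(spool_root, host, expected=EXPECTED_SUBDIRS):
    """Return the expected spool subdirectories that do not exist.

    Assumes the layout <spool_root>/<host>/<subdir>, which is a guess.
    """
    host_dir = os.path.join(spool_root, host)
    return [name for name in expected
            if not os.path.isdir(os.path.join(host_dir, name))]


if __name__ == "__main__":
    # /tmp/sge is the local spool directory named in the report;
    # "nodo01" is a hypothetical per-host subdirectory name.
    print(missing_spool_dirs("/tmp/sge", "nodo01"))
```

An empty list means every expected directory is present; anything else names what has gone missing since the last check.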

Which version of Turbomole?

-- Reuti


> 	Thanks in advance
> 	F.J. Modrego
>
>
> Note: the installed version of SGE is 6.1u4
>
>
>
> 07/29/2008 05:59:29|qmaster|ml350|W|job 1304.1 failed on host nodo01.localdomain general assumedly before job because: can't create directory active_jobs/1304.1: No such file or directory
> 07/29/2008 05:59:29|qmaster|ml350|W|rescheduling job 1304.1
> 07/29/2008 05:59:29|qmaster|ml350|E|queue larga marked QERROR as result of job 1304's failure at host nodo01.localdomain
> 07/29/2008 05:59:29|qmaster|ml350|W|queue "larga@nodo01.localdomain" is marked QERROR
>
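
To see which jobs put a queue into error state, and hence which application to suspect, the qmaster messages can be scanned for the QERROR lines. This is a minimal sketch that assumes the pipe-separated format seen in the excerpt above (timestamp|daemon|host|level|message) with each entry on a single line; the function name and the regular expression are illustrative, not part of SGE.

```python
import re

# Pattern taken from the wording of the quoted "E" message.
QERROR_RE = re.compile(
    r"queue (\S+) marked QERROR as result of job (\d+)'s failure at host (\S+)"
)


def find_qerror_events(lines):
    """Return (timestamp, queue, job_id, exec_host) for each QERROR entry."""
    events = []
    for line in lines:
        # Split into at most 5 fields so '|' inside the message survives.
        fields = line.rstrip("\n").split("|", 4)
        if len(fields) != 5:
            continue  # not a log entry in the assumed format
        timestamp, _daemon, _host, level, message = fields
        match = QERROR_RE.search(message)
        if level == "E" and match:
            queue, job_id, exec_host = match.groups()
            events.append((timestamp, queue, int(job_id), exec_host))
    return events


sample = [
    "07/29/2008 05:59:29|qmaster|ml350|W|rescheduling job 1304.1",
    "07/29/2008 05:59:29|qmaster|ml350|E|queue larga marked QERROR as "
    "result of job 1304's failure at host nodo01.localdomain",
]
print(find_qerror_events(sample))
```

Comparing the job ids returned here with the accounting records of the jobs that ran on the same host just beforehand should show whether the Turbomole jobs are always the predecessors.
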
> -- 
>  Dr. F.J. Modrego
>  Department of Inorganic Chemistry
>  Facultad de Ciencias
>  University of Zaragoza
>  50009 ZARAGOZA
>  SPAIN
>  Tel <34>-976-762288
>  Fax <34>-976-761187
>  E-mail:  modrego at unizar.es
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net

