[GE users] SGE 6 - queues entering error state

Bevan C. Bennett bevan at fulcrummicro.com
Thu Aug 10 23:56:01 BST 2006



Reuti wrote:
> On 10.08.2006 at 22:32, Bevan C. Bennett wrote:
> 
>> Reuti wrote:
>>> On 10.08.2006 at 20:24, Bevan C. Bennett wrote:
>>>
>>>>
>>>>> can you please post your queue, sge and exechost configuration.
>>>>
>>>> Which parts of it?
>>>
>>> The first few lines from the SGE conf, where the spool directories are
>>> defined. Maybe you also have a local configuration for some of your hosts
>>> (qconf -sconfl)? And is there a prolog/epilog in any of them?
>>
>> It's pretty basic...
>> [bevan@alexander ~]$ qconf -sconf
>> global:
>> execd_spool_dir              /usr/local/grid-6.0/default/spool
> 
> But this is different from the path mentioned below:
> 
> /mnt/local/common/grid-test/default/spool/cobalt/active_jobs/2313.1/pid

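Actually, it's not different. Both paths point at the same directory through
two symlinks:
/usr/local -> /mnt/local/OS
/mnt/local/OS/grid-6.0 -> /mnt/local/common/grid-test

The second path is just the fully resolved form of the first; the daemon was
showing off that it had figured out where it really was.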
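
For anyone following along, here is a minimal sketch of how the two paths collapse into one. It rebuilds the same link layout inside a throwaway directory (the temporary root and the variable names are assumptions for illustration, not the real hosts' filesystem) and asks the shell for the physical path with `pwd -P`:

```shell
# Hypothetical sandbox: recreate the two symlinks under a temp root
# instead of touching the real /usr/local and /mnt/local.
root=$(mktemp -d)
root=$(cd "$root" && pwd -P)   # canonicalize in case the temp dir is itself behind a symlink

mkdir -p "$root/usr" \
         "$root/mnt/local/OS" \
         "$root/mnt/local/common/grid-test/default/spool"

# /usr/local -> /mnt/local/OS
ln -s "$root/mnt/local/OS" "$root/usr/local"
# /mnt/local/OS/grid-6.0 -> /mnt/local/common/grid-test
ln -s "$root/mnt/local/common/grid-test" "$root/mnt/local/OS/grid-6.0"

# The path as configured in execd_spool_dir (relative to the sandbox root).
configured="$root/usr/local/grid-6.0/default/spool"

# pwd -P prints the physical directory with every symlink resolved.
resolved=$(cd "$configured" && pwd -P)

# Strip the sandbox prefix to compare with the path from the error message.
echo "${resolved#"$root"}"
# prints /mnt/local/common/grid-test/default/spool

rm -rf "$root"
```

On the real hosts, the same check would be `cd` into the configured spool directory followed by `pwd -P`.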

> 
>> shell_start_mode             posix_compliant
> 
> Often "unix_behavior" is easier for the users, but that depends, of course,
> on your needs.

It's just something I don't want to change until I can figure this out.
Any ideas?
Is there a way to at least keep jobs that push a queue into the error state
from requeueing themselves and taking out the whole system?
