[GE users] SGE6 does not backfill

Stephan Grell - Sun Germany - SSG - Software Engineer stephan.grell at sun.com
Wed Apr 13 13:24:02 BST 2005





Juha Jäykkä wrote:

>>The reports we got were on 6.0u3 and u4, and it was always Linux
>>on amd64 machines.
>>    
>>
>
>We have linux 2.4. Can I change the file descriptor limit? Does sge
>inherit it from the shell which starts it or something? I hope it does not
>use some hard-coded, unchangeable value.
>
It inherits them from the shell.
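
As a rough illustration (this is not SGE code): the file descriptor limit is an ordinary per-process resource limit, and a child process starts with the limits of its parent. So whatever `ulimit -n` reports in the shell that starts the daemons is what they should see. The Python sketch below shows the inheritance only; `child_nofile_limit` is a hypothetical helper name, and the value 256 is an arbitrary example.

```python
import resource
import subprocess
import sys


def child_nofile_limit(soft_limit):
    """Set this process's soft fd limit, then report what a child process sees.

    Returns (limit_set_in_parent, limit_seen_by_child).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit may never exceed the hard limit.
    if hard == resource.RLIM_INFINITY:
        new_soft = soft_limit
    else:
        new_soft = min(soft_limit, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    try:
        # The child is started after the limit was changed, so it inherits it.
        out = subprocess.check_output(
            [sys.executable, "-c",
             "import resource; "
             "print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"]
        )
    finally:
        # Restore the original soft limit in the parent.
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
    return new_soft, int(out)


if __name__ == "__main__":
    parent, child = child_nofile_limit(256)
    print("parent soft limit:", parent, "child soft limit:", child)
```

The same reasoning suggests changing the limit in the shell (or startup script) before the daemons are started. Going above the hard limit usually needs root.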

>
>  
>
>>>If the job goes to unknown state, could I not use reschedule_unknown to
>>>reschedule it after it fails?
>>>      
>>>
>>Hm, good idea. We should try it.
>>    
>>
>
>Tried this, no use. Did not help.
>
>  
>
>>qping -dump master_host $SGE_QMASTER_PORT qmaster 1
>>
>>This way you will see all communications between the qmaster and the
>>clients (scheduler, execd, ...)
>>    
>>
>
>Ok, I did this. I attached the qping log, but looking at it I could not
>make heads or tails out of it. I hope you can. During the whole time of
>the qping run, I submitted 10 large (whole cluster) parallel jobs, out of
>which 4 failed.
>
I will take a look at it. Can you tell me on which hosts the jobs
failed? That would be very important information. The qmaster
messages file would also be helpful.

>
>This is a rather frustrating bug. It does not undermine the whole
>package, but makes it quite unreliable.
>
>Is there some other debugging info I can provide you? Running some of the
>daemons with some debug-switches perhaps?
>
>  
>
>------------------------------------------------------------------------
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>For additional commands, e-mail: users-help at gridengine.sunsource.net
>  
>





