[GE users] Jobs waiting due to loss of resources

reuti reuti at staff.uni-marburg.de
Tue Dec 29 21:05:46 GMT 2009


On 29.12.2009 at 19:20, fanou wrote:

>> "Load value" just means it's a value that was reported from a load
>> sensor.  I agree that it sounds like you have a load sensor that's
>> misbehaving.  The load sensor for a global value would not be set in
>> the global host config (qconf -sconf).  It would instead be set for
>> one specific host.  (Yes, it sounds illogical, but it actually makes
>> sense if you think it through.)  Check the "qconf -sconf <host>"
>> output for all your machines, e.g.
>>
>> for host in `qconf -sel`; do
>>   echo "$host"
>>   qconf -sconf "$host" | grep load_sensor
>> done
>>
>> And when you find it, write yourself a note so that you don't have to
>> go looking for it again in the future. :)
>>
>
> True!
> qconf -sconf master.cluster reports a load sensor.
> This script file uses some environment variables which have disappeared...
> Now I have to find out how to set them without restarting the system
> and pass them to my 'sgeadmin' account, which runs sge_qmaster and
> sge_schedd.

When the sgeexecd is started during boot, these environment variables
may not be set. But when you stop/start the execd (just on the
particular node with the defined load sensor script) from the command
line, they might be defined and so available to the script, as the
environment is inherited by default.
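
The inheritance itself can be seen without Grid Engine at all. What follows is only a minimal sketch: MY_SENSOR_VAR and its value are hypothetical placeholders for whatever the load sensor script expects, and the child `sh -c` merely stands in for a script launched by the daemon. It shows that a child process sees a variable only if the parent has exported it.

```shell
#!/bin/sh
# MY_SENSOR_VAR is a hypothetical name standing in for whatever
# variable the load sensor script expects to find.

# Print the value that a freshly started child process sees.
child_view() {
    sh -c 'echo "${MY_SENSOR_VAR:-<unset>}"'
}

unset MY_SENSOR_VAR
before=$(child_view)        # parent has no such variable at all

MY_SENSOR_VAR=/opt/tool     # set but not exported: children do not see it
unexported=$(child_view)

export MY_SENSOR_VAR        # exported: children inherit it
after=$(child_view)

echo "not set:      $before"
echo "not exported: $unexported"
echo "exported:     $after"
```

So if the shell you restart the execd from has the variables exported, the load sensor script should see them as well; a daemon started at boot only gets whatever the boot environment provides.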

- Reuti


> Thanks for your help !
>
> -- 
> Fanou
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=235442
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=235465

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
