[GE users] Qalter -w v bug?

Reuti reuti at staff.uni-marburg.de
Tue Apr 29 22:22:54 BST 2008


On 29.04.2008 at 23:07, Heywood, Todd wrote:

> Thanks, Reuti, that clears up some confusion I had with tmpfree, which is
> also used here.
>
> But with the global home2load, the thing is that it works fine even without
> the initial value. I just get complaints that it is an unknown resource when
> I use qalter -w v. I guess the conclusion is that qalter -w v is not a
> complete substitute for having schedd_job_info set to true?

Yep, at least not for now. A "-w V" has been proposed that would honor the
actual cluster utilization:

http://gridengine.sunsource.net/issues/show_bug.cgi?id=2548
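
To illustrate the point made further down in this thread: as far as the
discussion here goes, "qalter -w v" ignores load_values and only knows a
resource if it has an entry in complex_values of the global host. A minimal
Python sketch of that check follows. The helper names (parse_qconf_host,
resource_list, has_initial_value) and the SAMPLE text are hypothetical; SAMPLE
is adapted from the "qconf -se global" listing quoted below, and the parsing
assumes the usual "attribute value" layout with backslash continuations.

```python
# Sketch only: checks whether a resource has an initial value in
# complex_values, based on text in the style of `qconf -se global` output.
# SAMPLE is illustrative data adapted from the listing quoted in this thread.
SAMPLE = r"""
hostname              global
load_scaling          NONE
complex_values        NONE
load_values           home1load=0.00,home2load=1.33,home3load=20.83, \
                      home4load=0.02,home5load=0.00
processors            0
"""


def parse_qconf_host(text):
    """Return {attribute: value}, joining backslash-continued lines."""
    attrs = {}
    pending = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("\\"):
            # Continuation: keep collecting until a line without a backslash.
            pending += line[:-1].strip()
            continue
        line = pending + line
        pending = ""
        if not line:
            continue
        parts = line.split(None, 1)
        attrs[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return attrs


def resource_list(value):
    """Split 'a=1,b=2' into {'a': '1', 'b': '2'}; 'NONE' or '' gives {}."""
    if value in ("", "NONE"):
        return {}
    result = {}
    for item in value.split(","):
        item = item.strip()
        if item:
            name, _, val = item.partition("=")
            result[name] = val
    return result


def has_initial_value(text, resource):
    """True if the resource appears in complex_values (not just load_values)."""
    attrs = parse_qconf_host(text)
    return resource in resource_list(attrs.get("complex_values", "NONE"))


if __name__ == "__main__":
    attrs = parse_qconf_host(SAMPLE)
    print("reported as load value:",
          "home2load" in resource_list(attrs["load_values"]))
    print("has initial value:", has_initial_value(SAMPLE, "home2load"))
```

With the listing from this thread, home2load shows up only under load_values,
which matches the "unknown resource" complaint; an entry added to
complex_values via "qconf -me global" would make the last check succeed.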

-- Reuti


> Todd
>
>
> On 4/29/08 3:21 PM, "Reuti" <reuti at staff.uni-marburg.de> wrote:
>
>> On 29.04.2008 at 20:11, Heywood, Todd wrote:
>>
>>> On 4/29/08 1:39 PM, "Reuti" <reuti at staff.uni-marburg.de> wrote:
>>>
>>>> On 29.04.2008 at 16:32, Heywood, Todd wrote:
>>>>
>>>>> Here's something strange. If schedd_job_info is set to "true" in
>>>>> sched_conf, my load sensors and resource requests work just fine. For
>>>>> example, for a job asking for a non-available resource,
>>>>> "qstat -j 5581491" shows:
>>>>>
>>>>> (-l home2load=1) cannot run globally because it offers only gl:home2load=2.530000
>>>>
>>>> The relation is <= ?
>>>
>>> The relation in the complex is >=:
>>>
>>> home2load           home2load     DOUBLE      >=    YES         NO         0        0
>>>
>>> The job is not supposed to run until home2load is <= 1 (in this
>>> example). This has been working fine for a while.
>>>
>>>>
>>>>> But if I change schedd_job_info to "false" and use
>>>>> "qalter -w v 5581491", I get complaints that the resource is unknown:
>>>>>
>>>>> Job 5581491 (-l home2load=1) cannot run in queue "public.q@blade49" because job requests unknown resource (home2load)
>>>>>
>>>>> (The message occurs for all hosts, not just this one.)
>>>>
>>>> Mmh, qalter -w v will assume an empty cluster. Is there any initial
>>>> value in the "qconf -se global" for home2load?
>>
>> Load values are ignored with qalter -w v, as the cluster is assumed
>> to be empty anyway (might change in future SGE versions).
>>
>>> No initial value. I thought that was only for consumables.
>>
>> I thought the same for a long time and never hit any problem using
>> http://gridengine.sunsource.net/howto/loadsensor.html while
>> forgetting to make tmpfree consumable. But we use tmpfree only as a
>> load_threshold, and that worked fine. If we were to request
>> "-l tmpfree=1G" (still not consumable), an initial value would be needed
>> in "qconf -me global", as recent posts on the list made me aware. Your
>> issue seems to be similar.
>>
>> Nothing in global:
>>
>> reuti@theochem:~> qalter -w v 66368
>> Job 66368 (-l tmpfree=200G) cannot run in queue instance "short@node41" because job requests unknown resource (tmpfree)
>> ...(all hosts)...
>>
>> Defined in global:
>>
>> reuti@theochem:~> qalter -w v 66368
>> Job 66368 (-l tmpfree=200G) cannot run globally because it offers only gf:tmpfree=40.000G
>>
>> -- Reuti
>>
>>
>>> [root@bhmnode2 n1ge6]# qconf -se global
>>> hostname              global
>>> load_scaling          NONE
>>> complex_values        NONE
>>> load_values           home1load=0.00,home2load=1.33,home3load=20.83, \
>>>                       home4load=0.02,home5load=0.00
>>> processors            0
>>> user_lists            NONE
>>> xuser_lists           NONE
>>> projects              NONE
>>> xprojects             NONE
>>> usage_scaling         NONE
>>> report_variables      cpu,h_vmem,mem_free,np_load_avg,s_vmem,virtual_free, \
>>>                       tmp_free
>>> [root@bhmnode2 n1ge6]
>>>
>>>
>>>>
>>>> -- Reuti
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>>
>>>
>>>





