[GE users] Wildcarded PE Name Circumvents Queue Sorting

templedf dan.templeton at sun.com
Tue Dec 1 19:50:02 GMT 2009


Richard,

The fix for this issue did not make it into the u5 release.  We'll see 
what we can do for u6.  Sorry.

Daniel

rems0 wrote:
> Hi list,
>
> is this bug (#3021) going to be fixed in 6.2u5?
> How are priorities assigned for fixing issues/bugs?
> Does voting for an issue really help?   ;-)
>
> Please vote for issue 3021 !!!
> http://gridengine.sunsource.net/issues/showvotes.cgi?voteon=3021
>
> Could someone talk to, write to, or ask an SGE developer?
> Or could any developer reading this answer?
>
>
> Many thanks, Richard
>
>
> On 10/21/2009 03:37 AM, cjf001 wrote:
>   
>> Hmm - I've seen something similar, I think - I submitted a bug report
>> back in May on this:
>>
>> http://gridengine.sunsource.net/issues/show_bug.cgi?id=3021
>>
>> I was also using wildcards, but not in the PE name - I was using
>> them in the queue name - so I'm not sure whether it's the same issue
>> or not. It was never resolved, so I had to work around it by using
>> hostgroups in place of the wildcards, as sketched below.
>>
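>> For illustration, a minimal sketch of that kind of workaround (the
>> hostgroup and queue names here are made up, not from my real config):
>>
>>   # define a hostgroup and add the hosts to it (opens an editor)
>>   qconf -ahgrp @testhosts
>>
>>   # request the queue on that hostgroup instead of wildcarding
>>   # the queue name
>>   qsub -q test.q@@testhosts sleeper.sh
>>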
>> Probably doesn't help you much, but I'd certainly vote for a
>> rewrite of the "sge_select_parallel_environment" code!
>>
>>       John
>>
>> templedf wrote:
>>     
>>> I have another odd issue.  I have a test config with three queues, 
>>> test1, test2, and test3.  I also have three PEs, test_1, test_2, and 
>>> test_3.  Each queue has the corresponding PE in its pe_list, e.g. queue 
>>> test1 has PE test_1, etc.  I have the queue_sort_method set to "seqno", 
>>> and each queue has a seq_no equal to its name, e.g. queue test1 has a 
>>> seq_no of 1.  There is a single host in the cluster, and each queue has 
>>> 4 slots on that host.  The load_thresholds are set to NONE for all three 
>>> queues, and there are no other queues in the system.
>>>
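>>> For reference, the relevant pieces of that test config look roughly 
>>> like this (abbreviated; only the attributes mentioned above):
>>>
>>> # scheduler config (qconf -msconf):
>>> queue_sort_method      seqno
>>>
>>> # queue test1 (qconf -mq test1); test2 and test3 are analogous,
>>> # with seq_no 2/3 and pe_list test_2/test_3:
>>> seq_no                 1
>>> slots                  4
>>> pe_list                test_1
>>> load_thresholds        NONE
>>>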
>>> If I submit:
>>>
>>> qsub -t 1-2 sleeper.sh
>>> qsub -t 1-2 sleeper.sh
>>> qsub -t 1-2 sleeper.sh
>>>
>>> the behavior is as expected.  The first two jobs go to test1, and the 
>>> third goes to test2.  If, however, I submit:
>>>
>>> qsub -pe test_\* 2 sleeper.sh
>>> qsub -pe test_\* 2 sleeper.sh
>>> qsub -pe test_\* 2 sleeper.sh
>>>
>>> the behavior is nondeterministic.  The jobs may land in any queue, in 
>>> any order.  Looking at the schedd_runlog file, it appears that the 
>>> expansion of the wildcarded PE is driving the sort order of the 
>>> queues, and the wildcard expansion doesn't always produce the list of 
>>> PEs in the same order.  For example:
>>>
>>> Tue Oct 20 19:42:53 2009|-------------START-SCHEDULER-RUN-------------
>>> Tue Oct 20 19:42:53 2009|queue instance 
>>> "all.q@daniel-templetons-macbook-pro" dropped because it is disabled
>>> Tue Oct 20 19:42:53 2009|queues dropped because they are disabled: 
>>> all.q@daniel-templetons-macbook-pro
>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test1" because PE 
>>> "test_2" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test3" because PE 
>>> "test_2" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test2" because PE 
>>> "test_1" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test3" because PE 
>>> "test_1" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test1" because PE 
>>> "test_3" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test2" because PE 
>>> "test_3" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test1" because PE 
>>> "test_2" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test3" because PE 
>>> "test_2" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test2" because PE 
>>> "test_1" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test3" because PE 
>>> "test_1" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test1" because PE 
>>> "test_3" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test2" because PE 
>>> "test_3" is not in pe list
>>> Tue Oct 20 19:42:53 2009|queue instance 
>>> "test2@daniel-templetons-macbook-pro" dropped because it is full
>>> Tue Oct 20 19:42:53 2009|queues dropped because they are full: 
>>> test2@daniel-templetons-macbook-pro
>>> Tue Oct 20 19:42:53 2009|Job 56 cannot run in queue "test1" because PE 
>>> "test_2" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 56 cannot run in queue "test3" because PE 
>>> "test_2" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 56 cannot run in PE "test_2" because it 
>>> only offers 0 slots
>>> Tue Oct 20 19:42:53 2009|Job 56 cannot run in queue "test3" because PE 
>>> "test_1" is not in pe list
>>> Tue Oct 20 19:42:53 2009|Job 56 cannot run in queue "test1" because PE 
>>> "test_3" is not in pe list
>>> Tue Oct 20 19:42:53 2009|--------------STOP-SCHEDULER-RUN-------------
>>>
>>> In this case, the first two jobs went to test2, and the third went to test1.
>>>
>>> Anyone else seen this before?
>>>
>>> Aside from reporting the issue, I also need to find a way to get this 
>>> working.  What I'm trying to do is have three queues that all offer the 
>>> same PE that uses $fill_up behavior.  The queues should be loaded in 
>>> seq_no order, and no job should be allowed to span multiple queues.  
>>> Jobs must either fit entirely into a single queue, or they can't be 
>>> scheduled.  It's fine if a job spills across multiple hosts within a 
>>> queue, though.
>>>
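>>> For concreteness, the shared PE would be defined along these lines (a 
>>> sketch only; "test_pe" is a made-up name, and this doesn't by itself 
>>> stop a job from spanning queues):
>>>
>>> # shared PE (qconf -mp test_pe), attached to all three queues
>>> pe_name            test_pe
>>> slots              999
>>> allocation_rule    $fill_up
>>>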
>>> Any clever ideas on how to get the desired behavior?
>>>
>>> Thanks,
>>> Daniel
>>>
