[GE users] Wildcarded PE Name Circumvents Queue Sorting

templedf dan.templeton at sun.com
Wed Dec 2 14:38:22 GMT 2009


u5 will be out next month.  There's no roadmap online, but the general 
plan is a release roughly every 3 months, with every other release 
containing new features.  That means you should probably expect u6 
sometime around March, and it will just be a bugfix release.

Daniel

rems0 wrote:
> Hi Daniel,
>
> It would be *** REALLY GREAT *** to have this issue fixed soon!
>
> Is u5 coming out soon?
> I only managed to upgrade to u4 some weeks ago! Is there any kind of
> roadmap for GE releases available online?
>
> Thanks, Richard
>
> On 12/01/2009 08:50 PM, templedf wrote:
>   
>> Richard,
>>
>> The issue did not make it into the u5 release.  We'll see what we can do 
>> for u6.  Sorry.
>>
>> Daniel
>>
>> rems0 wrote:
>>     
>>> Hi list,
>>>
>>> Is this bug (#3021) going to be fixed in 6.2u5?
>>> How are bug-fixing priorities assigned?
>>> Does voting for an issue really help?   ;-)
>>>
>>> Please vote for issue 3021 !!!
>>> http://gridengine.sunsource.net/issues/showvotes.cgi?voteon=3021
>>>
>>> Can someone talk to, write to, or ask an SGE developer?
>>> Or is there a developer here who can answer?
>>>
>>>
>>> Many thanks, Richard
>>>
>>>
>>> On 10/21/2009 03:37 AM, cjf001 wrote:
>>>
>>>> Hmm - I've seen something similar, I think - I submitted a bug report
>>>> back in May on this:
>>>>
>>>> http://gridengine.sunsource.net/issues/show_bug.cgi?id=3021
>>>>
>>>> I was also using wildcards, but not in the pe name - I was using
>>>> them in the queue name - so I'm not sure whether it's the same issue.
>>>> This was never resolved, so I had to work around it by using hostgroups
>>>> in place of the wildcards.
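>>>>
>>>> Very roughly, the workaround looks like the sketch below (the hostgroup,
>>>> queue, and script names here are just placeholders, not my real config):
>>>>
>>>>   # illustrative only - define a hostgroup and list the relevant hosts in it
>>>>   qconf -ahgrp @prodhosts
>>>>
>>>>   # then request the queue on that hostgroup (a "queue domain") instead
>>>>   # of wildcarding the queue name
>>>>   qsub -q prod.q@@prodhosts job.sh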
>>>>
>>>> Probably doesn't help you much, but I'd certainly vote for a
>>>> rewrite of the "sge_select_parallel_environment" code !
>>>>
>>>>       John
>>>>
>>>> templedf wrote:
>>>>
>>>>> I have another odd issue.  I have a test config with three queues, 
>>>>> test1, test2, and test3.  I also have three PEs, test_1, test_2, and 
>>>>> test_3.  Each queue has the corresponding PE in its pe_list, e.g. queue 
>>>>> test1 has PE test_1, etc.  I have the queue_sort_method set to "seqno", 
>>>>> and each queue has a seq_no equal to the number in its name, e.g. queue test1 has a 
>>>>> seq_no of 1.  There is a single host in the cluster, and each queue has 
>>>>> 4 slots on that host.  The load_thresholds are set to NONE for all three 
>>>>> queues, and there are no other queues in the system.
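>>>>>
>>>>> For reference, the relevant parts of the config look roughly like this
>>>>> (everything not shown here is left at its default):
>>>>>
>>>>>   qconf -ssconf | grep queue_sort_method
>>>>>   queue_sort_method                 seqno
>>>>>
>>>>>   qconf -sq test1 | egrep 'seq_no|pe_list|slots|load_thresholds'
>>>>>   seq_no                1
>>>>>   pe_list               test_1
>>>>>   slots                 4
>>>>>   load_thresholds       NONE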
>>>>>
>>>>> If I submit:
>>>>>
>>>>> qsub -t 1-2 sleeper.sh
>>>>> qsub -t 1-2 sleeper.sh
>>>>> qsub -t 1-2 sleeper.sh
>>>>>
>>>>> the behavior is as expected.  The first two jobs go to test1, and the 
>>>>> third goes to test2.  If, however, I submit:
>>>>>
>>>>> qsub -pe test_\* 2 sleeper.sh
>>>>> qsub -pe test_\* 2 sleeper.sh
>>>>> qsub -pe test_\* 2 sleeper.sh
>>>>>
>>>>> the behavior is unpredictable.  The jobs may land in any of the queues,
>>>>> in any order.  Looking at the schedd_runlog file, it appears that the
>>>>> wildcarded PE is what drives the sort order of the queues, and the
>>>>> wildcard doesn't always expand to the list of PEs in the same order.
>>>>> For example:
>>>>>
>>>>> Tue Oct 20 19:42:53 2009|-------------START-SCHEDULER-RUN-------------
>>>>> Tue Oct 20 19:42:53 2009|queue instance "all.q@daniel-templetons-macbook-pro" dropped because it is disabled
>>>>> Tue Oct 20 19:42:53 2009|queues dropped because they are disabled: all.q@daniel-templetons-macbook-pro
>>>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test1" because PE "test_2" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test3" because PE "test_2" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test2" because PE "test_1" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test3" because PE "test_1" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test1" because PE "test_3" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 54 cannot run in queue "test2" because PE "test_3" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test1" because PE "test_2" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test3" because PE "test_2" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test2" because PE "test_1" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test3" because PE "test_1" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test1" because PE "test_3" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 55 cannot run in queue "test2" because PE "test_3" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|queue instance "test2@daniel-templetons-macbook-pro" dropped because it is full
>>>>> Tue Oct 20 19:42:53 2009|queues dropped because they are full: test2@daniel-templetons-macbook-pro
>>>>> Tue Oct 20 19:42:53 2009|Job 56 cannot run in queue "test1" because PE "test_2" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 56 cannot run in queue "test3" because PE "test_2" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 56 cannot run in PE "test_2" because it only offers 0 slots
>>>>> Tue Oct 20 19:42:53 2009|Job 56 cannot run in queue "test3" because PE "test_1" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|Job 56 cannot run in queue "test1" because PE "test_3" is not in pe list
>>>>> Tue Oct 20 19:42:53 2009|--------------STOP-SCHEDULER-RUN-------------
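>>>>>
>>>>> (In case anyone wants to reproduce this: the dump above comes from
>>>>> triggering a scheduler monitoring run with "qconf -tsm"; the output
>>>>> normally lands in $SGE_ROOT/$SGE_CELL/common/schedd_runlog.)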
>>>>>
>>>>> In this case, the first two jobs went to test2, and the third went to test1.
>>>>>
>>>>> Anyone else seen this before?
>>>>>
>>>>> Aside from reporting the issue, I also need to find a way to get this 
>>>>> working.  What I'm trying to do is have three queues that all offer the 
>>>>> same PE that uses $fill_up behavior.  The queues should be loaded in 
>>>>> seq_no order, and no job should be allowed to span multiple queues.  
>>>>> Jobs must either fit entirely into a single queue, or they can't be 
>>>>> scheduled.  It's fine if a job spills across multiple hosts within a 
>>>>> queue, though.
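>>>>>
>>>>> The PEs themselves would look roughly like this - only allocation_rule
>>>>> matters for the fill-up behavior; the other values shown are illustrative
>>>>> and the remaining fields are omitted:
>>>>>
>>>>>   qconf -sp test_1
>>>>>   pe_name            test_1
>>>>>   slots              4
>>>>>   allocation_rule    $fill_up
>>>>>   control_slaves     FALSE
>>>>>   job_is_first_task  TRUE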
>>>>>
>>>>> Any clever ideas on how to get the desired behavior?
>>>>>
>>>>> Thanks,
>>>>> Daniel
>>>>>
>>>
>>
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=230973

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].



More information about the gridengine-users mailing list