[GE users] policy conflict ?

cjf001 john.foley at motorola.com
Sat Sep 5 02:27:57 BST 2009


Ok, thanks - I'm trying that now and it seems to be working. I'll know
more after the weekend, once the users fill up the queues. The messages
have stopped appearing in the qmaster's messages file, though !

     John


reuti wrote:

> On 05.09.2009 at 01:05, cjf001 wrote:
> 
> 
>>Well, maybe....
>>
>>I already have different sequence numbers on the queues.
>>
>>How would your "b" suggestion "change the equation" ?
> 
> 
> Once a PE is selected, you will get slots only from the queue attached
> to it. Hence all slots will be primary or all will be secondary, but
> never a mix. So it can no longer happen that a job gets two slots on
> one and the same host but from different queues.
> 
> I usually avoid attaching the same PE to different queues for another
> reason: the queue name is part of the $TMPDIR name, so $TMPDIR will
> have a different name in each queue. Many parallel applications pass
> the $TMPDIR name that the master process of the parallel job got on to
> the slave processes as well, and when the slaves can't find that
> directory on their node, they crash.
> 
> -- Reuti
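
The $TMPDIR point above can be illustrated with a minimal sketch. It assumes the usual layout <tmpdir>/<job_id>.<task_id>.<queue_name>; the helper name sge_tmpdir, the base directory /tmp and the task id 1 are illustrative assumptions, not values taken from this thread.

```python
def sge_tmpdir(base: str, job_id: int, task_id: int, queue: str) -> str:
    """Build a scratch path in the assumed <base>/<job_id>.<task_id>.<queue> layout."""
    return f"{base}/{job_id}.{task_id}.{queue}"


# Same job, same PE, but the master lands in "primary" and a slave in "secondary".
master_tmpdir = sge_tmpdir("/tmp", 12873, 1, "primary")
slave_tmpdir = sge_tmpdir("/tmp", 12873, 1, "secondary")

print(master_tmpdir)  # /tmp/12873.1.primary
print(slave_tmpdir)   # /tmp/12873.1.secondary

# An application that forwards the master's path to the slave would look
# for a directory that does not exist under that name on the slave's node.
print(master_tmpdir == slave_tmpdir)  # False
```

With one PE per queue, every slot of a job comes from the same queue, so the path is the same on all nodes.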
> 
> 
> 
>>Why is the scheduler apparently "giving up" and skipping the
>>last few jobs ?  Is it running out of time, or does it figure
>>that it will just keep running into errors ?
>>
>>
>>    Thanks !
>>
>>      John
>>
>>
>>reuti wrote:
>>
>>
>>>On 05.09.2009 at 00:43, cjf001 wrote:
>>>
>>>
>>>
>>>>Yes, at least some of them are. Is that a problem ? What
>>>>policy is this in conflict with ?
>>>
>>>
>>>SGE is just collecting slots from available queues - then it
>>>discovers that the job would block itself. What you can try is:
>>>
>>>a) use different sequence numbers for the primary and secondary queue
>>>
>>>b) duplicate the PE and name the copies mpich1 and mpich2 or similar,
>>>attach each to one and only one queue, and request mpich* as the PE
>>>
>>>HTH - Reuti
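
Suggestion b) can be pictured with a toy model. This is only a sketch of the idea, not how the SGE scheduler is implemented: the PE-to-queue mapping, the free-slot table, the host lxdel21 and the function name allocate are all hypothetical. It shows that once a wildcard request such as mpich* resolves to one PE, slots come only from the queue that PE is attached to.

```python
from fnmatch import fnmatch

# Hypothetical setup: each PE is attached to exactly one queue.
PE_QUEUES = {"mpich1": ["primary"], "mpich2": ["secondary"]}

# Hypothetical free slots per (queue, host).
FREE_SLOTS = {
    ("primary", "lxdel20"): 1,
    ("secondary", "lxdel20"): 1,
    ("primary", "lxdel21"): 2,
    ("secondary", "lxdel21"): 2,
}


def allocate(pe_request, slots_needed):
    """Try each PE matching the wildcard; collect slots only from that PE's queues."""
    for pe in sorted(PE_QUEUES):
        if not fnmatch(pe, pe_request):
            continue
        granted = []
        remaining = slots_needed
        for (queue, host), free in sorted(FREE_SLOTS.items()):
            if queue in PE_QUEUES[pe] and remaining > 0:
                take = min(free, remaining)
                granted.append((queue, host, take))
                remaining -= take
        if remaining == 0:
            return pe, granted  # every slot comes from one PE, hence one queue
    return None, []  # no single PE can satisfy the request; no mixing is attempted


pe, granted = allocate("mpich*", 3)
print(pe, granted)
```

In this model a 3-slot job is served entirely from the primary queue, and a 4-slot job stays pending rather than mixing primary and secondary slots on the same host. On the command line the request would presumably look something like qsub -pe "mpich*" 3 followed by the job script.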
>>>
>>>
>>>
>>>
>>>>I found some info about this at :
>>>>http://gridengine.sunsource.net/issues/show_bug.cgi?id=437
>>>>
>>>>Unfortunately, this says :
>>>>  "Extensive discussion of the topic can be found under
>>>>  http://gridengine.sunsource.net/servlets/BrowseList?list=users&by=thread&from=944"
>>>>
>>>>which sounds very promising, but that url is not found anymore.
>>>>
>>>>
>>>>  Thanks,
>>>>
>>>>     John
>>>>
>>>>
>>>>reuti wrote:
>>>>
>>>>
>>>>>On 05.09.2009 at 00:12, cjf001 wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>Guys, I've got a problem on SGE v6.2u2 that just showed up yesterday,
>>>>>>as far as I can tell - I'm getting the following in the qmaster's
>>>>>>messages file:
>>>>>>
>>>>>>09/04/2009 17:05:24|schedu|lxadml2|W|Jobs 12873 & 12873 dispatched to master/subordinated queues "primary at lxdel20.srl.css.mot.com"/"secondary at lxdel20.srl.css.mot.com". Suspend on subordinate to occur in same scheduling interval. Policy conflict!
>>>>>>
>>>>>>... this repeats with a few more jobs, and then ...
>>>>>>
>>>>>>09/04/2009 17:05:24|worker|lxadml2|W|Skipping remaining 7 orders
>>>>>
>>>>>
>>>>>Is it a parallel job which might get slots from both queues - the
>>>>>superordinated and subordinated - at the same time?
>>>>>
>>>>>-- Reuti
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>note that the two jobs mentioned are the same job number, but this is
>>>>>>not always the case.
>>>>>>
>>>>>>
>>>>>>Any idea what's going on here ?  The problem this is causing is that
>>>>>>not all the jobs are getting assigned to a queue, even though there
>>>>>>are open resources. Also, the qstat listing shows many jobs with "0"
>>>>>>priority, apparently because they are being "skipped" and have never
>>>>>>been looked at yet by the scheduler.
>>>>>>
>>>>>>Any help from the programmers greatly appreciated ! I'll do some searching
>>>>>>on "policy conflict" in the meantime.....
>>>>>>
>>>>>>    John
>>>>>>
>>>>>>
>>>>>>
>>>>>>-- 
>>>>>>###########################################################################
>>>>>># John Foley                          # Location:  IL93-E1-21S            #
>>>>>># IT & Systems Administration         # Maildrop:  IL93-E1-35O            #
>>>>>># Antenna & Mechanical Simulation Grp #    Email: john.foley at motorola.com #
>>>>>># Motorola, Inc. -  Mobile Devices    #    Phone: (847) 523-8719          #
>>>>>># 600 North US Highway 45             #      Fax: (847) 523-5767          #
>>>>>># Libertyville, IL. 60048  (USA)      #     Cell: (847) 460-8719          #
>>>>>>###########################################################################
>>>>>>               (this email sent using Mozilla on Windows)
>>>>>>
>>>>>>------------------------------------------------------
>>>>>>http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=215831
>>>>>>
>>>>>>To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
> 
> 



-- 
###########################################################################
# John Foley                          # Location:  IL93-E1-21S            #
# IT & Systems Administration         # Maildrop:  IL93-E1-35O            #
# Antenna & Mechanical Simulation Grp #    Email: john.foley at motorola.com #
# Motorola, Inc. -  Mobile Devices    #    Phone: (847) 523-8719          #
# 600 North US Highway 45             #      Fax: (847) 523-5767          #
# Libertyville, IL. 60048  (USA)      #     Cell: (847) 460-8719          #
###########################################################################
                 (this email sent using Mozilla on Windows)

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=215864

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


