[GE users] Sort by sequence number question

Lönroth Erik erik.lonroth at scania.com
Tue Jul 10 12:42:09 BST 2007


Yes, this is what I want... And I have done exactly this, but the scheduler seems to ignore the sequence number regardless of how I set it. ts101-1-0 will always get the SLAVES.

This is how it looks:

bash-3.00$ cat spool/qmaster/cqueues/short.101.q 
qname              short.101.q
hostlist           @ts101_X_hg
seq_no             0,[ts101-1-0.sss.se.scania.com=1]
load_thresholds    np_load_avg=1.75
suspend_thresholds NONE
nsuspend           1
suspend_interval   00:05:00
priority           0
min_cpu_interval   00:05:00
processors         UNDEFINED
qtype              BATCH INTERACTIVE
ckpt_list          NONE
pe_list            make powerflow_ts101_pe
rerun              FALSE
slots              4
tmpdir             /tmp
shell              /bin/csh
prolog             NONE
epilog             NONE
shell_start_mode   posix_compliant
starter_method     NONE
suspend_method     NONE
resume_method      NONE
terminate_method   NONE
notify             00:00:60
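
For reference, a minimal sketch of how a seq_no value like the one above resolves to a per-host number. It assumes the format is a queue-wide default followed by bracketed host=value overrides, and it ignores host groups. The function name effective_seq_no and the host other.example.com are made up for illustration.

```shell
#!/bin/sh
# effective_seq_no is a hypothetical helper, not part of Grid Engine.
# Assumed format of the value: default,[host=value],[host=value],...
effective_seq_no() {
    spec=$1
    host=$2
    printf '%s\n' "$spec" | awk -v host="$host" '
    {
        n = split($0, parts, ",")
        result = parts[1]                  # first field is the queue-wide default
        for (i = 2; i <= n; i++) {
            entry = parts[i]
            gsub(/^\[|\]$/, "", entry)     # strip the surrounding brackets
            eq = index(entry, "=")
            if (eq > 0 && substr(entry, 1, eq - 1) == host)
                result = substr(entry, eq + 1)
        }
        print result
    }'
}

spec='0,[ts101-1-0.sss.se.scania.com=1]'
effective_seq_no "$spec" ts101-1-0.sss.se.scania.com   # prints 1
effective_seq_no "$spec" other.example.com             # prints 0 (hypothetical host)
```

With the configuration shown, ts101-1-0 resolves to 1 and every other host to the default 0, so under a seqno sort ts101-1-0 should be considered after the other hosts in short.101.q.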

... And despite this I get:

    202 0.55500 nano       sssler       r     07/10/2007 13:35:49 master.101.q at ts101-1-0.sss.se. MASTER        
    202 0.55500 nano       sssler       r     07/10/2007 13:35:49 short.101.q at ts101-1-0.sss.se.s SLAVE         
                                                                  short.101.q at ts101-1-0.sss.se.s SLAVE         
                                                                  short.101.q at ts101-1-0.sss.se.s SLAVE         
                                                                  short.101.q at ts101-1-0.sss.se.s SLAVE


The only way for me to get jobs onto ts101-1-1 is to set "slots=0" for ts101-1-0 in short.101.q, which is not what I want, since I want to be able to run jobs on that node whenever a master slot is not in use.

Something is very wrong.

/Erik


-----Original Message-----
From: Ravichandra.Nallan at Sun.COM [mailto:Ravichandra.Nallan at Sun.COM] 
Sent: den 10 juli 2007 13:03
To: users at gridengine.sunsource.net
Subject: Re: [GE users] Sort by sequence number question



 From the info provided it looks like you have 2 queues, master.101.q 
and short.101.q.
short.101.q has 2 hosts (101-1-0, 101-1-1), and you want jobs to 
start on one host before the other, right?

Did you set the seq_no for short.101.q? Can you run qconf -sq short.101.q | 
grep seq ?
If I am right, setting the seq_no lets you choose one queue over another. 
But in your case you need to set the seq_no per host, as you want 
short.101.q at 101-1-1 to be allotted before short.101.q at 101-1-0, i.e.
seq_no   1,[ts101-1-1.sss.se.s=2],[ts101-1-0.sss.se.s=3]
should do the trick.
Let me know if it helps,
regards,
Ravi
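
Two settings have to agree for this to take effect: the scheduler's queue_sort_method and the per-host seq_no in the queue. A minimal sketch for checking both follows. It assumes qconf is on the PATH of an admin or submit host, and the function name show_seq_settings is made up for illustration.

```shell
#!/bin/sh
# show_seq_settings is a hypothetical helper name, not part of Grid Engine.
show_seq_settings() {
    queue=$1
    if ! command -v qconf >/dev/null 2>&1; then
        echo "qconf not found: run this on a Grid Engine admin or submit host"
        return 0
    fi
    # Scheduler configuration: seq_no is the primary sort key only when
    # queue_sort_method reads "seqno".
    qconf -ssconf | grep queue_sort_method
    # Cluster queue configuration: default value plus per-host overrides.
    qconf -sq "$queue" | grep seq_no
}

show_seq_settings short.101.q
```

Note that queue_sort_method is part of the scheduler configuration (shown by qconf -ssconf), so that is the place to confirm the "sort by sequence number" setting.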

Lönroth Erik wrote:
> Regardless how I try - this is always the outcome.
>
>     187 0.55500 nano       sssler       r     07/10/2007 12:40:10 master.101.q at ts101-1-0.sss.se. MASTER        
>     187 0.55500 nano       sssler       r     07/10/2007 12:40:10 short.101.q at ts101-1-0.sss.se.s SLAVE         
>                                                                   short.101.q at ts101-1-0.sss.se.s SLAVE         
>                                                                   short.101.q at ts101-1-0.sss.se.s SLAVE         
>                                                                   short.101.q at ts101-1-0.sss.se.s SLAVE
>
> The MASTER and SLAVES turn up on the same node.
>
>
>
> /Erik
>
>
> -----Original Message-----
> From: Lönroth Erik [mailto:erik.lonroth at scania.com]
> Sent: den 10 juli 2007 12:32
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] Sort by sequence number question
>
>
> I'm sure it worked before, but somehow - the scheduler now keeps 
> assigning jobs and to my desperation I'm starting to think I'm crazy. 
> It seems to ignore my "sequence numbers" entirely at the moment.
>
> I'll try fiddling with the "round robin" thing, but fill_up is what I 
> really want.
>
> /Erik
>
> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de]
> Sent: den 10 juli 2007 12:18
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Sort by sequence number question
>
>
> Hi,
>
> I remember this discussion:
>
> http://gridengine.sunsource.net/servlets/ReadMsg?list=users&msgNo=20022
>
> It didn't solve your setup problem?
>
> -- Reuti
>
> PS: You could try to use $round_robin instead of $fill_up.
>
>
> Am 10.07.2007 um 11:24 schrieb Lönroth Erik:
>
>   
>> Hello!
>>
>> I have set up "sort by sequence number" for my cell, in "cluster
>> configuration" - this is because I want a few specific nodes in my
>> cluster to be considered "as a last resort" when assigning jobs.
>>
>> Let's say I have 10 nodes, where the first 2 nodes are to be considered
>> last. I have assigned them a sequence number "99" (just a high value)
>> specifically in the cluster queue "short.q". Regardless of how I set
>> this sequence number - the nodes I want to be considered last still
>> get included.
>>
>> The nodes I want to be "considered last" are "MASTER" nodes, that's why
>> I don't want any additional jobs running on them - unless it is
>> absolutely necessary.
>>
>> This is the queue situation where ts101-1-0 has a higher sequence
>> number than ts101-1-1 (considered last?):
>>
>> master.101.q at ts101-1-0.sss.se. BIPC  0/1       0.00     lx26-amd64
>> ----------------------------------------------------------------------------
>> short.101.q at ts101-1-1.sss.se.s BIPC  0/4       0.00     lx26-amd64
>> ----------------------------------------------------------------------------
>> short.101.q at ts101-1-0.sss.se.s BIPC  0/4       0.00     lx26-amd64
>>
>>
>> My PE is configured as:
>> pe_name           generic_pe
>> slots             9999
>> user_lists        NONE
>> xuser_lists       NONE
>> start_proc_args   /opt/gridengine/apps/start_generic_pe.sh $pe_hostfile
>> stop_proc_args    /opt/gridengine/apps/stop_generic_pe.sh
>> allocation_rule   $fill_up
>> control_slaves    FALSE
>> job_is_first_task TRUE
>> urgency_slots     min
>>
>> ----- At submit time -----
>> When I submit a job (asking for 1 MASTER + 4 SLAVES) and no other
>> specific requirements:
>>
>>     qsub -masterq master.*.q -pe generic_pe 5 basic-4-slots.sh
>>
>> Now - I would expect ts101-1-0  - NOT to have any SLAVES allocated to
>> it. BUT!
>>
>> ---- The allocation map -----
>> qstat -t
>>
>>     176 0.55500 nano       sssler       r     07/10/2007 11:16:38 master.101.q at ts101-1-0.sss.se. MASTER
>>     176 0.55500 nano       sssler       r     07/10/2007 11:16:38 short.101.q at ts101-1-0.sss.se.s SLAVE
>>                                                                   short.101.q at ts101-1-0.sss.se.s SLAVE
>>                                                                   short.101.q at ts101-1-0.sss.se.s SLAVE
>>                                                                   short.101.q at ts101-1-0.sss.se.s SLAVE
>>
>>
>> What am I doing wrong here? I want the situation to look like this
>> (but it doesn't)
>>
>>
>>     176 0.55500 nano       sssler       r     07/10/2007 11:16:38 master.101.q at ts101-1-0.sss.se. MASTER
>>     176 0.55500 nano       sssler       r     07/10/2007 11:16:38 short.101.q at ts101-1-1.sss.se.s SLAVE
>>                                                                   short.101.q at ts101-1-1.sss.se.s SLAVE
>>                                                                   short.101.q at ts101-1-1.sss.se.s SLAVE
>>                                                                   short.101.q at ts101-1-1.sss.se.s SLAVE
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>>     
>
>
>   
