[GE users] Problem filling up cores on a node in a PE using fill_up allocation

jlopez jlopez at cesga.es
Wed Feb 18 13:33:28 GMT 2009


Hi,

The temporary workaround we have found is to define the consumable 
complexes inside the queue.

complex_values        num_proc=16,s_vmem=112G,h_vmem=113G,h_fsize=900G

In your case I would suggest adding slots=8, or, if that does not work, 
you could define a new consumable complex.
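For reference, defining a new consumable complex could look roughly like 
this (a sketch; the complex name "cores" and the temp file are 
illustrative choices, not from this thread):

```shell
# Sketch: add a new consumable complex (names here are illustrative).
# Dump the current complex list, append a consumable INT entry, reload it.
qconf -sc > /tmp/complexes.txt
# Columns: name  shortcut  type  relop  requestable  consumable  default  urgency
echo "cores  cores  INT  <=  YES  YES  0  0" >> /tmp/complexes.txt
qconf -Mc /tmp/complexes.txt
```

The consumable then has to be requested by jobs (or given a default) and 
attached to hosts or queues via complex_values before it limits anything.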

After that, fill_up works as expected, but only if a single queue is 
assigned to a node at a time. If several queues use the same node, you 
still get oversubscription between the queues, because each queue works 
independently, and the underlying problem (that the consumable complex 
of the node is not taken into account) persists.

This is what we are using at the moment to overcome this issue.
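Concretely, attaching consumable values to a queue can be done with 
qconf -mattr (a sketch using the queue name from the config dump quoted 
below; run on the qmaster host):

```shell
# Sketch: make slots a consumable limit on the queue, as suggested above
# (queue name taken from the thread's config dump).
qconf -mattr queue complex_values "slots=8" ompilargetest.q
# Our own queue carries the fuller list shown earlier:
#   complex_values num_proc=16,s_vmem=112G,h_vmem=113G,h_fsize=900G
```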

Cheers,
Javier


reuti wrote:
> Am 17.02.2009 um 16:48 schrieb leonardz:
>
>   
>> SGE 6.2u1
>>
>> qmaster is installed on a Solaris 10 5/08 (s10x_u5wos_10) x86 Opteron
>> system; execd is installed on SUSE Linux Enterprise Server 10 SP2
>> (x86_64) dual-core Opteron nodes (4 cores per node)
>>
>>
>> I am trying to have 2 parallel environments as we have only a GigE  
>> network. So the goal for PE ompitest is:
>>
>> only allow parallel jobs which can fit on a single node: using
>> pe_slots for allocation works - all tasks are scheduled on one node,
>> and if more tasks are requested than cores on a node, that job does
>> not get scheduled in this PE.
>>
>> The goal for PE ompilargetest is to allocate all cores on a node
>> before allocating cores on the next node, using fill_up for
>> allocation: this will allow multi-node parallel jobs.
>>
>> This does not work. It schedules all tasks to only one node, and
>> oversubscribes the node as long as $fill_up is used.
>>
>> If I want more tasks than cores on a node without oversubscription,  
>> I need to use round_robin, which guarantees more
>> stress on the network. I really want all cores on a node filled  
>> before tasks are scheduled on a different node.
>>
>> Is it possible, with SGE 6.2u1, to pack nodes with tasks before
>> allocating cores to the next node, without oversubscription?
>>     
>
> There is an issue with $fill_up in 6.2u1:
>
> http://gridengine.sunsource.net/issues/show_bug.cgi?id=2901
>
> If it's not fixed in u2, I would even suggest raising the priority of
> this issue.
>
> -- Reuti
>
>
>   
>> Details
>>
>> For most users we want to insist that, of all the slots (16 in the
>> test case), each job can only be scheduled on a single node with 4
>> cores, and not run over the network:
>> qconf -sp ompitest
>> pe_name            ompitest
>> slots              16
>> user_lists         NONE
>> xuser_lists        NONE
>> start_proc_args    /bin/true
>> stop_proc_args     /bin/true
>> allocation_rule    $pe_slots
>> control_slaves     TRUE
>> job_is_first_task  FALSE
>> urgency_slots      min
>> accounting_summary TRUE
>>
>> The only queue this can run on is ompitest.q and it limits the  
>> number of slots to 4 per job.
>>
>> qconf -sq ompitest.q
>> qname                 ompitest.q
>> hostlist              BLAH
>> seq_no                1,[cn-r3-4=1],[cn-r3-5=2],[cn-r3-6=3],[cn-r3-7=4]
>> load_thresholds       np_load_avg=4.5
>> suspend_thresholds    NONE
>> nsuspend              1
>> suspend_interval      00:05:00
>> priority              0
>> min_cpu_interval      00:05:00
>> processors            UNDEFINED
>> qtype                 BATCH
>> ckpt_list             NONE
>> pe_list               ompitest
>> rerun                 FALSE
>> slots                 4
>>
>>
>> This appears to work.
>>
>> For users who need more cores, and do not communicate heavily  
>> between processes, I want all cores on a node to be used
>> before allocating to another node:
>>
>> qconf -sp ompilargetest
>> pe_name            ompilargetest
>> slots              16
>> user_lists         NONE
>> xuser_lists        NONE
>> start_proc_args    /bin/true
>> stop_proc_args     /bin/true
>> allocation_rule    $fill_up
>> control_slaves     TRUE
>> job_is_first_task  FALSE
>> urgency_slots      min
>> accounting_summary TRUE
>>
>>
>> and the only queue to use this PE:
>> qconf -sq ompilargetest.q
>> qname                 ompilargetest.q
>> hostlist              BLAH
>> seq_no                STUFF
>> load_thresholds       np_load_avg=4.5
>> suspend_thresholds    NONE
>> nsuspend              1
>> suspend_interval      00:05:00
>> priority              0
>> min_cpu_interval      00:05:00
>> processors            UNDEFINED
>> qtype                 BATCH
>> ckpt_list             NONE
>> pe_list               ompilargetest
>> rerun                 FALSE
>> slots                 8
>>
>> ------------------------------------------------------
>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=108209
>>
>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>     
>
>


