[GE users] Job fragmentation with SGE 6.0u8

Reuti reuti@staff.uni-marburg.de
Mon Jul 28 18:29:52 BST 2008


On 28.07.2008, at 18:29, Alessio Comisso wrote:

> On 28 Jul 2008, at 17:14, Reuti wrote:
>
> Hello,
>> Hi,
>>
>> On 28.07.2008, at 17:28, Alessio Comisso wrote:
>>
>>> Dear all,
>>> This is the first time I am writing to this forum; I hope it is
>>> the right place to ask for support.
>>
>> yes - sure :-)
>>
>>> I have a cluster running MPI jobs, but it happens that several
>>> jobs are scheduled on the same node. For instance, qstat -f gives
>>>
>>> infini.q@node089.beowulf.clust BIP   4/4       4.00     lx26-amd64
>>>   10568 0.53560 x88-PTCDA- toton        r     07/27/2008 08:50:15     2
>>>   10632 0.62232 test-PAW   levita       r     07/28/2008 12:53:39     2
>>
>> You defined 4 slots on this node, and SGE scheduled 4 tasks to it.
>>
>>>
>>> This inhomogeneous usage is not optimal, as the CPU usage is
>>> very low:
>>>
>>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
>>> 22190 levita    25   0  289m 182m 5772 R 53.3  2.3 108:12.66 pwcapablanca.x
>>> 22191 levita    25   0  289m 189m  13m R 51.3  2.4 110:20.80 pwcapablanca.x
>>> 12688 toton     25   0  459m 411m  13m R 48.6  5.2   1031:20 siesta
>>> 12689 toton     25   0  207m 147m 6072 R 46.6  1.9   1037:39 siesta
>>
>> So it looks like the machine has only 2 cores, not 4. What must be
>> adjusted, then, is the number of slots in the queue configuration:
>
> No, no, the machine has 4 cores. I think the communication patterns
> are limiting the performance. If the jobs are homogeneous, you get
> 4 processes at 99.9% CPU.

On the one hand, this would explain why 4 slots are defined for this
node. But then everything seems to be working as configured: do you
want two cores to sit idle on this machine? What is the
allocation_rule in the requested PE?
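
You can inspect the PE with qconf -sp. A sketch of the relevant
lines, assuming your jobs request a PE named "mpi" (a placeholder for
whatever name you pass to -pe):

   $ qconf -sp mpi
   pe_name            mpi
   slots              999
   allocation_rule    $fill_up
   control_slaves     TRUE
   job_is_first_task  FALSE

With $fill_up the scheduler packs tasks onto partially used hosts,
which matches what you see; $round_robin would spread them across
hosts instead, and a fixed integer forces exactly that many tasks
per host.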

One way to get a complete node for a job is to set "allocation_rule
$pe_slots" in the PE, and to request all the memory in the machine
via a proper setup of "virtual_free" or "h_vmem", whichever you
prefer. As all the memory is then already allocated to one job,
nothing else will be scheduled there, even though slots are still
free.
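
A minimal sketch of such a setup, again with the placeholder PE name
"mpi" and assuming nodes with 8 GB of RAM (adjust names and values
to your cluster):

   # 1. Make the memory complex consumable (qconf -mc); the h_vmem
   #    line should read:
   #name   shortcut  type    relop  requestable  consumable  default  urgency
   h_vmem  h_vmem    MEMORY  <=     YES          YES         0        0

   # 2. Tell SGE how much memory each exec host offers (qconf -me node089):
   complex_values        h_vmem=8G

   # 3. In the PE definition (qconf -mp mpi):
   allocation_rule       $pe_slots

   # 4. Submit; -l requests are multiplied by the slot count, so
   #    4 slots x 2G = 8G books the whole node:
   qsub -pe mpi 4 -l h_vmem=2G myjob.sh

Note that h_vmem also sets a hard limit that is enforced on the
processes, while virtual_free works the same way as a pure
bookkeeping consumable without enforcing any limit.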

-- Reuti


> Alessio
>
>>
>> qconf -mq infini.q
>>
>> and watch out for the line "slots", which should be 2 (at least
>> for the node "node089.beowulf.cluster").
>>
>> -- Reuti
>>
>>
>>>
>>>
>>> How can I tell the nodes to accept only a single job (or a single
>>> user)? The installed version does not support quotas.
>>>
>>> Kind Regards
>>> Alessio


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@gridengine.sunsource.net
For additional commands, e-mail: users-help@gridengine.sunsource.net



