[GE users] job distribution in every execution host issue

Reuti reuti at staff.uni-marburg.de
Mon Aug 13 23:31:44 BST 2007


On 13.08.2007 at 19:02, Benson Fung wrote:

> We have set up a grid engine infrastructure with 4 execution hosts
> in place, and we have submitted 200 jobs into the grid engine.
> Unfortunately, we found out that the number of jobs executed on each
> execution host is not evenly distributed, and the performance
> is very slow.

Do you mean the performance of your jobs, i.e. the applications, or
the distribution to the nodes by SGE? Which setting for the
performance of the scheduler did you choose during the install of the
qmaster? What do your queue definition (qconf -sq <queuename>) and
scheduler settings (qconf -ssconf) look like?

> I tried to execute all 200 jobs on one host to compare with 4
> execution hosts, and found that the performance of executing
> those 200 jobs on one host is better than that of the 200 jobs
> executed on 4 execution hosts.

You mean that one job (one run of the 200) needs, let's say, 1 hour
if all of them run on one machine, but more than 4 hours if the jobs
are spread around the cluster, so that the total wallclock time is
larger this way?
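
To make the arithmetic behind this question concrete, here is a rough
sketch. It assumes one slot per host and that jobs run in equal-length
"waves"; the 4.5 hour figure is only a made-up stand-in for "more than
4 hrs", and the function name is hypothetical, not anything from SGE.

```python
import math

def total_wallclock_hours(n_jobs, hours_per_job, hosts, slots_per_host=1):
    """Estimate total wallclock time when jobs run in equal-length waves."""
    concurrent = hosts * slots_per_host        # jobs running at the same time
    waves = math.ceil(n_jobs / concurrent)     # batches needed for all jobs
    return waves * hours_per_job

# Assumed numbers: 1 hour per job on one host, 4.5 hours per job on four hosts.
one_host = total_wallclock_hours(200, 1.0, hosts=1)
four_hosts = total_wallclock_hours(200, 4.5, hosts=4)
print(one_host, four_hosts)
```

Four hosts only lose against one host when the per-job runtime grows
by more than a factor of four, which is why the per-job timing matters
here.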

> Can anyone suggest if there is any configuration parameter we need
> to set up in the grid engine for improving the performance?
> BTW, please advise the difference between slot and queue.

A queue describes the settings and limits for jobs to run in. One of
the settings in the queue's definition is slots, i.e. how many jobs
may run in this queue on a particular host at once. Special care must
be taken if you have more than one queue per host, to avoid
oversubscription.
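
As an illustration of the oversubscription point, here is a small
sketch that sums the slots of all queues per host and compares the
total with the number of cores. The queue names, host names, slot
counts and core counts are invented for the example, and the helper is
hypothetical; it does not query SGE in any way.

```python
def oversubscribed_hosts(queue_slots, cores_per_host):
    """Return {host: total_slots} for hosts whose summed slots exceed cores."""
    totals = {}
    for per_host in queue_slots.values():
        for host, slots in per_host.items():
            totals[host] = totals.get(host, 0) + slots
    return {h: t for h, t in totals.items() if t > cores_per_host.get(h, 0)}

# Assumed setup: two queues defined on the same two 4-core hosts.
queue_slots = {
    "short.q": {"node1": 4, "node2": 4},
    "long.q": {"node1": 4, "node2": 0},
}
cores_per_host = {"node1": 4, "node2": 4}

result = oversubscribed_hosts(queue_slots, cores_per_host)
print(result)
```

With these numbers node1 could receive up to 8 jobs on 4 cores, while
node2 stays within its core count.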

-- Reuti
