[GE users] Sort by sequence number question

Paul MacInnis macinnis at dal.ca
Wed Jul 11 12:42:05 BST 2007


On Wed, 11 Jul 2007, Lönroth Erik wrote:

> yes, it is set. Still no luck on this.
> 
> The only way I can force the damn slaves off the MASTER node is to remove the requested "PE" explicitly from the nodes. This is not what I want, but I just can't make it happen any other way. It simply ignores my sequence number altogether. I have recreated all queues and restarted the qmaster and scheduler, but no luck whatsoever.
> 
> Is there something else, outside of the general queue configuration and the cluster config, that influences how the "sequence number" is applied?

I would like to add our experience to this discussion.

We recently switched from SGE5.3 to 6.1.  We have 1G, 2G and 4G nodes in
our cluster.  If a job doesn't specify special memory requirements,
we want it scheduled to the smallest-memory machine available.

For 4 years with SGE5.3 this worked well.  We assigned sequence number
1965 to the 1G nodes, 2965 to the 2G nodes and 4965 to the 4G nodes.

With SGE6.1 we defined our queues with
seq_no  1965,[@2g.hg=2965],[@4g.hg=4965]

@2g.hg being the 2G host group nodes and @4g.hg being the 4G nodes.
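For anyone reading along, here is a minimal Python sketch of how I read that
seq_no line: a default value followed by bracketed per-host-group overrides.
It only illustrates the intended mapping and is not SGE's own parser. The host
names and the group membership table are made up for the example.

```python
import re

def parse_seq_no(spec):
    """Split 'default,[@group=value],...' into (default, {group: value})."""
    default, *overrides = spec.split(",")
    per_group = {}
    for item in overrides:
        m = re.fullmatch(r"\[(@?[^=\]]+)=(\d+)\]", item.strip())
        if m is None:
            raise ValueError(f"unrecognised override: {item!r}")
        per_group[m.group(1)] = int(m.group(2))
    return int(default), per_group

def seq_no_for(host, spec, hostgroups):
    """Return the override for the first group containing host, else the default."""
    default, per_group = parse_seq_no(spec)
    for group, value in per_group.items():
        if host in hostgroups.get(group, ()):
            return value
    return default

# Hypothetical host group membership; real groups come from the cluster config.
HOSTGROUPS = {"@2g.hg": {"node2g-01"}, "@4g.hg": {"node4g-01"}}
SPEC = "1965,[@2g.hg=2965],[@4g.hg=4965]"

for host in ("node1g-01", "node2g-01", "node4g-01"):
    print(host, seq_no_for(host, SPEC, HOSTGROUPS))
```

A host that belongs to neither group falls through to the default of 1965,
which is how the 1G nodes get the lowest sequence number.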

With qconf -msconf we defined:
queue_sort_method     seqno

qstat -F presents the queues correctly ordered by this seqno.  However
jobs are being scheduled to 2G and 4G nodes when there are 1G nodes
available!
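To state the expectation precisely, here is a small Python sketch of the
selection rule we assumed "queue_sort_method seqno" would give us: among queue
instances with a free slot, take the one with the lowest sequence number. This
models what we expected, not what the 6.1 scheduler actually does. The host
names, the free_slots field and the helper name are invented for illustration.

```python
def pick_by_seqno(instances):
    """Return the free queue instance with the lowest seq_no, or None."""
    free = [qi for qi in instances if qi["free_slots"] > 0]
    return min(free, key=lambda qi: qi["seq_no"], default=None)

# Hypothetical snapshot of the cluster: one 1G node is full, one is free.
instances = [
    {"host": "node4g-01", "seq_no": 4965, "free_slots": 2},
    {"host": "node2g-01", "seq_no": 2965, "free_slots": 1},
    {"host": "node1g-01", "seq_no": 1965, "free_slots": 1},
    {"host": "node1g-02", "seq_no": 1965, "free_slots": 0},
]

chosen = pick_by_seqno(instances)
print(chosen["host"] if chosen else "no free queue instance")
```

Under this rule a 2G or 4G node should only be picked once every 1G node is
full, which is the behaviour we had under 5.3.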

This never happened in SGE5.3!

It seems that in 6.1 either
1. "queue_sort_method  seqno" isn't working for queue selection, or
2. some other queue selection criterion overrides
   "queue_sort_method  seqno"

Any thoughts?

Paul
