[GE users] sort by sequence number

Reuti reuti at staff.uni-marburg.de
Wed Apr 2 12:12:30 BST 2008


Hi,

On 01.04.2008 at 16:35, Paul MacInnis wrote:
> Hi,
>
> We have some nodes that have extra features such as additional memory or
> low latency interconnects.  These nodes are part of the same cluster
> queues as all other nodes and can be selected by using various -l
> parameters.  We want to allow regular jobs to be scheduled on these
> nodes also but only when there are no slots available on the non-special
> nodes.

Are the other jobs requesting something like memory or disk space
which might already be used up on the apparently free nodes? Is there
any default request for the new jobs, so that they end up on the other
nodes because they need some resources?

-- Reuti

PS: Some sorting is also done in libs/sched/sge_select_queue.c:

          if (a->is_soft) {
             if (sconf_get_queue_sort_method() == QSM_LOAD)
                lPSortList(a->queue_list, "%I+ %I+ %I+",
                           QU_soft_violation, QU_host_seq_no, QU_seq_no);
             else
                lPSortList(a->queue_list, "%I+ %I+ %I+",
                           QU_soft_violation, QU_seq_no, QU_host_seq_no);

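To make the difference between the two branches concrete, here is a minimal standalone sketch of the same three-key ordering using plain qsort(). It is not SGE code: queue_t, cmp_load(), cmp_seqno() and sort_queues() are hypothetical stand-ins for the CULL list and lPSortList(), and it assumes that "%I+" means "ascending on this field".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queue instance; not the real CULL element. */
typedef struct {
    const char *name;
    int soft_violation; /* plays the role of QU_soft_violation */
    int seq_no;         /* plays the role of QU_seq_no */
    int host_seq_no;    /* plays the role of QU_host_seq_no */
} queue_t;

/* Three-way integer compare that cannot overflow. */
static int cmp_int(int a, int b) { return (a > b) - (a < b); }

/* Key order of the QSM_LOAD branch: violation, host_seq_no, seq_no. */
static int cmp_load(const void *pa, const void *pb)
{
    const queue_t *a = pa, *b = pb;
    int c = cmp_int(a->soft_violation, b->soft_violation);
    if (c == 0) c = cmp_int(a->host_seq_no, b->host_seq_no);
    if (c == 0) c = cmp_int(a->seq_no, b->seq_no);
    return c;
}

/* Key order of the else branch: violation, seq_no, host_seq_no. */
static int cmp_seqno(const void *pa, const void *pb)
{
    const queue_t *a = pa, *b = pb;
    int c = cmp_int(a->soft_violation, b->soft_violation);
    if (c == 0) c = cmp_int(a->seq_no, b->seq_no);
    if (c == 0) c = cmp_int(a->host_seq_no, b->host_seq_no);
    return c;
}

/* Sort ascending on all three keys; by_load selects the key order. */
void sort_queues(queue_t *q, size_t n, int by_load)
{
    assert(q != NULL || n == 0);
    qsort(q, n, sizeof *q, by_load ? cmp_load : cmp_seqno);
}
```

With by_load set to 0, a queue with seq_no 10 sorts ahead of one with seq_no 20 regardless of host_seq_no, provided both have the same soft_violation count. With by_load set to 1, host_seq_no decides first and seq_no only breaks ties.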

> We attempt to do this by assigning a low sequence number to ordinary
> nodes and a higher sequence number to the special nodes and requesting
> that the scheduler sort nodes by sequence number.
>
> I've often noticed ordinary jobs running on the special nodes when I know
> that there are and have been free slots on the non-special nodes.  The
> impression I've developed is that nodes are being selected by load and not
> by sequence number.
>
> Looking in file gridengine/source/daemons/schedd/scheduler.c in the
> older SGE source code directory tree for schedd as a separate daemon, or
> file gridengine/source/daemons/qmaster/sge_sched_thread.c in the
> current source code tree for sched as a thread in qmaster,
> there is this piece of code:
>
>    /*---------------------------------------------------------------------
>     * SORT HOSTS
>     *---------------------------------------------------------------------*/
>    /*
>       there are two possibilities for SGE administrators
>       selecting queues:
>
>       sort by seq_no
>          the sequence number from configuration is used for sorting
>
>       sort by load (using a load formula)
>          the least loaded queue gets filled first
>
>          to do this we sort the hosts using the load formula
>          because there may be more queues than hosts and
>          the queue load is identically to the host load
>
>    */
>    switch (queue_sort_method) {
>    case QSM_LOAD:
>    case QSM_SEQNUM:
>    default:
>
>       DPRINTF(("sorting hosts by load\n"));
>       sort_host_list(lists->host_list, lists->centry_list);
>
>
>       break;
>    }
>
>
> This doesn't look right.
>
> As written the switch/case statements seem to do nothing - sorting is
> always done by host load.  In spite of what the comments say,
> QSM_SEQNUM is ignored here.
>
> Is there some other place where the load sorted list is then grouped
> and ordered by sequence number?
>
> Paul
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net

