Re: [GE users] dry run and load thresholds

Andreas Haas Andreas.Haas at Sun.COM
Mon May 10 14:45:11 BST 2004


On Mon, 10 May 2004 Dalibor.Tokic at avinci.de wrote:

> Hi!
>
> Thanks for all your responses!
>
> Let me explain my situation:
> We have a grid here with about 100 PCs, split into several PEs.
> Each PE stands for a processor class (PE-Opteron, PE-Xeon, PE-PIII, ...).
>
> So the idea is that a user first tries the most powerful PE (Opteron). If the job can't run there immediately,
> a weaker PE should be tried, and so on. This could also be scripted easily. A request for "PE-*" won't work, because I don't know which free PE will be chosen.

Though I guess it will not help you now, you might be interested
to know that in 6.0, in the case of a wildcard PE request, the best
fit is now chosen. "Best fit" in this case means:

(1) Resource allocation is optimized separately for each PE
    assignment in question. For this, the very same rule set is in
    effect, depending on sched_conf(5) 'queue_sort_method', as is used
    with sequential assignments in 5.3.

(2) All possible job assignments across the candidate PEs are then
    compared: the assignment that causes the smallest number of soft
    request violations is chosen for the job.

In case the job was additionally submitted with slot ranges, the
assignment that gets the job the largest number of slots is
chosen (see sge_select_parallel_environment() in the sources).
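
As a rough illustration only (Python, not the actual scheduler code), the comparison described above could be sketched like this. The Assignment type, its fields and pick_best() are made up for the example, and I am assuming that soft request violations are compared first and the slot count only breaks ties; the real logic is in sge_select_parallel_environment().

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    """One candidate placement of the job in a concrete PE (hypothetical type)."""
    pe: str
    soft_violations: int  # number of soft requests this placement cannot honour
    slots: int            # slots the job would get from its slot range


def pick_best(candidates):
    """Return the candidate with the fewest soft request violations.

    Among candidates with equal violations, the one granting the most
    slots wins. This ordering is an assumption made for the sketch.
    """
    if not candidates:
        return None
    return min(candidates, key=lambda a: (a.soft_violations, -a.slots))


candidates = [
    Assignment("PE-Opteron", soft_violations=1, slots=8),
    Assignment("PE-Xeon", soft_violations=0, slots=4),
    Assignment("PE-PIII", soft_violations=0, slots=6),
]
best = pick_best(candidates)
print(best.pe)
```

With these made-up numbers PE-Opteron loses because of its soft request violation, and PE-PIII beats PE-Xeon on slots.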

>
> So that's the reason for using the "dry run": just check whether the job will start, and then go for it.
> But I see now that the "just verify" option only checks for inconsistencies in the job.
>
> The "Start Job Immediately" option is probably the solution: either the job runs now or not at all.
> This way I can go from PE to PE until I reach one where the job runs immediately.

Agreed.
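
A minimal sketch of such a PE-by-PE loop, in Python, might look as follows. It assumes that "Start Job Immediately" corresponds to "qsub -now y" on the command line and that qsub exits with a non-zero status when the job could not be started at once; please check qsub(1) for your release. The function names, the PE names and job.sh are only examples.

```python
import subprocess


def submit_now(pe, slots, script, run=subprocess.run):
    """Try to start `script` immediately in `pe`.

    Returns True if qsub exits with status 0. `run` is injectable so the
    loop can be tried out without a real cluster.
    """
    cmd = ["qsub", "-now", "y", "-pe", pe, str(slots), script]
    return run(cmd).returncode == 0


def first_pe_that_runs(pes, slots, script, run=subprocess.run):
    """Walk the PEs in the given order (strongest first).

    Returns the name of the first PE in which the job started, or None
    if no PE could take the job right now.
    """
    for pe in pes:
        if submit_now(pe, slots, script, run=run):
            return pe
    return None
```

It would be called as first_pe_that_runs(["PE-Opteron", "PE-Xeon", "PE-PIII"], 4, "job.sh"), and a None result means that no PE could start the job immediately.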

>
> So I think there's no need to change the way the "dry run" works, although the additional option "-w V" suggested by
> Andreas would be nice.
>
> Thanks again for all your responses.
>
>
> Dalibor
>
>
> > -----Original Message-----
> > From: Andreas Haas [mailto:Andreas.Haas at Sun.COM]
> > Sent: Saturday, 8 May 2004 11:16
> > To: users at gridengine.sunsource.net
> > Subject: Re: [GE users] dry run and load thresholds
> >
> >
> > As outlined by Charu, currently the statement made by "-w v"
> > returning "true" is
> >
> >    "it is possible to run a job"
> >
> > whereas dry run with debitations in effect would change
> > that statement into
> >
> >    "it is possible to dispatch the job now"
> >
> > In contrast to this, the schedd_info information as returned
> > by qstat -j tells you
> >
> >    "why wasn't it possible to dispatch the job within the last
> >     scheduling interval?"
> >
> > Note that having qmaster do "-w v" verification with
> > debitations in effect would also cause the current utilization
> > of consumables to be taken into account accordingly!
> >
> > ... with the outcome that the "-w v" statement could be perceived
> > as erratic behaviour: it would change each time depending on the
> > current cluster resource utilization. But Andy is
> > certainly on the right track that this might be exactly the
> > behaviour one wishes to see, and in fact it could be done with
> > little effort by implementing yet another -w option argument
> > such as "V" or something similar.
> >
> > Possibly we should move over to dev@ in case there is interest
> > in deepening that debate ...
> >
> > Andreas
> >
> > On Fri, 7 May 2004, Rayson Ho wrote:
> >
> > > >I hope a programmer might help you to find the code section where
> > > >you need to change qmaster, but filing an RFE that it should be
> > > >possible to answer both "questions" (look at empty or current
> > > >cluster) with the dry run option might be a good idea.
> > >
> > > Hi Dalibor,
> > >
> > > Just look at libs/sched/sge_complex_schedd.h:
> > >
> > >    QS_STATE_FULL
> > >       All debitations caused by running jobs are in effect.
> > >    QS_STATE_EMPTY
> > >       We ignore all debitations caused by running jobs.
> > >       Ignore all but static load values.
> > >
> > > And in daemons/qmaster/sge_job.c:
> > >
> > >          if (try_it) {
> > >             int prev_dipatch_type = DISPATCH_TYPE_NONE;
> > >
> > >             /* imagine qs is empty */
> > >             set_qs_state(QS_STATE_EMPTY);
> > >
> > >             ...
> > >
> > >             /* stop dreaming */
> > >             set_qs_state(QS_STATE_FULL);
> > >          }
> > >
> > > You can play around with the QS_STATE_EMPTY/QS_STATE_FULL settings in
> > > sge_job.c. Ask further questions on the dev list...
> > >
> > > Rayson
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > > For additional commands, e-mail: users-help at gridengine.sunsource.net
> > >
> > >




