[GE users] dry run and load thresholds

Dalibor.Tokic Dalibor.Tokic at avinci.de
Mon May 10 14:53:32 BST 2004



> On Mon, 10 May 2004 Dalibor.Tokic at avinci.de wrote:
> 
> > Hi!
> >
> > Thanks for all your responses!
> >
> > Let me explain my situation:
> > We have a grid here with about 100 PCs split into several PEs.
> > Every PE stands for a processor class (PE-Opteron, PE-Xeon,
> > PE-PIII, ...).
> >
> > So the idea is that a user first tries the most powerful PE
> > (Opteron). If the job can't run there immediately, a weaker PE
> > should be tried, and so on. This could also be scripted easily.
> > A request for "PE-*" won't work, because I don't know which
> > free PE will be chosen.
> 
> Though I guess it will not help you now, you might be interested
> to know that in 6.0, in case of a wildcard PE, the best fit is now
> chosen. Best fit in this case means:
> 
> (1) Resource allocation is optimized separately for each PE
>     assignment in question. For this the very same rule set is in
>     effect, depending on sched_conf(5) 'queue_sort_method', as used
>     with sequential assignments in 5.3.
> 
> (2) All possible job assignments for the multiple possible PEs are
>     then compared: the assignment that causes the smallest number
>     of soft request violations is chosen for the job.
> 
> In case the job was additionally submitted with slot ranges, the
> assignment that gets the job the largest number of slots is
> chosen (see sge_select_parallel_environment() in the sources).

That's definitely the best way. 6.0 is indeed very promising.
:-)
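
The best-fit rule described above can be sketched roughly as follows. This is only an illustration, not the code of sge_select_parallel_environment(): the struct and function names are made up, and the precedence between the two criteria is an assumption (slot count compared first, soft request violations as tie-breaker). Without slot ranges every candidate gets the same slot count, so the comparison then reduces to the smallest number of soft request violations.

```c
#include <stddef.h>

/* Hypothetical record for one candidate PE assignment (not an SGE type). */
struct pe_candidate {
    const char *pe_name;
    int slots;            /* slots the job would get in this PE */
    int soft_violations;  /* soft requests this assignment leaves unmet */
};

/* Returns nonzero if a is a better fit than b.
 * Assumption: more slots wins first, fewer soft violations breaks ties. */
static int better_fit(const struct pe_candidate *a,
                      const struct pe_candidate *b)
{
    if (a->slots != b->slots)
        return a->slots > b->slots;
    return a->soft_violations < b->soft_violations;
}

/* Returns the best candidate, or NULL if n is 0.
 * On an exact tie the earlier entry is kept. */
const struct pe_candidate *select_best_pe(const struct pe_candidate *c,
                                          size_t n)
{
    const struct pe_candidate *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (best == NULL || better_fit(&c[i], best))
            best = &c[i];
    }
    return best;
}
```

Called with one entry per PE matching "PE-*", select_best_pe() returns the assignment that would be picked under these assumptions.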


> >
> > So that's the reason for using the "dry-run": just check whether
> > the job will start and then go for it.
> > But I see now that the "just verify" option only checks for
> > inconsistencies in the job.
> >
> > The option "Start Job Immediately" is probably the solution.
> > Either the job will run now or not at all.
> > This way I can go from PE to PE until I reach one where the job
> > runs immediately.
> 
> Agreed.
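
A minimal sketch of that PE-by-PE fallback, for illustration only. It assumes that "Start Job Immediately" corresponds to `qsub -now y` and that qsub then exits with 0 only if the job could be dispatched right away; please check qsub(1) for your version. All function names are hypothetical, and pretend_only_xeon_is_free() is just a stand-in for trying the loop without a cluster.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Callback: runs one command line, returns nonzero if the job started. */
typedef int (*submit_fn)(const char *cmd);

/* Formats "qsub -now y -pe <pe> <slots> <script>" into buf.
 * Returns 0 on success, -1 if buf is too small. */
int build_qsub_command(char *buf, size_t size, const char *pe,
                       int slots, const char *script)
{
    int n = snprintf(buf, size, "qsub -now y -pe %s %d %s",
                     pe, slots, script);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Real submission through the shell. Assumption: exit status 0 means
 * the job was dispatched immediately. PE and script names must be
 * trusted, since they are passed to the shell unescaped. */
int submit_with_system(const char *cmd)
{
    return system(cmd) == 0;
}

/* Stand-in for testing: behaves as if only PE-Xeon had free slots. */
int pretend_only_xeon_is_free(const char *cmd)
{
    return strstr(cmd, "PE-Xeon") != NULL;
}

/* Tries the PEs in the given order (most powerful first).
 * Returns the index of the first PE that took the job, or -1. */
int try_pes_in_order(const char *const *pes, size_t n, int slots,
                     const char *script, submit_fn submit)
{
    char cmd[512];
    for (size_t i = 0; i < n; i++) {
        if (build_qsub_command(cmd, sizeof cmd, pes[i], slots, script) != 0)
            continue;  /* command line did not fit, skip this PE */
        if (submit(cmd))
            return (int)i;
    }
    return -1;
}
```

With the PE list { "PE-Opteron", "PE-Xeon", "PE-PIII" } and submit_with_system as the callback, the job ends up in the most powerful PE that can start it at once; a return value of -1 means no PE could.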
> 
> >
> > So I think there's no need to change the way the "dry-run" works,
> > although the additional option "-w V" suggested by Andreas would
> > be nice.
> >
> > Thanks again for all your responses.
> >
> >
> > Dalibor
> >
> >
> > > -----Original Message-----
> > > From: Andreas Haas [mailto:Andreas.Haas at Sun.COM]
> > > Sent: Saturday, 8 May 2004 11:16
> > > To: users at gridengine.sunsource.net
> > > Subject: Re: [GE users] dry run and load thresholds
> > >
> > >
> > > As outlined by Charu, the statement currently made by "-w v"
> > > returning "true" is that
> > >
> > >    "it is possible to run a job"
> > >
> > > whereas dry run with debitations in effect would change
> > > that statement into
> > >
> > >    "it is possible to dispatch the job now"
> > >
> > > in contrast to this the schedd_info information as returned
> > > by qstat -j gets you
> > >
> > >    "why wasn't it possible to dispatch the job with the last
> > >     scheduling interval?"
> > >
> > > Note that having qmaster do the "-w v" verification with
> > > debitations in effect would also cause the current utilization
> > > of consumables to be taken into account accordingly!
> > >
> > > ... with the outcome that the "-w v" statement could be perceived
> > > as erratic behaviour: it would change each time depending on the
> > > current cluster resource utilization. But Andy is
> > > certainly on the right track that this might be exactly the
> > > behaviour one wishes to see, and in fact it could be done with
> > > little effort by implementing yet another -w option argument
> > > such as "V" or something similar.
> > >
> > > Possibly we should go over to dev@ in case there is interest
> > > to deepen that debate ...
> > >
> > > Andreas
> > >
> > > On Fri, 7 May 2004, Rayson Ho wrote:
> > >
> > > > >I hope a programmer might help you to find the code section
> > > > >where you need to change qmaster, but filing an RFE that it
> > > > >should be possible to answer both "questions" (look at empty
> > > > >or current cluster) with the dry run option might be a good
> > > > >idea.
> > > >
> > > > Hi Dalibor,
> > > >
> > > > Just look at libs/sched/sge_complex_schedd.h:
> > > >
> > > >    QS_STATE_FULL
> > > >       All debitations caused by running jobs are in effect.
> > > >    QS_STATE_EMPTY
> > > >       We ignore all debitations caused by running jobs.
> > > >       Ignore all but static load values.
> > > >
> > > > And in daemons/qmaster/sge_job.c:
> > > >
> > > >          if (try_it) {
> > > >             int prev_dipatch_type = DISPATCH_TYPE_NONE;
> > > >
> > > >             /* imagine qs is empty */
> > > >             set_qs_state(QS_STATE_EMPTY);
> > > >
> > > >             ...
> > > >
> > > >             /* stop dreaming */
> > > >             set_qs_state(QS_STATE_FULL);
> > > >          }
> > > >
> > > > You can play around with the QS_STATE_EMPTY/QS_STATE_FULL
> > > > settings in sge_job.c. Ask further questions on the dev list...
> > > >
> > > > Rayson
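
To illustrate the difference between the two states quoted above, here is a small self-contained sketch. It is not taken from the Grid Engine sources: the enum, struct and function names are hypothetical, and a single integer consumable stands in for the real resource bookkeeping. In the "empty" view the debitations of running jobs are ignored, which answers "could this job run at all?"; in the "full" view they are subtracted, which answers "could this job be dispatched now?".

```c
/* Hypothetical mirror of QS_STATE_EMPTY / QS_STATE_FULL. */
enum qs_view { QS_VIEW_EMPTY, QS_VIEW_FULL };

/* One consumable resource, e.g. the slots of a queue. */
struct consumable {
    int capacity;  /* total configured amount */
    int debited;   /* amount booked by currently running jobs */
};

/* Amount that counts as available under the given view. */
int available(const struct consumable *c, enum qs_view view)
{
    if (view == QS_VIEW_EMPTY)
        return c->capacity;           /* imagine the cluster is empty */
    return c->capacity - c->debited;  /* all debitations in effect */
}

/* Nonzero if a request for `amount` fits under the given view. */
int request_fits(const struct consumable *c, int amount, enum qs_view view)
{
    return amount <= available(c, view);
}
```

For a queue with 8 slots of which 6 are in use, a 4-slot request fits in the empty view but not in the full view, which is the gap between what "-w v" reports and whether the job can start right now.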
> > > >
> > > >
> > > 
> ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: 
> users-unsubscribe at gridengine.sunsource.net
> > > > For additional commands, e-mail: 
> users-help at gridengine.sunsource.net



