[GE users] Fair share config, fill-up hosts and max user slots
minet at cism.ucl.ac.be
Wed Jan 11 09:51:27 GMT 2006
I am a bit puzzled... in order to work around the apparent bug we are discussing
(fair-share usage not being taken into account), I have defined a functional tree
(all users with an equal number of shares). Since then, usage seems to be
accounted for in the fair-share tree/policy (actual resource share and combined
usage are calculated and displayed properly; targeted usage remains 0)! I am not
sure of the causality between the two, but while I repeatedly got 0 as the value
for stckt (with qstat -ext), this is no longer the case, and I am not aware of
having changed anything else in the configuration that could affect the fair-share
policy/usage (although I did also set enforce_project to false in the cluster
configuration [it was true before], and set the default project to NONE for all
users, whereas it was set to specific projects before)...
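For reference, the changes described above correspond roughly to the following
settings (the user name jdoe is a placeholder):

    # give every user the same number of functional shares (any identical value)
    qconf -muser jdoe        # set: fshare 100
    # stop forcing jobs into projects
    qconf -mconf             # set: enforce_project false
    qconf -muser jdoe        # set: default_project NONE

Note that functional tickets only take effect if weight_tickets_functional is
non-zero in the scheduler configuration.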
Now, there is still a problem with usage accounting: we have an InfiniBand
interconnect, and tight integration doesn't work for our MPI jobs. I have
looked in the How-Tos and found a package for IBA tight integration, but the
installed version of mvapich is earlier than the one required to apply the
patch to mpirun_rsh.c. So for those jobs, resources are not accounted for
adequately... but this is a wider problem than SGE itself. I am working on
getting mvapich upgraded.
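For what it's worth, tight integration hinges on the parallel environment
starting its slave tasks under sge_execd control (via qrsh -inherit), which is
what makes their CPU/memory usage visible to SGE accounting. A tightly
integrated PE would look something like this (PE name and script paths are
placeholders, modelled on the mpi example shipped with SGE):

    pe_name            mvapich
    slots              999
    start_proc_args    /opt/sge/mpi/startmpi.sh -catch_rsh $pe_hostfile
    stop_proc_args     /opt/sge/mpi/stopmpi.sh
    allocation_rule    $fill_up
    control_slaves     TRUE
    job_is_first_task  FALSE

control_slaves TRUE is the key line; an mpirun that spawns its tasks over plain
rsh/ssh bypasses it, which is exactly the unpatched mvapich problem described
above.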
Finally, the question I had about pending jobs remains valid: if some jobs are
sitting in the scheduler's waiting area (not being dispatched because maxujobs
is reached, or because resources are not available), shouldn't the scheduler
also display ticket/urgency and priority information for those jobs?
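Whether pending jobs carry ticket information at all depends (in SGE 6.x, at
least) on the scheduler parameter report_pjob_tickets, so that is worth
checking:

    qconf -ssconf | grep report_pjob_tickets
    # TRUE  -> qstat -ext reports tickets for pending jobs
    # FALSE -> pending jobs show 0 tickets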
Thanks again for your help
Stephan Grell - Sun Germany - SSG - Software Engineer wrote:
> Hi Jean-Paul,
> we do have a bug with displaying the sharetree data, but I could not find
> any issue with the actual sharetree computation; those tests were all
> successful. How long did you wait? Did you wait until the scheduler made it
> run? The information is only available when the scheduler is running, and
> only after it has finished its first run.
> Could you give me your entire configuration related to this, i.e.:
> - qstat -prio
> - qstat -ext
> - the user configuration involved
> - the project configuration involved
> - the resource configuration (qconf -sc)
> - qconf -sss output
> - the sharetree config.
> So far I cannot replicate your issue. Did you build your binaries yourself?
> Which archs are you using?
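For reference, the commands that produce the information requested above
(user and project names are placeholders):

    qstat -prio
    qstat -ext
    qconf -suserl; qconf -suser jdoe        # user configuration
    qconf -sprjl; qconf -sprj myproject     # project configuration
    qconf -sc                               # resource (complex) configuration
    qconf -sss                              # scheduler state
    qconf -sstree                           # sharetree configuration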
> Jean-Paul Minet wrote on 01/06/06 12:00:
>>Trying to work around the possible fair-share bug (is it confirmed?), I am
>>trying to combine the functional policy and urgency (wait time only). I have
>>the scheduler config (with slot urgency set to 0) detailed below. When I do
>>"qstat -prio", all pending jobs report 0 for "nurg" and "ntckts", whatever
>>their waiting time. Is that the expected behavior?
>>Output of qconf -ssconf:
>>Stephan Grell - Sun Germany - SSG - Software Engineer wrote:
>>>I just did the test with the env you describe. I am sure that you found a
>>>bug: in my tests, the targeted resource share is always 0, as you describe
>>>it. However, the actual resource share is reported correctly.
>>>Jean-Paul Minet wrote:
>>>>Our bi-proc cluster is used for sequential, OpenMP and MPI jobs. We want
>>>>to:
>>>>1) use fair-share scheduling, with equal shares for all users.
>>>>I have disabled Priority and Urgency scheduling, and set the policy
>>>>hierarchy to S:
>>>>lemaitre ~ # qconf -ssconf
>>>>Under the share tree policy, I have only defined a default leaf, under
>>>>which all users appear, but "Actual resource share" and "Targeted resource
>>>>share" remain 0 for all users, as if actual usage were not taken into
>>>>account. This is confirmed by jobs being dispatched in FIFO order rather
>>>>than according to past usage. What's wrong?
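For reference, a sharetree consisting of a single default leaf looks roughly
like this in qconf -sstree output (share values are placeholders):

    id=0
    name=Root
    type=0
    shares=1
    childnodes=1
    id=1
    name=default
    type=0
    shares=1000
    childnodes=NONE

Sharetree tickets are only granted if weight_tickets_share is non-zero in the
scheduler configuration.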
>>>>2) limit the total number of CPUs/slots used by any user at any time.
>>>>MaxJobs/User doesn't help, since a single MPI job can use many slots and
>>>>therefore cannot be compared to a sequential job. How can we implement
>>>>this?
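The per-user slot cap asked about in 2) cannot be expressed with the max-jobs
limits; in Grid Engine 6.1 and later (so not in the release discussed here) it
can be written as a resource quota set, along these lines (name and limit value
are placeholders):

    qconf -arqs
    {
       name         max_user_slots
       description  cap the total slots any single user may occupy
       enabled      TRUE
       limit        users {*} to slots=16
    }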
>>>>3) fill up hosts with sequential jobs, to leave as many empty nodes as
>>>>possible for OpenMP and MPI jobs. I have read Stephen G.'s web log: am I
>>>>correct in assuming that I have to define complex_values slots=2 for each
>>>>of the bi-proc hosts (we don't want more jobs than CPUs) and that,
>>>>thereafter, the scheduler will select the hosts with the fewest available
>>>>slots (setting of course queue_sort_method=load and load_formula=slots)?
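Spelled out, the fill-up setup described in 3) would be (the hostname is a
placeholder):

    # on every two-CPU execution host:
    qconf -me node001        # set: complex_values slots=2
    # in the scheduler configuration:
    qconf -msconf            # set: queue_sort_method  load
                             #      load_formula       slots

Hosts are then ordered by ascending load_formula value, so a host with one free
slot is picked before a host with two, and sequential jobs fill partly used
nodes first.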
>>>>Thanks for any help
CISM Administrator - Institut de Calcul Intensif et de Stockage de Masse
Université Catholique de Louvain
Tel: (32) (0)10.47.35.67 - Fax: (32) (0)10.47.34.52
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net