[GE users] Wallclock accounting and parallel environments

Daire Byrne Daire.Byrne at framestore-cfc.com
Thu Apr 17 09:39:03 BST 2008



I would imagine that the use of "wallclock" accounting is unimportant here. The accounting is correct; it is the ordering of the pending list, which decides what jobs run next, that seems to be the root cause. This behaviour must exist for everyone using the share-tree with parallel environments. The share-tree algorithm just doesn't take into account the number of slots each job is asking for. I suppose for the most part it goes unnoticed as long as each project has an even mix of jobs asking for different slot quantities.
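
To put rough numbers on this, here is a minimal Python sketch using only the figures from my original message quoted below. It compares the observed slot split with what you would get under one simplifying assumption of mine: that the scheduler keeps an equal number of running jobs per project and ignores slots per job. The function names are made up for illustration, and this is not how the SGE scheduler is actually implemented.

```python
# Slots requested per job in each project, taken from the qsub commands
# in the quoted message (no PE = 1 slot, then "-pe pe-threaded 2" and "4").
slots_per_job = {"A": 1, "B": 2, "C": 4}

# Steady-state slots in use per project, as reported in the quoted message.
observed_slots = {"A": 921, "B": 1082, "C": 1692}


def shares(values):
    """Return each project's fraction of the total."""
    total = sum(values.values())
    return {name: value / total for name, value in values.items()}


def slots_if_equal_job_counts(per_job, jobs_per_project=1):
    """ASSUMPTION (illustrative only): every project runs the same number
    of jobs, so slot usage simply scales with slots per job."""
    return {name: n * jobs_per_project for name, n in per_job.items()}


observed = shares(observed_slots)
job_balanced = shares(slots_if_equal_job_counts(slots_per_job))

for name in sorted(slots_per_job):
    print(
        f"{name}: observed {observed[name]:.1%}, "
        f"equal job counts {job_balanced[name]:.1%}, target 33.3%"
    )
```

With these inputs the observed split comes out at roughly 25/29/46%, which sits between the 33/33/33% target and the roughly 14/29/57% you would see if only job counts were balanced.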

Daire


----- "Daire Byrne" <Daire.Byrne at framestore-cfc.com> wrote:

> Hi,
> 
> We want to use wallclock accounting such that our share tree
> essentially defines the share of slots between projects and not the
> CPU time used. We have enabled the following params in the conf:
> 
>   execd_params                 ACCT_RESERVED_USAGE=true \
>                                SHARETREE_RESERVED_USAGE=true
> 
> Everything seems to work as expected for normal single-slot jobs (no
> PEs). We then set up a PE to deal with SMP (multi-threaded) jobs like
> this:
> 
>   pe_name           pe-threaded
>   slots             100000
>   user_lists        NONE
>   xuser_lists       NONE
>   start_proc_args   /bin/true
>   stop_proc_args    /bin/true
>   allocation_rule   $pe_slots
>   control_slaves    FALSE
>   job_is_first_task FALSE
>   urgency_slots     min
> 
> We created three projects in the share-tree (A, B, C) all with equal
> shares (33/33/33%) and submitted lots of jobs like so:
> 
>   ( for i in `seq 1 20000`; do qsub -q farm.q@@hosts-rr -P A -N proj_A \
>       -b y 'sleep 240'; done ) &
>   ( for i in `seq 1 20000`; do qsub -q farm.q@@hosts-rr -P B -N proj_B \
>       -pe pe-threaded 2 -b y -S /bin/sh 'sleep 240'; done ) &
>   ( for i in `seq 1 20000`; do qsub -q farm.q@@hosts-rr -P C -N proj_c \
>       -pe pe-threaded 4 -b y -S /bin/sh 'sleep 240'; done ) &
> 
> After an hour or so (halflife 1 hour) the system reaches a steady
> state. However, the total slots in use by each project look a bit
> odd:
> 
>   A=921, B=1082, C=1692 (~3,780 slots available)
> 
> The share tree reports the following as the "actual" resource usage:
> A=20%, B=35%, C=45% which seems reasonable considering the steady
> state slot usage. But why is the system not trying to correct for the
> imbalance? The share tree knows the usage is imbalanced but it is like
> SGE is not aware that dispatching a job belonging to "C" (4 slots) is
> different to dispatching a job from A (1 slot). You can see this is
> the case when you first start the test as all three projects initially
> have the same number of jobs running but are actually using very
> different numbers of slots.
> 
> I looked at "control_slaves" and "job_is_first_task", but they just
> made the share-tree report the "actual" usage as 33/33/33%, so the
> number of running jobs was always split 33/33/33% too. Obviously the
> slot count is very different (1:2:4).
> 
> Am I missing something obvious here, or does SGE not take into account
> the requested slot count when ordering and dispatching jobs? We are
> using GE 6.1u3.
> 
> Daire
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
