[GE users] Executed CPU time does not match share tree

Mark Hills mark.hills at framestore-cfc.com
Wed Apr 2 10:51:39 BST 2008


On Mon, 31 Mar 2008, Reuti wrote:

> On 31.03.2008 at 18:07, james.vanns at framestore-cfc.com wrote:
>> We are having a little trouble with our expectations of the Share Tree
>> policy. Where jobs differ in execution time per node, the total CPU time
>> used by each leaf node does not match the share (and shows no sign of
>> converging).
>> 
>> Here's a small demonstration:
>>
>>   root
>>   |
>>   +-- project_A: 500 shares
>>   |              8 minute jobs ("sleep 480")
>>   |
>>   +-- project_B: 500 shares
>>                  4 minute jobs ("sleep 240")
>
> A "sleep" will not generate much measurable CPU load (only when starting
> the job). Can you try a) a loop computing some floating-point
> nonsense; I use:
>
> int main(void){
>   volatile float x;  /* volatile, so the optimiser keeps the loop */
>   long  i,j;
>   for (j=0;j<=10000;j++)
>       for (i=0;i<=10000;i++)
>           x=3.1415926*i+i+i*i*2.7182818;
>   return 0;
> }
>
> or b) setting "ACCT_RESERVED_USAGE" in the "execd_params" in SGE's 
> configuration.
>
> Is the result the same?

Yes.

As our tests share hardware that is also running jobs from another 
system, we can't currently run a large CPU-bound test like (a) under high load.
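
For when option (a) becomes possible: a CPU-bound stand-in for the sleep jobs would need to run for a controlled length, so that the two projects still differ in execution time. Below is a minimal sketch of that idea, assuming clock() is a good enough measure of the CPU time the process has consumed. The function name burn_cpu is made up for this example.

```c
#include <time.h>

/* Spin on floating-point work until roughly `cpu_seconds` of CPU time,
 * as reported by clock(), has been consumed by this process.
 * Returns the last computed value; `x` is volatile so the compiler
 * cannot discard the loop. */
double burn_cpu(double cpu_seconds)
{
    volatile double x = 0.0;
    clock_t start = clock();
    long i = 0;

    while ((double)(clock() - start) / CLOCKS_PER_SEC < cpu_seconds) {
        /* Do a batch of work between clock() calls to keep overhead low. */
        for (int k = 0; k < 100000; k++, i++)
            x = 3.1415926 * i + i + (double)i * i * 2.7182818;
    }
    return x;
}
```

Wrapped in a main() that reads the duration from the command line, this could replace "sleep 480" and "sleep 240" with burn_cpu(480) and burn_cpu(240). Note that it measures CPU seconds, not wall-clock seconds, so on a loaded host the job would take longer than the nominal time.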

But we have done (b) and re-run the test with the ACCT_RESERVED_USAGE 
option enabled. Accounting for reserved usage is the behaviour we would 
require from SGE.

We made sure SGE was restarted so that every execd picked up the new 
setting. However, the results are virtually identical: the project 
with the longer-running jobs is still getting more CPU time.
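
To make the size of such a mismatch concrete, here is a small sketch of the arithmetic for the demonstration tree. It assumes, purely for illustration, a scheduler that gives every project the same number of job starts. That is a hypothesis about what the graphs might reflect, not a statement of how SGE's share tree policy works. The names struct project, usage_fraction and target_fraction are invented for this example.

```c
#include <stddef.h>

/* Hypothetical description of one leaf of the share tree in the test. */
struct project {
    const char *name;
    double shares;       /* shares assigned in the tree */
    double job_seconds;  /* run time of each of the project's jobs */
};

/* Fraction of total usage that project k accrues.
 * ASSUMPTION: every project starts the same number of jobs, so usage
 * is proportional to job length alone. */
double usage_fraction(const struct project *p, size_t n, size_t k)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += p[i].job_seconds;
    return p[k].job_seconds / total;
}

/* Fraction of total usage that project k is entitled to by its shares. */
double target_fraction(const struct project *p, size_t n, size_t k)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += p[i].shares;
    return p[k].shares / total;
}
```

For project_A (500 shares, 480 s jobs) and project_B (500 shares, 240 s jobs), target_fraction gives 0.5 for both, while usage_fraction gives 2/3 for project_A and 1/3 for project_B under that assumption.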

Graphs for test1 (the original test) and test6 (with 
ACCT_RESERVED_USAGE=true) are attached.

I plan to re-run the tests with SHARETREE_RESERVED_USAGE, although the man 
page seems to suggest that ACCT_RESERVED_USAGE would cover this already.

Are there any other configuration parameters we might need to consider to 
ensure the shares match the execution?

Mark

[Attachment: test6_cumulative.ps.gz (application/x-gzip, 5.2 KB)]
[Attachment: test1_cumulative.ps.gz (application/x-gzip, 5.1 KB)]
