[GE users] CPU usage by array jobs

Iwona Sakrejda isakrejda at lbl.gov
Thu Jun 28 21:49:09 BST 2007





Ryan Thomas wrote:
> Give yourself some credit Dan. You're actually much better than you
> think.
>
> According to Andreas this was fixed in 6.0u11 as it was a duplicate of
> issue 2222.  So not only has it been fixed, but it's been shipped.
>
> Thanks!
>   
Sounds great!
I am about to upgrade and was planning to move to 6.0u10, but the array job
issue is a big problem for my site. How stable is 6.0u11? Should I go for it
right away?

Thanks a lot,

iwona


> -----Original Message-----
> From: Dan.Templeton at Sun.COM [mailto:Dan.Templeton at Sun.COM] 
> Sent: Thursday, June 28, 2007 2:06 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] CPU usage by array jobs
>
> Ryan,
>
> Do note that this issue was opened 11 days ago.  We're good, but we're
> not that good.  The priority it has is the priority with which it was
> submitted (which is the default priority).  As you can see from the lack
> of additional comment beyond the initial bug report, we haven't
> evaluated or assigned it yet.  During the last 11 days, we've been
> working on getting the 6.0u11 release out the door.
>
> Daniel
>
> Ryan Thomas wrote:
>   
>> Someone opened issue #2298 to cover this
>> (http://gridengine.sunsource.net/issues/show_bug.cgi?id=2298).
>>
>> But it seems that this hasn't been given a very high priority.  
>>
>> I think that this is a major defect.  Array jobs dramatically increase
>> the scalability of the scheduler and they are also very convenient for
>> all my users.
>>
>> Perhaps if more people are a little more vocal about this being an
>> important issue it will get more attention.
>>  
>> -----Original Message-----
>> From: Iwona Sakrejda [mailto:isakrejda at lbl.gov] 
>> Sent: Wednesday, June 27, 2007 12:01 PM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] CPU usage by array jobs
>>
>> Have any of the experts looked at the following problem? I wonder if you
>> need more evidence and, if so, of what kind. This problem is really making
>> it impossible to get users to stick with array jobs.
>>
>> Thanks a lot,
>>
>> Iwona
>>
>>
>> Pascal Wassam wrote:
>>> I just conducted a test run.
>>>
>>> My notes:
>>>
>>> 4 nodes totaling 7 CPUs on all.q; each node has 4 slots in the queue
>>> config.
>>> SGE 6.1. All jobs are identical: cpuburn, set to run for 5 minutes.
>>>
>>> scheduler conf:
>>>
>>> policy_hierarchy OS
>>> weight_tickets_share 100000
>>>
>>> share tree:
>>>
>>> id=0
>>> name=template
>>> type=0
>>> shares=0
>>> childnodes=1
>>> id=1
>>> name=default
>>> type=0
>>> shares=100
>>> childnodes=NONE
>>>
>>> queue is disabled, and empty.
>>> 1000 individual jobs are queued as user pascal
>>> 1 array job of 1000 subjobs is queued as user ben
>>>
>>> usage is cleared (qconf -clearusage)
>>>
>>> at the starting line:
>>>
>>> Queued per user:
>>>   1000 pascal qw
>>>   1000 ben qw
>>>
>>> bang: qmod -e all.q
>>>
>>> 1 minute in:
>>>
>>> Running per user:
>>>      8 pascal r
>>>      8 ben r
>>> Queued per user:
>>>    992 pascal qw
>>>    992 ben qw
>>>
>>> A while later:
>>>
>>> Running per user:
>>>     10 pascal r
>>>      1 ben r
>>> Queued per user:
>>>    991 ben qw
>>>    973 pascal qw
>>>
>>> And it continues this way:
>>>
>>> Running per user:
>>>      8 pascal r
>>>      2 ben r
>>> Queued per user:
>>>    987 ben qw
>>>    952 pascal qw
>>>
>>> -Pascal
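
To put numbers on the snapshots above, here is a minimal Python sketch that computes each user's fraction of the running tasks per snapshot. The counts are copied from the test run; the names `snapshots` and `running_share` are illustrative and not part of SGE. It assumes that, with both users falling under the same `default` share-tree node, roughly equal fractions would be expected.

```python
# Running-task counts per user, copied from the snapshots in the test run above.
snapshots = [
    ("1 minute in", {"pascal": 8, "ben": 8}),
    ("a while later", {"pascal": 10, "ben": 1}),
    ("later still", {"pascal": 8, "ben": 2}),
]


def running_share(counts):
    """Return each user's fraction of the running tasks in one snapshot."""
    total = sum(counts.values())
    return {user: n / total for user, n in counts.items()}


# Print one line per snapshot with each user's percentage of running tasks.
for label, counts in snapshots:
    shares = running_share(counts)
    line = ", ".join(f"{user}: {share:.0%}" for user, share in shares.items())
    print(f"{label}: {line}")
```

In the two later snapshots the owner of the array job holds about 9% (1 of 11) and 20% (2 of 10) of the running tasks, against 50% at the start.
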
>>>
>>> Pascal Wassam wrote:
>>>> I would like to second all the experiences Iwona has written about
>>>> here. I will also attempt to conduct some tests and present something
>>>> that is repeatable for developers to play with.
>>>>
>>>> -Pascal
>>>>
>>>> Iwona Sakrejda wrote:
>>>>> Since this is a somewhat different problem, I gave it a new title.
>>>>>
>>>>> Rayson Ho wrote:
>>>>>>> Another problem I am having is that array jobs seem to be overcharged
>>>>>>> when the usage is calculated (could you point me to the section of
>>>>>>> code that deals with it? I'll be happy to read it). It looks like each
>>>>>>> array job gets the CPU usage of the whole array. Array jobs are very
>>>>>>> helpful, but users are fleeing from them in droves.
>>>>>> How to reproduce it? Is it a parallel or serial job?
>>>>> It happens to serial jobs. I have not done thorough studies yet, but
>>>>> I see that usage for owners of array jobs greatly exceeds what I
>>>>> estimate it should be.
>>>>>
>>>>> Also, when I clear usage, only the usage from that moment on should be
>>>>> taken into account, right? Yet I see that a user who has an array job
>>>>> immediately gets usage that exceeds what he has running at that point.
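
One way to make this observation checkable is to compare the usage reported for a user against an upper bound: for serial, single-threaded tasks, the CPU seconds accumulated since the usage was cleared cannot exceed the number of running tasks multiplied by the elapsed wall-clock time. The Python sketch below encodes that bound. The function names and the sample numbers are hypothetical, the reported CPU figure would have to be read from the site's own usage data, and the check assumes the usage in question is plain CPU time.

```python
def max_plausible_cpu_seconds(running_tasks, elapsed_seconds):
    """Upper bound on CPU seconds, assuming each task uses at most one core."""
    return running_tasks * elapsed_seconds


def is_overcharged(reported_cpu_seconds, running_tasks, elapsed_seconds):
    """True if the reported usage exceeds what the running tasks could consume."""
    return reported_cpu_seconds > max_plausible_cpu_seconds(
        running_tasks, elapsed_seconds
    )


# Hypothetical numbers for illustration only: 8 serial tasks, 60 s after clearing.
print(is_overcharged(reported_cpu_seconds=3000, running_tasks=8, elapsed_seconds=60))
print(is_overcharged(reported_cpu_seconds=450, running_tasks=8, elapsed_seconds=60))
```

Here `running_tasks` should be the largest number of tasks the user had running at any time during the interval, so that tasks which finished along the way are still covered by the bound.
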
>>>>>
>>>>> Another shred of evidence is that when they switch from array jobs to
>>>>> individual jobs, they get a throughput that they feel is consistent
>>>>> with their share.
>>>>> If they use arrays, their throughput dives.
>>>>>
>>>>> I'll try to come up with a clean example with numbers.  This is in
>>>>> 6.0u4, so since I have to upgrade anyway, I was postponing more studies,
>>>>> hoping that the upgrade would fix the problem. On the other hand, it
>>>>> might not, and it really increases the load when I get 1000 jobs
>>>>> instead of 1 array job with 1000 members.
>>>>>
>>>>> And today I noticed the discussion about shares and CPU consumption, so
>>>>> I hoped the right expert might be watching and it would be easy for
>>>>> him to look at it.
>>>>>
>>>>> Iwona
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



