[GE users] Fair share scheduling problem

Stephan Grell - Sun Germany - SSG - Software Engineer stephan.grell at sun.com
Tue Jan 3 10:21:32 GMT 2006



Jean-Paul Minet wrote On 01/03/06 10:56,:

>Stephan,
>
>I am using 6.0u6 on Solaris 10 (Sun Fire V440).  The compute nodes are dual-Opteron
>(SUSE 9.0).  The cluster has just been handed over by Sun, who did the default SGE
>install.  I am now trying to set up a simple fairshare policy (all
>users having equal shares over a period of time).
>
>Note that host sorting for scheduling is based on "slots" (also defined as 
>consumable resources for exec hosts) to fill up hosts as much as possible with 
>sequential jobs (so as to leave empty nodes for MPI/OpenMP jobs).
>
>We don't use projects, just a flat "default" user tree, so we make no regular
>updates or modifications.
>
>See my previous mail (with sched config, share tree config and qstat) for more info.
>
Could you please resend the information? I did not receive the files.
Looking at all the emails I have, I can only see that you have an issue.


>Which other info would be helpful/meaningful?
>
Nothing so far....

Stephan

>
>Thanks
>
>Jean-Paul
>
>Stephan Grell - Sun Germany - SSG - Software Engineer wrote:
>
>>Hi Jean-Paul,
>>
>>Which version are you using? We had some bugs in that area. And to answer
>>your question: yes, they are related. The question is why the user/project
>>updates are failing.
>>
>>Do you reconfigure your users/projects on a regular basis?
>>
>>Could you give us some insight into your configuration?
>>
>>Kind Regards,
>>Stephan
>>
>>Jean-Paul Minet wrote On 01/02/06 17:38,:
>>
>>>Hello all,
>>>
>>>I have set up SGE to use fair share scheduling, with a share tree composed
>>>of a "default" leaf and a "test" node with two users.  Several users (falling
>>>under the test node or the default leaf) have running jobs.  For all of them, "Actual
>>>Resource Share", "Targeted Resource Share" and "Combined Usage" remain at 0.
>>>Also, "qstat -ext" shows 0 in the stckt column for all jobs.
>>>
>>>Note that I am getting messages like
>>>
>>>01/02/2006 12:03:58|qmaster|lmsp|E|orders user/project version (63119) is not 
>>>uptodate (63120) for user/project "bricteux"
>>>
>>>in the qmaster messages file.  Could this be linked to the problem?  How can I
>>>move forward?
>>>
>>>Any help will be appreciated.
>>>
>>>Regards
>>>
>>>Jean-Paul
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>For additional commands, e-mail: users-help at gridengine.sunsource.net

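For anyone chasing the same "is not uptodate" error, it can help to know how often it occurs and for which users. The following is only a minimal sketch, not an SGE tool: it assumes each entry in the qmaster messages file sits on a single line with five pipe-separated fields, as in the message quoted above (the wrap there presumably comes from the mail client). The field names `timestamp`, `daemon`, `host` and `level`, and the function names, are labels made up for this example.

```python
import re
from collections import Counter

# Matches the error text quoted in this thread. The group names
# (order, current, name) are our own labels, not SGE terminology.
STALE = re.compile(
    r'orders user/project version \((?P<order>\d+)\) is not '
    r'uptodate \((?P<current>\d+)\) for user/project "(?P<name>[^"]+)"'
)


def parse_line(line):
    """Split one log line into five pipe-separated fields, or return None."""
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    # Assumed layout: timestamp|daemon|host|level|text
    keys = ("timestamp", "daemon", "host", "level", "text")
    return dict(zip(keys, parts))


def count_stale_orders(lines):
    """Count 'not uptodate' error entries per user/project name."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is None or rec["level"] != "E":
            continue
        match = STALE.search(rec["text"])
        if match:
            counts[match.group("name")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '01/02/2006 12:03:58|qmaster|lmsp|E|orders user/project version '
        '(63119) is not uptodate (63120) for user/project "bricteux"',
    ]
    for name, n in count_stale_orders(sample).items():
        print(name, n)
```

To run it against a real file, pass an open file object to `count_stale_orders`. The location of the messages file depends on the installation, so no path is assumed here.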