[GE users] weird "orders queue version is not uptodate" messages in qmaster log

Stephan Grell - Sun Germany - SSG - Software Engineer stephan.grell at sun.com
Fri Mar 4 09:57:00 GMT 2005



Xavier MACHENAUD wrote:

>Stephan,
>
>Just one more question : Do you know how to recover from the blocked 
>state (when queued jobs are not started anymore)?
>
Restart the scheduler. Could you enable profiling in the scheduler
and post the output from a run in which the scheduler is blocked?

Thanks,

Stephan

>
>Xavier
>
>Terry Lalonde wrote:
>
>>I too am seeing this with u1.
>>
>>I do not plan to upgrade soon.
>>
>>>-----Original Message-----
>>>From: Stephan Grell - Sun Germany - SSG - Software Engineer
>>>[mailto:stephan.grell at sun.com]
>>>Sent: Thursday, March 03, 2005 5:03 AM
>>>To: users at gridengine.sunsource.net
>>>Subject: Re: [GE users] weird "orders queue version is not uptodate"
>>>messages in qmaster log
>>>
>>>
>>>
>>>Xavier MACHENAUD wrote:
>>>
>>>>FYI, I also got the messages in u1 and reached the point where no jobs
>>>>were scheduled anymore.
>>>>
>>>>I upgraded to u3 4 days ago. I still have the messages but, so far,
>>>>didn't reach the blocking state.
>>>>But I'm not very confident about not being blocked again :-(
>>>>
>>>Well, if you are brave, you could try the maintrunk. :-)
>>>This problem should be fixed there. Would be nice to have an external
>>>test for it.
>>>
>>>Stephan
>>>
>>>>Xavier
>>>>
>>>>Stephan Grell - Sun Germany - SSG - Software Engineer wrote:
>>>>
>>>>>Hi Terry,
>>>>>
>>>>>It's a way to ensure that the scheduler uses the latest data for
>>>>>its scheduling decisions. Every object has a version number. When
>>>>>the scheduler generates an order for an object (for example, a job
>>>>>start), it includes the object's version number in that order to tell
>>>>>the qmaster which state its decision was based on.
>>>>>
>>>>>Unfortunately, there is a bug in u3 which delays the delivery of
>>>>>update events. Therefore, the scheduler is not working on the most
>>>>>recent data, and the qmaster logs the error messages you have noticed.
>>>>>
>>>>>This can cause:
>>>>>- losing usage in the sharetree.
>>>>>- trying to start a job twice
>>>>>- ignoring job start orders
>>>>>
>>>>>The likelihood of a delay grows with the size of the share tree.
>>>>>
>>>>>Cheers,
>>>>>Stephan
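
The version check described in the message above can be illustrated with a minimal sketch. This is not Grid Engine code: the class and function names are hypothetical, and the only thing taken from the thread is the idea that an order carries the version number the scheduler saw, which the qmaster compares against its current one. The message text imitates the log lines quoted in this thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MasterObject:
    """Hypothetical stand-in for a qmaster-side object (queue, user, project)."""
    kind: str
    name: str
    version: int


@dataclass
class Order:
    """Hypothetical scheduler order; records the version the decision was based on."""
    based_on_version: int


def stale_order_message(obj: MasterObject, order: Order) -> Optional[str]:
    """Return None if the order is based on current data, else a log-style message."""
    if order.based_on_version == obj.version:
        return None
    # Wording modelled on the qmaster log lines quoted in this thread.
    return (f'orders {obj.kind} version ({order.based_on_version}) is not '
            f'uptodate ({obj.version}) for {obj.kind} "{obj.name}"')


# The scheduler decided on version 1668, but the qmaster is already at 1669.
queue = MasterObject(kind="queue", name="run at crxu76", version=1669)
print(stale_order_message(queue, Order(based_on_version=1668)))
print(stale_order_message(queue, Order(based_on_version=1669)))
```

If update events reach the scheduler late, its orders keep carrying an old version number, which would explain why the same message repeats.
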
>>>>>
>>>>>Terry Lalonde wrote:
>>>>>
>>>>>>Interesting:  I just noticed the same thing today???
>>>>>>
>>>>>>02/28/2005 15:49:17|qmaster|wstlalonde|E|orders user/project version (88808) is not uptodate (88809) for user/project "ddodd"
>>>>>>02/28/2005 15:49:32|qmaster|wstlalonde|E|orders user/project version (88808) is not uptodate (88809) for user/project "ddodd"
>>>>>>
>>>>>>repeated over and over.
>>>>>>
>>>>>>It's some accounting check I think.
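
To see how often these messages occur and which objects they affect, the log lines can be parsed with a small script. The following is a rough sketch that assumes the messages have exactly the pipe-separated layout of the lines quoted in this thread; the names `StaleOrder`, `parse_line` and `count_stale_orders` are made up for the example.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class StaleOrder(NamedTuple):
    timestamp: str
    host: str
    kind: str             # e.g. "queue" or "user/project"
    order_version: int    # version the scheduler's order was based on
    current_version: int  # version the qmaster currently holds
    name: str


# Layout assumed from the quoted examples: date time|qmaster|host|E|message
PATTERN = re.compile(
    r'^(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\|qmaster\|(?P<host>[^|]+)\|E\|'
    r'orders (?P<kind>.+?) version \((?P<order>\d+)\) is not uptodate '
    r'\((?P<current>\d+)\) for (?P=kind) "(?P<name>[^"]*)"\s*$'
)


def parse_line(line: str) -> Optional[StaleOrder]:
    """Return a StaleOrder for a matching log line, or None for anything else."""
    m = PATTERN.match(line.strip())
    if m is None:
        return None
    return StaleOrder(m["ts"], m["host"], m["kind"],
                      int(m["order"]), int(m["current"]), m["name"])


def count_stale_orders(lines: Iterable[str]) -> Counter:
    """Count matching messages per (kind, name) pair."""
    hits = (parse_line(line) for line in lines)
    return Counter((h.kind, h.name) for h in hits if h is not None)


sample = [
    '02/28/2005 15:49:17|qmaster|wstlalonde|E|orders user/project version '
    '(88808) is not uptodate (88809) for user/project "ddodd"',
    '02/28/2005 12:30:45|qmaster|crxu71|E|orders queue version (1668) is '
    'not uptodate (1669) for queue "run at crxu76"',
]
print(count_stale_orders(sample))
```

A steadily growing count for the same object would be worth including when reporting the problem.
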
>>>>>>
>>>>>>>-----Original Message-----
>>>>>>>From: Xavier MACHENAUD [mailto:xavier.machenaud at st.com]
>>>>>>>Sent: Monday, February 28, 2005 6:34 AM
>>>>>>>To: users at gridengine.sunsource.net
>>>>>>>Subject: [GE users] weird "orders queue version is not uptodate"
>>>>>>>messages in qmaster log
>>>>>>>
>>>>>>>Hi,
>>>>>>>
>>>>>>>I'm seeing these kinds of messages in my qmaster log:
>>>>>>>
>>>>>>>02/28/2005 12:30:45|qmaster|crxu71|E|orders queue version (1668) is not uptodate (1669) for queue "run at crxu76"
>>>>>>>02/28/2005 12:30:51|qmaster|crxu71|E|orders queue version (1670) is not uptodate (1671) for queue "run at crxu76"
>>>>>>>
>>>>>>>Do you know what's the problem?
>>>>>>>
>>>>>>>Thanks,
>>>>>>>
>>>>>>>Xavier
>>>>>>>
>>>>>>>---------------------------------------------------------------------
>>>>>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>>>>>For additional commands, e-mail: users-help at gridengine.sunsource.net


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net



