[GE users] V6 scheduler woes

Stephan Grell - Sun Germany - SSG - Software Engineer stephan.grell at sun.com
Thu May 12 09:17:05 BST 2005


Hello,

I am still having trouble understanding the issue. Could you please post
your scheduler configuration and run the scheduler with profiling on? Both
the scheduler configuration and the profiling output would be very useful
for understanding what is going on.

You can enable profiling on the fly without restarting anything. Its 
output will be written
to the scheduler message file.
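
As a rough sketch of the steps above: it assumes the default spool layout
for the scheduler messages file, a placeholder fallback of /usr/sge for
SGE_ROOT, and that profiling is switched on with profile=1 in the params
line of the scheduler configuration. Please check sched_conf(5) for your
release before relying on any of these.

```shell
# Sketch only; run as an SGE manager on the qmaster host.
# Assumed default spool layout; /usr/sge is just a placeholder fallback.
sched_messages="${SGE_ROOT:-/usr/sge}/${SGE_CELL:-default}/spool/qmaster/schedd/messages"

if command -v qconf >/dev/null 2>&1; then
    # Print the current scheduler configuration (the part to post).
    qconf -ssconf

    # To switch profiling on, edit the configuration with
    #     qconf -msconf
    # and set the params line to:  params  profile=1
    # (assumed spelling; check sched_conf(5) for your release).

    # Show the most recent scheduler messages, where the profiling
    # lines should appear after the next scheduling run.
    tail -n 50 "$sched_messages"
else
    echo "qconf not found; source the SGE settings file first" >&2
fi
```

Setting params back to none in the same way should switch the profiling
output off again.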

Thanks,
Stephan

McCalla, Mac wrote:

> There are about 23000 jobs pending.  About 650 jobs have finished in
> the last 30 minutes.
> The idea that setting the flush_* variables to zero causes the scheduler
> to run continuously originally came from the v5.3 man page sge_conf.
> Perhaps this has impeded my understanding of the settings in v6....8~)
>
> I will open an issue if I can duplicate the problem.
>
> Thanks
>
>   
> 
>
>-----Original Message-----
>From: Reuti [mailto:reuti at staff.uni-marburg.de] 
>Sent: Wednesday, May 11, 2005 2:57 PM
>To: users at gridengine.sunsource.net
>Subject: RE: [GE users] V6 scheduler woes
>
>Hi,
>
>Quoting "McCalla, Mac" <macmccalla at hess.com>:
>
>  
>
>>Thanks for replying Reuti,
>>
>>I believe the setting of the 2 flush_* variables to zero causes the
>>scheduler to run continuously.  This was the intent since it is running
>
>This shouldn't be. It should behave as the documentation says: run x seconds
>after the events (submit/end of job), or not be triggered by these events at
>all if the setting is 0. Are there many jobs being submitted and finishing?
>
>  
>
>>on a dedicated server.      This setting however,
>>in turn apparently causes the output from qconf -tsm to not terminate, a
>>situation that might have
>>dire consequences if it fills up the file system.  BTW, cycling the
>>qmaster/schedd did have the desired
>>effect of stopping the output from qconf -tsm.    If this is expected
>>behavior from qconf -tsm, then fine,
>
>No, it should trace one run of the scheduler. If you see it running
>forever, I would say it's an issue. - Reuti
>
>  
>
>>this is just a cautionary note.  If not, I will be glad to open it as a
>>problem.
>>
>>Regards,
>>
>>Mac   
>>
>>-----Original Message-----
>>From: Reuti [mailto:reuti at staff.uni-marburg.de] 
>>Sent: Wednesday, May 11, 2005 1:11 PM
>>To: users at gridengine.sunsource.net
>>Subject: Re: [GE users] V6 scheduler woes
>>
>>Hi,
>>
>>I don't get the point you want to achieve. The details for
>>"flush_submit_sec" and "flush_finish_sec" are explained in
>>"man sched_conf": they trigger a scheduler run after these events.
>>
>>CU- Reuti
>>
>>
>>Quoting "McCalla, Mac" <macmccalla at hess.com>:
>>
>>    
>>
>>>Hello all,
>>>
>>>I am running v6.0u4beta, downloaded and built Apr 26, on RHEL 3
>>>update 4 Linux.  In trying to diagnose why some user jobs were not
>>>being scheduled I issued the qconf -tsm command.  I also had
>>>flush_submit_sec and flush_finish_sec set to 0.  The result appears to
>>>be that the schedd_runlog file has been written to continuously for the
>>>last 1.5 hours.  I have set flush_submit_sec and flush_finish_sec to 30
>>>now but without apparent effect.  Anyone know how to turn this off short
>>>of bouncing the qmaster (assuming that will do it)?  Thanks.
>>> 
>>>Mac McCalla 
>>>Geoscience Systems Consultant
>>>Amerada Hess Corporation
>>>500 Dallas St. , Houston, Texas  77002
>>>
>>>
>>>      
>>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>For additional commands, e-mail: users-help at gridengine.sunsource.net
>>
>>
>>
>>
>>    
>>
>
>
>
>
>  
>





