[GE users] Scheduler dies like a hell

Viktor Oudovenko udo at physics.rutgers.edu
Fri May 20 04:34:50 BST 2005


Thank you, Ron, for your prompt reply!

Thanks for the advice, especially the last point!
I am running a Beowulf cluster with more than 300 nodes. Unfortunately, there
have been only one or two occasions (power failures) when the queue was free
for a few minutes! :)  It is busy almost 24x7, all year round!
I'll try the jobs directory. It could be that the solution is there.
Regards,
v

> -----Original Message-----
> From: Ron Chen [mailto:ron_chen_123 at yahoo.com] 
> Sent: Thursday, May 19, 2005 23:21
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] Scheduler dies like a hell
> 
> 
> --- Viktor Oudovenko <udo at physics.rutgers.edu> wrote:
> > I just switched off reporting, but it did not help.
> > Should I look into the accounting file?
> 
> The accounting file is a simple, pure ASCII file. The
> qmaster appends finished job info to it. You should be
> able to see the file content using vi. However, I
> don't think the scheduler reads the accounting file at
> all!
> 
>  
> > But will the jobs be lost? Or can I move them back
> > after the sgemaster restart so that the jobs reappear?
> 
> Yes, I think you can move them back. BTW, back up the
> directory first.
>  
> > Could you give me the command, please? I usually use
> > qmon to manage SGE.
> 
> Use qconf.
> 
> There are lots of things to dump, "qconf |grep show"
> will give you a list:
> 
> % qconf |grep show
>    [-sc ]                     show complex attributes
>    [-scal calendar_name]      show given calendar
> ...
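
A minimal sketch of how the "show" options could be used to dump the
configuration to text files. The function name dump_sge_config, the QCONF
variable, and the output file layout are assumptions made for this example;
the option list is only a subset, and the exact set available depends on the
installed Grid Engine version.

```shell
#!/bin/sh
# Sketch only: save the output of several qconf "show" options as text files.
# QCONF can be overridden to point at a specific qconf binary (assumption).
QCONF="${QCONF:-qconf}"

dump_sge_config() {
    outdir="$1"
    mkdir -p "$outdir" || return 1

    # Each of these options prints one object or one list and takes no argument.
    for opt in sc sconf ssconf sql sel sm so sh ss; do
        "$QCONF" "-$opt" > "$outdir/$opt.txt" \
            || echo "warning: qconf -$opt failed" >&2
    done

    # -sql only lists queue names; -sq shows the definition of one queue.
    while read -r q; do
        if [ -n "$q" ]; then
            "$QCONF" -sq "$q" > "$outdir/queue_$q.txt"
        fi
    done < "$outdir/sql.txt"
    return 0
}
```

Usage, with a hypothetical target directory: dump_sge_config /tmp/sge-config-dump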
> 
> > A few days ago I made a copy of everything, so I can try
> > to see whether the same problem already existed then.
> > One more question: can one do a backup with
> > classic spooling? I saw a discussion somewhere saying
> > that the backup command did not work. Am I wrong?
> 
> Just back up the whole directory when no jobs are
> running.
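
A rough sketch of that advice: refuse to archive while qstat still reports
running jobs, then tar the whole directory. The function name backup_sge_dir,
the QSTAT variable, and the paths in the usage line are assumptions for this
example, not something stated in the thread; the directory to archive depends
on how the cell was installed.

```shell
#!/bin/sh
# Sketch only: archive a Grid Engine directory if no jobs are running.
# QSTAT can be overridden to point at a specific qstat binary (assumption).
QSTAT="${QSTAT:-qstat}"

backup_sge_dir() {
    src="$1"     # directory to back up
    dest="$2"    # archive file to create

    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }

    # "-s r" restricts the listing to running jobs; "-u '*'" asks for all users.
    running=$("$QSTAT" -s r -u '*') || { echo "qstat failed" >&2; return 1; }
    if [ -n "$running" ]; then
        echo "jobs are still running; not backing up" >&2
        return 2
    fi

    # Store paths relative to the parent so the archive unpacks as one directory.
    tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")"
}
```

Usage, with hypothetical paths: backup_sge_dir "$SGE_ROOT/default" /backup/sge-default.tar.gz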
> 
>  -Ron
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> 





