[GE users] GridEngine fails to start/run

Brady Catherman bradyc at uidaho.edu
Mon Mar 13 19:13:15 GMT 2006


What is odd, though, is that there are 1226 jobs queued on Phlegathon
(our 32 node Mac cluster) and it is having all sorts of problems:
qstat fails, there are errors on startup, and so on. At the same time
there are 10K jobs on our Linux cluster and it is running like a champ =)

This is classic spooling with 6.0 u7.


# qstat
failed receiving gdi request

#
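
One way to tell a slow startup from a real failure, as suggested in the
quoted reply below, is to keep retrying qstat for a few minutes. A minimal
sketch, assuming qstat exits non-zero while qmaster is not answering;
wait_for_command is a made-up helper name, not an SGE tool:

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds or the attempts run out.
# wait_for_command is a hypothetical helper, not part of Grid Engine.
wait_for_command() {
    attempts=$1   # maximum number of tries
    delay=$2      # seconds to sleep between tries
    shift 2       # remaining arguments are the command to run
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Discard output; only the exit status matters here.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt is still to come.
        if [ "$i" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (commented out): poll qstat every 30 seconds, 20 tries at most,
# which is a little under 10 minutes of waiting.
# wait_for_command 20 30 qstat && echo "qmaster is answering"
```

If the helper gives up, the daemons have probably really failed and it is
worth checking with ps whether sge_qmaster and sge_schedd are still running.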


On Mar 13, 2006, at 10:56 AM, McCalla, Mac wrote:

>  Hi Brady,
>
> 	Are you sure the processes of sge_qmaster and sge_schedd have
> actually failed?
> If our system is loaded up (lots of jobs), I see these messages when
> (re)starting qmaster/schedd,
> but eventually (may take several minutes), the scheduler will register
> with qmaster
> and things are fine.
>
> BTW, is this a classic spooling or BDB install? And what version of
> SGE?
>
> Mac McCalla
> Geoscience Systems Consultant
> Amerada Hess Corporation
> 500 Dallas St. , Houston, Texas  77002
> Office: 713 609-5434
>
>
> -----Original Message-----
> From: Brady Catherman [mailto:bradyc at uidaho.edu]
> Sent: Monday, March 13, 2006 12:41 PM
> To: users at gridengine.sunsource.net
> Subject: [GE users] GridEngine fails to start/run
>
> On our Mac OS 10.4 system Grid Engine just started failing to
> start up. This same exact build was working fine up until today.
> Everything has started getting gdi failures. I have no clue what
> would have caused Grid Engine to just start failing all of a sudden.
> There are no errors in the qmaster/messages file, so I have no clue
> where to start troubleshooting.
>
> # /opt/sge/default/common/sgemaster start
>     starting sge_qmaster
>     starting sge_schedd
> daemonize error: timeout while waiting for daemonize state
> error: getting configuration: failed receiving gdi request
>
> # qstat
> failed receiving gdi request
>
>
> This is Grid Engine 6.0u7 on Mac OS 10.4.5
>
> Anybody have any ideas where to start with this one?
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>
