[GE users] Hadoop Error

reuti reuti at staff.uni-marburg.de
Tue Nov 2 18:38:34 GMT 2010


Hi,

On 02.11.2010 at 12:02, adarsh wrote:

> Dear all,
> I have SGE running on 4 nodes and have followed the Hadoop Integration tutorial.
> But whenever I run a Hadoop job, it stays in the pending state.
> After reading the logs, I found that my all.q is in the error state.
> 
> [root@ws-test lx24-amd64]# ./qstat -f
> queuename                      qtype resv/used/tot. load_avg arch          states
> ---------------------------------------------------------------------------------
> all.q@ws33-shiv-lin            BIP   0/0/2          0.00     lx24-amd64    E
> ---------------------------------------------------------------------------------
> all.q@ws34-rak-lin             BIP   0/0/4          0.83     lx24-amd64    E
> ---------------------------------------------------------------------------------
> all.q@ws37-user-lin            BIP   0/0/4          5.80     lx24-amd64    E

First, you have to clear the error state of the queues:

$ qmod -cq "*"

Then investigate how this happened. Is there anything in the qmaster's messages file, /usr/sge/default/spool/qmaster/messages?
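
A minimal sketch of that sequence, assuming the SGE binaries are on your PATH and the cell is named "default" as in the path above (adjust it to your own SGE root). The helper name list_error_queues is made up for illustration; it only filters `qstat -f` output for queue instances whose states column contains E. `qstat -f -explain E` should additionally print the reason recorded for each error state, which is usually the quickest pointer to the cause.

```shell
#!/bin/sh
# list_error_queues (hypothetical helper, not part of SGE):
# reads `qstat -f` output on stdin and prints every queue instance
# (first column contains "@") whose sixth column, states, contains E.
list_error_queues() {
    awk '$1 ~ /@/ && $6 ~ /E/ { print $1 }'
}

# Suggested order of steps, run by hand as an SGE manager or operator:
#   qstat -f | list_error_queues   # which queue instances are in error state
#   qstat -f -explain E            # show the recorded reason for each of them
#   qmod -cq "*"                   # clear the error state on all queues
#   tail -n 50 /usr/sge/default/spool/qmaster/messages   # path is an assumption
```

If the queues drop back into the E state after being cleared, the underlying cause is still present, and the messages file of the execd on the affected node is worth checking as well.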

-- Reuti


> ############################################################################
>  - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
> ############################################################################
>     4 0.60500 STDIN      root         qw    11/02/2010 13:04:56     3
>     8 0.55500 STDIN      root         qw    11/02/2010 14:41:55     2
>
> Another command gives this result:
> 
> hadoop@ws37-user-lin:~$ /opt/sge-root/bin/lx24-amd64/qalter -w v 7
> Job 7 queue instance "all.q@ws33-shiv-lin" dropped because it is temporarily not available
> Job 7 queue instance "all.q@ws34-rak-lin" dropped because it is temporarily not available
> Job 7 queue instance "all.q@ws37-user-lin" dropped because it is temporarily not available
> Job 7 cannot run in PE "hadoop" because it only offers 0 slots
> verification: no suitable queues
> 
> Please help me resolve this problem.
> 
> Thanks in advance
> Adarsh Sharma
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=292077
> 
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=292233
