[GE users] Hadoop-Sge Error

adarsh adarsh.sharma at orkash.com
Mon Nov 29 05:40:08 GMT 2010


Dear all,

I am using Hadoop-0.20.2 on 4 nodes (1 master/qmaster and 3 slaves/sgeexecd hosts).
QMON shows all nodes with their loads.

After running ./sgeexecd, ps shows the execution daemon running:
root@ws30-pank-lin:~# ps aux | grep sge
sgeadmin  3673  0.2  0.0  49264  2112 ?        Sl   10:41   0:00 /opt/sge-root/bin/lx24-amd64/sge_execd
root      3688  0.0  0.0   7620   888 pts/0    S+   10:41   0:00 grep --color=auto sge

but the messages log on every execution host shows:

11/29/2010 10:08:53|  main|ws36-test-lin|I|starting up GE 6.2u5 (lx24-amd64)
11/29/2010 10:20:54|  main|ws36-test-lin|W|load sensor exited with exit status = 127
11/29/2010 10:21:33|  main|ws36-test-lin|W|[load_sensor 4724] fflush failed [Broken pipe]
11/29/2010 10:21:34|  main|ws36-test-lin|W|load sensor exited with exit status = 127
11/29/2010 10:24:13|  main|ws36-test-lin|W|[load_sensor 4770] fflush failed [Broken pipe]
11/29/2010 10:27:33|  main|ws36-test-lin|W|[load_sensor 4830] fflush failed [Broken pipe]
11/29/2010 10:27:34|  main|ws36-test-lin|W|load sensor exited with exit status = 127
11/29/2010 10:28:13|  main|ws36-test-lin|W|[load_sensor 4845] fflush failed [Broken pipe]
11/29/2010 10:28:14|  main|ws36-test-lin|W|load sensor exited with exit status = 127
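
For reference, exit status 127 is what a POSIX shell returns when it cannot find the command it was asked to run, so these warnings may mean that the configured load sensor script, or something it calls, is not found on this host. That is only a guess from the status code. A minimal sketch that reproduces the status, using a made-up command name:

```shell
# "no_such_load_sensor_command" is a hypothetical name that does not exist;
# the shell cannot find it and reports status 127.
sh -c 'no_such_load_sensor_command' 2>/dev/null
status=$?
echo "exit status = $status"
```

If the sensor dies like this, the later "Broken pipe" warnings would be consistent with the execd writing to a process that has already exited.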

I followed the Hadoop troubleshooting guide for SGE,
but my qhost -F | grep hdfs command shows nothing:

[root@ws37-mah-lin lx24-amd64]# ./qhost -F | grep hdfs
[root@ws37-mah-lin lx24-amd64]#
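
As I understand it, qhost -F can only list values that a load sensor has actually reported, so empty output here fits with the sensor failing. Below is a rough sketch of the general load sensor protocol, not the script shipped with the Hadoop integration: the sensor waits on stdin, answers each request with a begin/end block of host:complex:value lines, and stops on "quit". The function name load_sensor_loop and the complex name hdfs_demo are made up for illustration.

```shell
# Hypothetical load sensor sketch. "hdfs_demo" is an illustrative complex
# name, not one defined by the Hadoop integration.
load_sensor_loop() {
    host=$(uname -n)
    while read -r line; do
        # execd sends "quit" when the sensor should stop
        if [ "$line" = "quit" ]; then
            return 0
        fi
        # Any other input (normally an empty line) requests one load report
        echo "begin"
        echo "$host:hdfs_demo:1"
        echo "end"
    done
}

# Simulate execd: one report request, then quit
printf '\nquit\n' | load_sensor_loop
```

Running a sensor script by hand like this, as the same user that runs sge_execd, is one way to check that it starts at all and prints a well-formed report.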

I think my SGE is not configured properly, but the qhost command works and simple.sh runs to completion.
[root@ws37-mah-lin lx24-amd64]# ./qhost
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
ws34-rak-lin            lx24-amd64      4  0.18    5.7G    1.3G    4.0G     0.0
ws36-test-lin           lx24-amd64      4  0.08    7.7G  736.8M   15.8G     0.0
ws37-user-lin           lx24-amd64      4  0.04    7.7G  407.9M   15.8G     0.0

Thanks & Regards
Adarsh Sharma

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=300123
