[GE users] grid engine problem
Harald.Pollinger at Sun.COM
Tue Nov 13 10:50:04 GMT 2007
Sandeep, Patel(IE10) wrote:
> Actually, I have one Windows system with VMware installed on it. In
> VMware I am running two RHEL virtual machines. One of the virtual RHEL
> machines is my master host, the other is an execution host. The Windows
> host OS is also an execution host. Is that the problem?
This should not be a problem. Maybe you will have to change some settings.
> With PuTTY I am able to connect from the Windows execution host to the
> RHEL master host through SSH. But telnet shows a network error.
> How can I fix this?
I think you got me wrong. Try to connect to the qmaster itself, not to
the telnetd of the qmaster host. Use
# telnet gridserver.sunnonegrid-bangalore.com 536
It should print
Trying [IP address of gridserver.sunnonegrid-bangalore.com]...
Connected to gridserver.sunnonegrid-bangalore.com.
Escape character is '^]'.
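
If no telnet client is available on the execution host, the same reachability check can be sketched in a few lines of Python. This is only a minimal sketch: the function name can_connect is made up for illustration, and the host name and port in the usage comment are simply the ones mentioned in this thread. It only tests whether a TCP connection can be opened and says nothing about whether the process listening there is really the qmaster.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; it mirrors what
    "telnet <host> <port>" tests, nothing more.
    """
    try:
        # create_connection resolves the host name and connects;
        # the "with" block closes the socket again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Example call (host and port as used in this thread; adjust to your setup):
# can_connect("gridserver.sunnonegrid-bangalore.com", 536)
```

If this returns False from the execution host but True when run on the master host itself, a firewall or a name resolution problem between the two hosts is a likely suspect.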
> -----Original Message-----
> From: Harald.Pollinger at Sun.COM [mailto:Harald.Pollinger at Sun.COM]
> Sent: Tuesday, November 13, 2007 2:50 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] grid engine problem
> Sandeep, Patel(IE10) wrote:
>> I checked messages and I got something like this
>> 11/13/2007 12:38:16|execd|ie10dtdc3zl1s|E|commlib error: endpoint is
>> unique error (endpoint "ie10dtdc3zl1s.global.ds.honeywell.com/execd/1"
>> is already connected)
> Is there more than one "sge_execd" instance running on that host?
> If yes, please kill them all and start only one again.
>> 11/13/2007 12:38:16|execd|ie10dtdc3zl1s|E|getting configuration:
>> to contact qmaster using port 536 on host
> Is there a firewall running somewhere on or between the execution host
> and the master host?
> Is it possible to connect from the execution host to the qmaster using
>> 11/13/2007 12:38:19|execd|ie10dtdc3zl1s|E|can't get configuration from
>> qmaster -- backgrounding
>> How can I solve this problem?
>> -----Original Message-----
>> From: Ravichandra.Nallan at Sun.COM [mailto:Ravichandra.Nallan at Sun.COM]
>> Sent: Tuesday, November 13, 2007 12:25 PM
>> To: users at gridengine.sunsource.net
>> Subject: Re: [GE users] grid engine problem
>> Hi Sandeep,
>> From the qstat output it is evident (state "au") that the execd on host
>> ie10dtdc3z11s.<something....> is not up. Check if there are any
>> reasons for the execd not coming up (check
>> $SGE_ROOT/$SGE_CELL/spool/<hostname>/messages).
>> This is the reason why the jobs are not scheduled to this host.
>> (For info on queue states check the qstat(1) man page; you can also see
>> in the qstat output that the load_avg/arch is -NA-.)
>> Hope this helps.
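
The entries in the messages file quoted in this thread are "|"-separated: timestamp, daemon, host, a one-letter level, and the message text. A rough sketch for pulling out the error entries could look like the following. The names Entry, parse_line and errors are made up for illustration, and treating level "E" as "error" is an assumption based on the lines quoted here. Lines that were wrapped by the mail client have to be joined again before parsing.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    timestamp: str
    daemon: str
    host: str
    level: str
    text: str


def parse_line(line: str) -> Optional[Entry]:
    """Split one messages line into its five fields, or return None."""
    # maxsplit=4 keeps any further "|" characters inside the message text.
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    return Entry(*parts)


def errors(lines) -> List[Entry]:
    """Return all entries whose level field is "E" (assumed to mean error)."""
    found = []
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry.level == "E":
            found.append(entry)
    return found


# Two entries quoted in this thread, with the mail line wraps joined again.
SAMPLE = [
    '11/13/2007 12:38:16|execd|ie10dtdc3zl1s|E|commlib error: endpoint is '
    'unique error (endpoint "ie10dtdc3zl1s.global.ds.honeywell.com/execd/1" '
    'is already connected)',
    "11/13/2007 12:38:19|execd|ie10dtdc3zl1s|E|can't get configuration from "
    "qmaster -- backgrounding",
]

for entry in errors(SAMPLE):
    print(entry.timestamp, entry.text)
```

To run it against a real file, pass the open file object to errors() instead of SAMPLE.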
>> Sandeep, Patel(IE10) wrote:
>>> 1. I have my *master* host on RHEL.
>>> 2. I have two *execution* hosts:
>>> A. one is on *Windows*
>>> B. the other one is on *RHEL*
>>> 3. When I submit the job *simple.sh* (4 times) and then type the
>>> command *qstat -f*, the jobs always go to the RHEL execution host
>>> for execution, because "used by/total" *is 2/2* for RHEL but *0/2*
>>> for *Windows*. The jobs are *pending* for some time and are *later
>>> taken by* the RHEL execution host.
>>> 4. It means the jobs are not distributed among the hosts *!!!!*
>>> 5. How can I solve this?
>>> 6. In this connection I have *attached* some *screenshots*. Can you
>>> please check them?
Sun Microsystems GmbH Harald Pollinger
Dr.-Leo-Ritter-Str. 7 N1 Grid Engine Engineering
D-93049 Regensburg Phone: +49 (0)941 3075-209 (x60209)
Germany Fax: +49 (0)941 3075-222 (x60222)
mailto:harald.pollinger at sun.com