[GE users] qsub - connection refused

John Alberts alberts at calumet.purdue.edu
Fri May 27 22:00:05 BST 2005


 Reuti,
Thanks for the reply.  I was able to use both rsh gcmaster date and rsh
GcMaster date.
Just to be safe, I changed my hostname to all lowercase, and did the same
in the hosts file.
Still no luck with the qsub ...simple.sh file.
Gcmaster is an administration and submit host
Gcnode1 is an exec host.

When you said: 'Can you try your interactive commands with GCMaster?', did
you mean rsh gcmaster date?
Sorry, I am completely new to all of this and I am trying to learn.

As for the serial and parallel jobs ... Here is my goal.

I am creating a cluster with 1 master host for people to login to and submit
jobs, and I will have many slave nodes to help crunch numbers.  This is for
running Fluent CFD jobs and it uses MPI.

I think I'm on the right track, but I'm stuck now with not being able to
submit a job. :(

John


-----Original Message-----
From: Reuti [mailto:reuti at staff.uni-marburg.de] 
Sent: Friday, May 27, 2005 1:12 PM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] qsub - connection refused

Hi John,

In the test you made, I see the hostname in all lowercase, but the error
message used a different case. Can you try your interactive commands with
GCMaster?
I'm not sure how ssh will handle different spellings in the known_hosts file,
even though they have the same TCP/IP address.
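
To see whether two spellings differ only in letter case, a small shell
helper can lowercase both before comparing them. This is only a sketch:
same_host is a made-up name, and the two host names are the ones from this
thread. It does not tell you how ssh or Grid Engine treat the names, only
whether case is the sole difference.

```shell
# Hypothetical helper: succeed if two host names are equal once letter
# case is ignored.
same_host() {
    a=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    b=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    [ "$a" = "$b" ]
}

# Example: the name from the error message versus the name used with rsh.
if same_host "GCMaster" "gcmaster"; then
    echo "names match when case is ignored"
fi
```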

As long as you are using only serial jobs, there is no need for ssh or rsh
between the two machines at all. Also running parallel jobs can use the
built-in qrsh (if they are tightly integrated), and still there is no need
for rsh/ssh.

Are both machines defined as administration hosts, and your one node also as
exec host and submit host?
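
One way to check this is to list the hosts with qconf and look for each
name. The sketch below assumes qconf is on the PATH and the Grid Engine
environment is set up; host_listed is a made-up helper, and the listed
names may appear fully qualified on your system, in which case the short
name will not match.

```shell
# Hypothetical helper: read host names on stdin, one per line, and succeed
# if the name given as $1 is among them (letter case ignored).
host_listed() {
    want=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    tr '[:upper:]' '[:lower:]' | grep -qxF -- "$want"
}

# Usage, run on the master (not executed here):
#   qconf -sh  | host_listed gcmaster && echo "gcmaster is an admin host"
#   qconf -ss  | host_listed gcmaster && echo "gcmaster is a submit host"
#   qconf -sel | host_listed gcnode1  && echo "gcnode1 is an exec host"
```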

Cheers - Reuti


Quoting John Alberts <alberts at calumet.purdue.edu>:

> I have looked through the mailing list archives and the docs, but I 
> can't find any mention of this error.  I just installed gridengine 6 
> on a master and 1 node following the install docs on the website.  
> When I got to the part about verifying installation by submitting a test
> job, I got an error.
> The command I used is:
>     qsub /opt/gridengine/examples/jobs/simple.sh
>  
> When I type that, I get the following message:
>     Connection refused
>     qsub: cannot connect to server GCMaster (errno=111)
>  
>  
> gcmaster is my master host and gcnode1 is the one execution node I have.  
> I am using ssh instead of rsh.  I have a symlink for rsh->ssh, and 
> both machines can communicate fine without passwords.
> The command:
>     rsh gcnode1 date
>     rsh gcmaster date
> both work fine.
>  
> Going through previous messages on the mailing list, I also tried the 
> qping command for both my master and execution node.  They both returned 
> results without errors.
> I have no idea what to try next.
>  
>  
> John Alberts
> Technical Assistant for EMS
> Purdue University Calumet
> 219-989-2083
> http://public.xdi.org/=john.alberts
>  
> 



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net






