[GE users] SGE jobs in "qw" state

Mark_Johnson at URSCorp.com Mark_Johnson at URSCorp.com
Mon May 22 21:52:03 BST 2006


Kickstarted 16:21 27-Mar-2006
[urs1@medusa ~]$ qrsh hostname
error: error waiting on socket for client to connect: Interrupted system call
error: unable to contact qmaster using port 536 on host "medusa.ursdcmetro.com"
[urs1@medusa ~]$

Mark A. Johnson
URS Network Administrator
Gaithersburg, MD
Ph:  301-721-2231


                                                                              







                                                                           
From:     Chris Dagdigian <dag at sonsorol.org>
Date:     05/22/2006 04:32 PM
To:       users at gridengine.sunsource.net
Subject:  Re: [GE users] SGE jobs in "qw" state
Reply-To: users at gridengine.sunsource.net





Stranger and stranger.

Let's see if you can run the simplest type of job. Type "qrsh
hostname" (you want to run the 'hostname' command, so don't substitute
your machine name), then "qrsh date" and other simple things like
"qrsh uptime".

The qrsh command is a way to run a quick command on the least loaded
available node in the system. Running "qrsh hostname" a few times is
a good way to test the basic SGE setup because, for it to work, many
different cascading things all have to be working.

> workgroupcluster:~ root# qrsh hostname
> node002.cluster.private
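
As a rough sketch of the "run it a few times" idea, a small shell
function could repeat the test and count the failures. The function
name smoke_test and the default of 5 runs are made up for illustration;
only "qrsh hostname" itself comes from the advice above.

```shell
# Hypothetical helper: run "qrsh hostname" several times and count failures.
# Usage: smoke_test [number_of_runs]   (defaults to 5, an arbitrary choice)
smoke_test() {
    runs=${1:-5}
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # Capture stdout and stderr so a failed run shows the SGE error text.
        if out=$(qrsh hostname 2>&1); then
            echo "run $i: ok ($out)"
        else
            echo "run $i: FAILED ($out)"
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures of $runs runs failed"
    # Return success only if every run succeeded.
    [ "$failures" -eq 0 ]
}
```

Calling "smoke_test 3" on a healthy cluster should print one hostname
per run, possibly a different node each time. If every run fails with
the same error, the problem is more likely in the basic setup than on
any one node.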

A variation of that command is to run the 'hostname' or any other
simple unix command on a chosen remote node that you know should be
available.

In your setup this may be something like:

   qrsh -q all.q@compute-0-97.local "/bin/hostname"

That command should run "/bin/hostname" on that one node and report
back a hostname of "compute-0-97.local".
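
A minimal sketch of the same check as a shell function, assuming the
queue instance is written as queue@host. The name check_node is
hypothetical, and the comparison assumes the node reports exactly the
name you asked for; a node that returns a short name instead of the
fully qualified one would show up as a mismatch here.

```shell
# Hypothetical helper: run /bin/hostname on one chosen node through qrsh
# and compare the reported name with the name that was requested.
# Usage: check_node <queue> <host>
check_node() {
    queue=$1
    host=$2
    # Capture stderr too, so an SGE error message is shown on failure.
    reported=$(qrsh -q "$queue@$host" /bin/hostname 2>&1) || {
        echo "qrsh failed on $queue@$host: $reported"
        return 1
    }
    if [ "$reported" = "$host" ]; then
        echo "ok: $host"
    else
        echo "mismatch: asked for $host, got $reported"
        return 1
    fi
}
```

For the example above this would be called as
"check_node all.q compute-0-97.local".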





---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
For additional commands, e-mail: users-help at gridengine.sunsource.net






