[GE users] RE: Unable to contact qmaster

Dan Gruhn Dan.Gruhn at Group-W-Inc.com
Mon Apr 4 14:13:55 BST 2005



Chakravarthi,

Using the queue name you've given, you are submitting your job to a
specific queue instance.  That is, you are submitting your job to queue
"all.q" but only on host "master_node".  If you want the job to be able
to go to any host in "all.q", specify all.q rather than
"all.q@master_node".
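
As a rough sketch of the difference: the helper below only strips the
host part from a queue instance name, it does not contact qmaster. The
function name cluster_queue_of and the job script myjob.sh are made up
for illustration.

```shell
# Strip the "@host" part from a queue instance name to get the cluster
# queue name. Pure string handling; nothing here talks to SGE.
cluster_queue_of() {
    printf '%s\n' "${1%%@*}"
}

# Usage (myjob.sh is a hypothetical job script):
#   qsub -q all.q@master_node myjob.sh                        # only the instance on master_node
#   qsub -q "$(cluster_queue_of all.q@master_node)" myjob.sh  # any host in all.q
```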

To help us diagnose this, please:

1) Email the output of the command "qconf -sq all.q"
2) Tell us which version of SGE you are running


Here is a write-up I've done on this:


Queues are containers for different categories of jobs. Queues provide
the corresponding resources for concurrent execution of multiple jobs
that belong to the same category.

In SGE, a queue can be associated with one host or with multiple hosts.
Because queues can extend across multiple hosts, they are called cluster
queues. Cluster queues enable managing a cluster of execution hosts by
means of a single cluster queue configuration and name.

Each host that is associated with a cluster queue receives an instance
of that cluster queue, which resides on that host. These instances are
known as queue instances. Within any cluster queue, each queue instance
can be configured separately. By configuring individual queue instances,
a heterogeneous cluster of execution hosts can be managed by means of a
single cluster queue configuration and name. 

When a cluster queue is modified, all of its queue instances are modified
simultaneously. Within a cluster queue, differences in the configuration
of queue instances can be specified by separately adding the associated
host and modifying its attributes. Consequently, a typical setup might
have only a few cluster queues, and the queue instances controlled by
those cluster queues remain largely ignored.

NOTE: The distinction between cluster queues and queue instances is
important. For example, jobs always run in queue instances, not in
cluster queues.

When configuring a cluster queue, any combination of the following host
objects can be associated with the cluster queue:


      * One execution host
      * A list of separate execution hosts
      * One or more host groups

A host group is a group of hosts that can be treated collectively as
identical. Host groups enable management of multiple hosts by means of a
single host group configuration. For more information about host groups,
see "Configuring Host Groups With QMON" in chapter 1 of the
Administration Guide.

When individual hosts are associated with a cluster queue, the name of
the resulting queue instance on each host combines the cluster queue name
with the host name. The cluster queue name and the host name are
separated by an @ sign. For example, if the host myexechost is associated
with the cluster queue myqueue, the resulting queue instance is called
myqueue@myexechost.
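
The naming rule can be sketched in a couple of lines of shell. The
function name queue_instance_name is made up for illustration; it only
builds the string and does not query SGE.

```shell
# Build a queue instance name: <cluster queue>@<host>.
queue_instance_name() {
    printf '%s@%s\n' "$1" "$2"
}

queue_instance_name myqueue myexechost   # prints myqueue@myexechost
```

On a live cluster, "qstat -f" lists the queue instances under names of
this form.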

When associating a host group with a cluster queue, a queue domain is
created. Queue domains enable management of groups of queue instances
that are part of the same cluster queue and whose assigned hosts are
part of the same host group. A queue domain name combines a cluster
queue name with a host group name, separated by an @ sign. For example,
if the host group @myhostgroup (host group names must start with an @)
is associated with the cluster queue myqueue, the resulting queue domain
is myqueue@@myhostgroup.
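
To tell the three kinds of names apart at a glance, a small sketch like
the following may help. The function name classify_queue_name is
hypothetical, and it looks only at the @ signs in the name, not at what
is actually configured on the cluster.

```shell
# Classify a queue name by its @ signs alone:
#   name@@group -> queue domain, name@host -> queue instance,
#   plain name  -> cluster queue.
classify_queue_name() {
    case "$1" in
        *@@*) echo "queue domain" ;;
        *@*)  echo "queue instance" ;;
        *)    echo "cluster queue" ;;
    esac
}

classify_queue_name myqueue@@myhostgroup   # prints queue domain
classify_queue_name myqueue@myexechost     # prints queue instance
classify_queue_name myqueue                # prints cluster queue
```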



Adding Queues

Using qmon, click the "Queue Control" button and then click the "Add"
button. First, enter the "Queue Name" (by convention, queue names always
end in .q, as in "fast.q"). Choose the name with care; it cannot be
changed later.

Next, enter a host or host group name in the "New Host/Hostgroup" box
and click the red left arrow. Enter as many hosts or host groups as
needed; their names will appear in the "Hostlist" box at the top left of
the window.

The "@/" listing in the "Attributes for Host/Hostgroup" list on the
lower left of the window denotes attributes that are the default for
each host or host group in this queue. Hosts or host groups from the
Hostlist box can be added to this listing, and their attributes specified
differently from the defaults, by entering their name in the "New
Host/Hostgroup" box and clicking the red up arrow.
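
If you would rather use the command line than qmon, here is a minimal
sketch for adding a host to the hostlist of an existing cluster queue.
It assumes SGE 6.x, where "qconf -aattr" is available, and that the host
is already known to qmaster as an execution host. The wrapper name
add_host_to_queue and the RUN switch are made up for illustration; by
default the function only prints the command so you can check it first.

```shell
# Print the qconf command that appends a host to a cluster queue's
# hostlist; with RUN=1 in the environment, execute it instead.
add_host_to_queue() {
    host=$1 queue=$2
    set -- qconf -aattr queue hostlist "$host" "$queue"
    if [ "${RUN:-0}" = 1 ]; then
        "$@"          # really run qconf (needs a working SGE setup)
    else
        echo "$*"     # dry run: just show the command
    fi
}

add_host_to_queue myexechost fast.q
# prints: qconf -aattr queue hostlist myexechost fast.q
```

Afterwards, "qconf -sq fast.q" should show the new host in the hostlist
line.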

Dan

On Mon, 2005-04-04 at 08:51, Chakravarthi_Mohan wrote:

> Dan,
> 
> How do I add a node to an existing queue?
> 
> For example, my queue name is all.q@master_node.
> 
> How do I add my node to this queue (all.q@master_node)?
> 
> Please provide detailed steps.
> 
> -Chakravarthi
> 
> ______________________________________________________________________
> 
> From: Dan Gruhn [mailto:Dan.Gruhn at Group-W-Inc.com] 
> Sent: Monday, April 04, 2005 5:48 PM
> To: users at gridengine.sunsource.net
> Subject: RE: [GE users] RE: Unable to contact qmaster
> 
> 
>  
> 
> Chakravarthi,
> 
> You don't have to define a SEPARATE queue for each node, but each node
> needs to be part of the queue specified when you submit a job if you
> want the job to possibly run on that node.
> 
> For example, if you have a queue named myqueue.q and you enter:
> 
>         qsub -q myqueue.q ... myjobscript
> 
> then myjobscript can run on any node whose host name shows up when you
> run the qconf command:
> 
>        qconf -sq myqueue.q
> 
> Dan
> 
> 
> 
> On Mon, 2005-04-04 at 07:48, Chakravarthi_Mohan wrote: 
> 
> 
> After sharing the file using NFS, the problem of executor installation was
> solved.
>  
> Now when I submit two or more jobs from the master or from the executor
> machine, the jobs run only on the master node and not on the executor
> machine.
>  
> Do I need to define a queue for each node?
>  
> Please clarify this.
>  
>  
> Thanks & Regards
> Chakravarthi Mohan
>  
>  
>  
>  
> -----Original Message-----
> From: raysonho at eseenet.com [mailto:raysonho at eseenet.com] 
> Sent: Friday, April 01, 2005 8:37 PM
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] RE: Unable to contact qmaster
>  
> >Here I have one basic question: how does the executor know the
> >whereabouts of qmaster?
>  
> The execds look at the file:
>  
> $SGE_ROOT/$SGE_CELL/common/act_qmaster
>  
> Is your $SGE_ROOT directory shared, or is it local??
>  
> Rayson
>  
>  
>  
>  
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
> ************************************************************************** 
> This email (including any attachments) is intended for the sole use of the
> intended recipient/s and may contain material that is CONFIDENTIAL AND
> PRIVATE COMPANY INFORMATION. Any review or reliance by others or copying or
> distribution or forwarding of any or all of the contents in this message is
> STRICTLY PROHIBITED. If you are not the intended recipient, please contact
> the sender by email and delete all copies; your cooperation in this regard
> is appreciated.
> **************************************************************************
>  
> 


