[GE users] Using more than one qrsh -inherit parallel on the same host

Bogdan Costescu bogdan.costescu at iwr.uni-heidelberg.de
Fri Aug 20 15:07:38 BST 2004


On Fri, 20 Aug 2004, Thomas Neumann wrote:

> I'm just trying to use the grid engine to run jobs with more than one 
> parallel process on each node.

OK.

> qrsh -inherit host1 sleep 60 &
> qrsh -inherit host1 hostname

The number of simultaneous 'qrsh -inherit' calls to the same host is 
limited to the number of slots allocated for the job on that node. So 
if SGE has allocated 2 slots on host1, both commands above should 
succeed. If SGE has allocated only 1 slot on host1, the first command 
should succeed and the second should fail.
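
As a rough sketch of how a job script could check this limit before 
launching anything: inside a parallel job, SGE names a file in 
$PE_HOSTFILE that lists one host per line, with the host name in the 
first column and the allocated slot count in the second. The helper 
name slots_for_host is made up for illustration, and the host name you 
pass has to match the spelling used in the file (short name vs. FQDN).

```shell
#!/bin/sh
# Hypothetical helper: print the number of slots SGE allocated on a host,
# read from a PE_HOSTFILE-style file (column 1 = host, column 2 = slots).
# Prints 0 if the host does not appear in the file.
slots_for_host() {
    host=$1
    hostfile=${2:-$PE_HOSTFILE}
    awk -v h="$host" '$1 == h { n += $2 } END { print n + 0 }' "$hostfile"
}
```

'slots_for_host host1' would then give the upper bound on simultaneous 
'qrsh -inherit host1 ...' calls for the current job.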

> so that I can't use a shell script containing all commands and send
> it as a single task to the destination host.

This doesn't sound to me like the situation above. I understand your 
description as:

cat << EOF > ~/commands
command1... [&]
command2... [&]
...
EOF
chmod 755 ~/commands
qrsh -inherit host1 ~/commands [&]

(The commands and the 'qrsh -inherit' can optionally be backgrounded.)
This is only one 'qrsh -inherit' call, which should always succeed, as 
SGE allocates at least 1 slot on a host.
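
One detail worth spelling out if the commands inside the script are 
backgrounded: the script should wait for them, otherwise the script 
itself exits right away while they are still running. A minimal 
sketch, with 'sleep' and 'echo' standing in for the real commands and 
run_commands being a made-up name:

```shell
#!/bin/sh
# Stand-in for the body of ~/commands: start two placeholder commands
# in parallel and do not return until both have finished.
run_commands() {
    (sleep 1; echo "command1 done") &
    (sleep 1; echo "command2 done") &
    wait    # block until all background children have exited
}
```

Calling run_commands as the last step of the script keeps the single 
'qrsh -inherit' task alive until every command is done.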

> Reading about the error I found that I have to increase the
> gid-range,

The range should be increased only when you plan to run that many
_simultaneous jobs_ on one node, as SGE assigns one additional group
id to each job.
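
To see how many additional group ids are available on a node, the 
configured range can be looked up with 'qconf -sconf' and its size 
counted. A small sketch, assuming a single range of the form 
first-last; the helper name gid_range_size is made up:

```shell
#!/bin/sh
# Hypothetical helper: count the group ids in a range such as "20000-20100".
# Assumes one "first-last" range, not a comma-separated list.
gid_range_size() {
    first=${1%-*}
    last=${1#*-}
    echo $(( $last - $first + 1 ))
}

# On a cluster, the configured value could be fed in like this
# (illustrative only, needs a working SGE installation):
#   gid_range_size "$(qconf -sconf | awk '$1 == "gid_range" { print $2 }')"
```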

> Is there a way to start processes like this without installing a special 
> client on the destination host ?

It's a very similar problem to the LAM-MPI tight integration. There
the LAM daemon takes care of starting the processes that are part of 
the job; however, starting the LAM daemon itself requires 2 'qrsh 
-inherit' calls, which fail when SGE has allocated only one slot on 
the node(s).

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De

