[GE users] Question regarding Parallel Environment

Kogan, Felix Felix-Kogan at deshaw.com
Fri Aug 24 16:47:50 BST 2007

Hi, all,

Is it possible to change the node allocation of a parallel job on the
fly? For example, a user initially requests 10 slots at parallel job
submission but later wants to add 3 more slots to the pool.

A more general question:

We have sets of production jobs that need to run on multiple nodes and
are parallel in a sense: they must start and stop together, and if one
of the jobs can't be submitted, the rest cannot run. I'm thinking about
using a sort of semi-tight PE integration (something like the MPI or
PVM integration methods provided with the SGE installation): the
parallel job submission allocates a set of slots, and then a custom
program spawns the necessary subprocesses on the allocated slots using
"qrsh -inherit".
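
To make the idea concrete, here is a minimal Python sketch of what such a
custom launcher might look like. It assumes that SGE writes the granted
hosts to the file named by $PE_HOSTFILE, one host per line with the
hostname and slot count as the first two columns, and that the PE is
configured with control_slaves TRUE so that "qrsh -inherit" is permitted.
The function names and the "./worker" command are hypothetical, chosen
only for illustration. The sketch only builds the command lines and does
not run them.

```python
def read_pe_hostfile(text):
    """Parse PE hostfile text into a list of (hostname, slots) pairs.

    Assumes the first two whitespace-separated columns of each line
    are the hostname and the number of slots granted on that host.
    """
    allocation = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        allocation.append((fields[0], int(fields[1])))
    return allocation


def build_qrsh_commands(allocation, task_argv):
    """Build one 'qrsh -inherit <host> <task...>' command per granted slot."""
    commands = []
    for host, slots in allocation:
        for _ in range(slots):
            commands.append(["qrsh", "-inherit", host] + list(task_argv))
    return commands
```

Inside the job script, the launcher would read the file named by
os.environ["PE_HOSTFILE"], pass its contents to read_pe_hostfile(), and
hand each resulting command to subprocess.Popen, waiting on all of them so
that the set starts and stops together.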

Is there any other, more suitable way that you could suggest to reach
the same goal? The obvious approach is to create dedicated queues for
each such job set, but this uses resources inefficiently, as many nodes
remain unused. A better way would be to have a contiguous pool of
production nodes that all production sets can draw from, but we need to
ensure consistent allocation and no interference between the sets
(i.e., if one set has allocated slots on a node, another set can't use
the same node). Of course, the pool would have to be sized properly,
but that is a solvable problem.
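
The non-interference requirement can be stated as a simple check: no node
may appear in the allocation of more than one production set. The Python
sketch below is only an illustration of that constraint, not something SGE
provides. The function name and the shape of the input (a dict mapping a
set name to the hostnames it was granted) are assumptions.

```python
def find_shared_nodes(allocations):
    """Return nodes that are allocated to more than one production set.

    allocations: dict mapping a set name to an iterable of hostnames.
    Result: dict mapping each shared hostname to the sorted list of
    set names that hold slots on it. An empty dict means no interference.
    """
    owners = {}
    for set_name, hosts in allocations.items():
        for host in set(hosts):  # count each host once per set
            owners.setdefault(host, []).append(set_name)
    return {host: sorted(names) for host, names in owners.items()
            if len(names) > 1}
```

A wrapper around job submission, or a periodic audit, could call this on
the current allocations and refuse to start (or flag) any set whose nodes
overlap with another set.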


