[GE users] SGE Array Task Question.

templedf dan.templeton at sun.com
Thu Jan 28 22:55:38 GMT 2010


It would technically be possible, but it would require some 
configuration changes.  You'd basically want to create a queue for these 
jobs that only has one slot per host.  You can then submit the array job 
against that queue, and as long as the number of tasks is <= the number 
of hosts, you will only get one per host.  You could probably even 
let the queue contain all hosts and then submit the array job with a 
resource request for the desired host list, e.g. 
-l 'h=host1|host5|host11|host17' (quoted so the shell does not treat 
the | as a pipe).  That way your job only runs on the selected hosts, 
and the queue guarantees that you get at most one task at a time per 
selected host.
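
A minimal sketch of the submission described above, assuming a queue 
named onejob.q (a hypothetical name) that has already been limited to one 
slot per host, and a script called ./S.sh (also a placeholder).  It only 
prints the qsub command instead of running it, so you can check it first:

```sh
#!/bin/sh
# Sketch only: onejob.q and ./S.sh are hypothetical placeholders.
# The queue is assumed to be limited to one slot per host already,
# e.g. with something like:  qconf -mattr queue slots 1 onejob.q

# Join host names with "|" to form the value of an "-l h=..." request.
join_hosts() {
    (IFS='|'; printf '%s\n' "$*")
}

# Print (rather than run) a qsub command with one array task per host.
# Usage: build_qsub QUEUE SCRIPT HOST...
build_qsub() {
    queue=$1
    script=$2
    shift 2
    # After the shift, $# is the number of hosts, i.e. the number of tasks.
    printf "qsub -t 1-%d -q %s -l 'h=%s' %s\n" \
        "$#" "$queue" "$(join_hosts "$@")" "$script"
}

build_qsub onejob.q ./S.sh host1 host5 host11 host17
```

For the four hosts in the example this prints 
qsub -t 1-4 -q onejob.q -l 'h=host1|host5|host11|host17' ./S.sh, 
which keeps the number of tasks equal to the number of requested hosts.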

The downside of this technique is that it is easily confounded.  If 
for some reason one of the hosts isn't available, then that host's task 
would end up on another host.

An alternative approach might be to submit the array job as a parallel 
job instead.  It would require writing a master task to distribute your 
slave tasks to the nodes via qrsh -inherit, but it would give you 
control over how many processes get run per host.
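
As a rough sketch of such a master task: the script below reads the host 
file that SGE provides to parallel jobs through $PE_HOSTFILE and starts 
one copy of the worker per distinct host with qrsh -inherit.  The 
parallel environment name "onehost", the worker name ./S.sh, and the 
LAUNCH override are assumptions made for illustration; the PE would also 
need to permit qrsh -inherit (control_slaves TRUE).

```sh
#!/bin/sh
# Sketch of a master task for a parallel job; assumes submission with
# something like "qsub -pe onehost 4 master.sh", where "onehost" is a
# hypothetical parallel environment.

# Print each distinct host name found in an SGE PE host file.
# Each line of that file starts with the host name, followed by the
# slot count and further fields.
list_hosts() {
    awk '{ print $1 }' "$1" | sort -u
}

# Start one copy of the worker on every host and wait for all of them.
# LAUNCH can be overridden (e.g. LAUNCH=echo) for a dry run.
run_one_per_host() {
    hostfile=$1
    worker=$2
    for host in $(list_hosts "$hostfile"); do
        ${LAUNCH:-qrsh -inherit} "$host" "$worker" &
    done
    wait
}

# Only act inside a parallel job, where SGE sets PE_HOSTFILE.
if [ -n "${PE_HOSTFILE:-}" ]; then
    run_one_per_host "$PE_HOSTFILE" "${1:-./S.sh}"
fi
```

Note that the plain wait here discards the workers' exit codes; a real 
master task would probably want to collect them before deciding on its 
own exit status.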

Daniel

On 01/28/10 14:39, wlee_hess wrote:
> Hi group,
>
> I've got a developer I'm working with who uses SGE and the DRMAA interface.  Her problem and questions are below.  We would like to know if what she would like to do is possible with SGE.  We're running both 6.1u4 and 6.2u5 concurrently at this time.  Looking at the way SGE works, I don't think we can do this, but if there is a way to do this, then that would be great.
>
> ==============================================
>
> PROBLEM: m-nodes need to run exactly 1 copy of script file S. I would like to know IF it is possible and if so HOW to submit an sge array job that meets the following requirements. Given a range of nodes (node list), a work directory, and a script file S, run exactly 1 copy (and no more) of the script file on each and every node in my node list. If I encounter errors, I will return the appropriate exit code (i.e. 100) to allow the array task to be requeued.
>
> I know how to individually submit to the SGE for each and every node in a range of nodes, but this takes too long. I also know how to run an array job, but I ONLY want ONE copy of my script file to run. Is it possible to do this?
>
> ==============================================
>
> Thanks in advance.
>
> Regards.
>
> Wayne Lee
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=241606
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=241607
