[GE users] suspension under MPICH2 tight integration

Jason Crane jasonc at mrsc.ucsf.edu
Thu May 18 16:33:32 BST 2006


On Thu, 2006-05-18 at 15:08 +0200, Reuti wrote:

> > I don't know the specific MPICH2 job suspension requirements just yet.
> > However, the trouble is that I would like to be able to suspend an MPI
> > job on a subordinate queue if a batch job on a higher priority queue is
> > submitted, but the batch job may be running on an arbitrary node, not
> > necessarily the master node for the subordinate PE job.  In this case if
> 
> Instead of suspending the complete job, maybe it's easier to submit
> the MPI jobs to a queue which has a queue priority (i.e. nice value)
> of 19, and other jobs to a queue with a priority of 0. So the MPI job
> will get less computing time, and you don't have to worry about any
> suspension mechanism at all.

Hi Reuti,

That's essentially what we're doing right now, for just this reason
(supporting MPI jobs). The trouble is that we already require several
layers of queue priorities, so the difference between nice values isn't
always ideal. For example, in the worst case we have queues on desktop
machines, which are niced relative to interactive use; then we have a
higher priority queue; and ideally we would like to be able to share the
queue with other, lower priority user groups when appropriate:

desktop_interactive_use:high_priority.q:std_priority.q:low_priority_user_group.q

In the best case, on our cluster machines, we would eliminate the extra
layer required for desktop users. Suspension of subordinates seems like
a much cleaner solution if possible, and I was optimistic that this
would be possible under MPICH2.

thanks, -Jason


