[GE users] SGE resources and job queues.

Reuti reuti at staff.uni-marburg.de
Wed May 11 22:41:07 BST 2005



Hi again Jon,

requesting one specific rack is not a good idea IMO, as this will bind the jobs 
just to this rack and is very inflexible. One idea in such a situation is to 
define one PE for each rack, and one queue with this PE attached for each rack. 
Then you can request with a wildcard "-pe rack* 32" (or a name reflecting the 
purpose mpi/pvm/...). Once a PE is selected, the queue/nodes all come 
from one rack. There is a small but annoying bug; the complete setup and 
workaround can be found in issue 1597. Despite this, it seems to work as it 
should.
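A minimal sketch of such a setup (the PE names rack1/rack2, slot counts, and job script are placeholders; depending on the SGE version, the PE is attached via the PE's queue_list or the queue's pe_list):

```shell
# Sketch of a per-rack PE setup; names and numbers are placeholders.
# Create one parallel environment per rack (opens $EDITOR):
qconf -ap rack1
#   pe_name           rack1
#   slots             64            # 32 nodes x 2 slots per node
#   allocation_rule   $fill_up
qconf -ap rack2
#   ... same again for the second rack

# Attach each PE to a queue whose host list contains only that rack's
# nodes (via the PE's queue_list or the queue's pe_list, depending on
# the SGE version).

# A user then requests any rack with a wildcard; once the scheduler has
# picked one PE, all 32 slots come from a single rack:
qsub -pe "rack*" 32 myjob.sh
```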

Chris already mentioned the possibility of resource reservations and 
checkpointing.
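For reference, a reservation request might look like the following sketch (the job script name is a placeholder, and reservation scheduling must first be enabled in the scheduler configuration):

```shell
# Sketch: enable reservation scheduling once (max_reservation defaults
# to 0, i.e. off), then submit the large job with -R y so it reserves
# slots as they free up instead of being starved by small jobs:
qconf -msconf          # set e.g. max_reservation 32 in the editor
qsub -R y -pe "rack*" 32 big_job.sh
```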

Cheers - Reuti


Quoting Jon Savian <worknit at gmail.com>:

> Hi Reuti,
> 
> Thanks for your prompt response.  Users usually run scientific
> programs and request whatever resources they need for the job.  So
> yes, they specify runtime, memory, and number of slots needed.
> 
> Users have expressed interest in running larger jobs that require 32
> nodes, each with 2 slots and 2GB of memory.  However, they would
> like jobs to run on nodes contained in the same rack, instead of
> using nodes across multiple racks.  We have multiple racks of 32
> nodes.  Hard requests will be needed, I believe.
> 
> So the first step I took was to specify a resource for one of the
> 32-node racks.  So when a user does a "qsub -l resource_name....." it
> will run on the 32 nodes specified by it.  However, other users
> might have already submitted jobs that are queued to run on some of
> the nodes we will need for our larger 32-node single-rack job.  So
> ideally, I think we would want to find a way to make the
> single rack available so that the larger 32-node single-rack job can
> run ASAP, which means moving the other jobs that users submitted to
> other nodes.  This may happen on a regular basis, so any kind of
> permanent setting for this would be great.
> 
> I should also mention that I am making all modifications via qmon.
> 
> Thanks.
> 
> Jon
> 
> On 5/11/05, Reuti <reuti at staff.uni-marburg.de> wrote:
> > Hi Jon,
> > 
> > can you give more details: what exactly do you mean with small and
> > large jobs?  The runtime, the memory request, the number of slots?
> > 
> > And: is resource2 a hard request for the small jobs?
> > 
> > Anyway: two possibilities to look at are soft requests (for resource1
> > for the small jobs), or putting a sequence number on the nodes, so
> > that resource1 nodes are filled first.
> > 
> > Cheers - Reuti
> > 
> > 
> > Quoting Jon Savian <worknit at gmail.com>:
> > 
> > > Hi Everyone,
> > >
> > > I am trying to allocate resources on a cluster, so I followed the
> > > steps here:
> > > http://gridengine.sunsource.net/project/gridengine/howto/resource.html.
> > >  Let's say I created two resources; we'll call them resource1 and
> > > resource2.  I want to be able to run a large job using resource2, but if
> > > there are a lot of smaller jobs queued to run on resource2, then the
> > > larger job will have to wait until the smaller ones execute.  Is there
> > > any way to move smaller jobs from the nodes on resource2 and put them
> > > on resource1 (or any other non-resource2 nodes for that matter) so
> > > that the larger job may run on resource2 ASAP?  Or even better, are
> > > there any priorities that can be set for the larger job that will put
> > > it before the smaller ones?
> > >
> > > Thanks.
> > >
> > > Jon
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > > For additional commands, e-mail: users-help at gridengine.sunsource.net
> > >
> > >
> > 
> > 
> >
> 
> 
> 






