[GE users] exclusive queues - mutually subordinating

Shannon V. Davidson svdavidson at charter.net
Thu Nov 3 19:08:18 GMT 2005



James,

I like the oxymoron "mutual subordination", but it makes me think that 
this may not be the right approach to solve the problem. :-)

If you are trying to run either 2 serial jobs or 1 parallel job per 
host, then read on. Otherwise, I misunderstood your intentions.

If a single MPI task needs exclusive access to each dual CPU node, I 
suggest you request 2 PE slots per node but only start a single MPI task 
per node. This can be handled in the PE startup script (e.g. startmpi.sh 
-unique). That way, the host and queue slot mechanisms can still be 
used.  For instance, the following PE could be used:

pe_name            dedicated
slots              128
user_lists         NONE
xuser_lists        NONE
start_proc_args    <your_sge_root>/mpi/startmpi.sh -unique $pe_hostfile
stop_proc_args     <your_sge_root>/mpi/stopmpi.sh
allocation_rule    2
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
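
As a rough illustration of what the -unique option is meant to achieve 
(one entry per host rather than one per slot), here is a minimal Python 
sketch. It is not the startmpi.sh code itself. It assumes each 
pe_hostfile line starts with the host name followed by the slot count, 
and the function name and sample host names are made up for the example.

```python
def unique_hosts(pe_hostfile_text):
    """Return each host named in a pe_hostfile once, in first-seen order.

    Assumption: the host name is the first whitespace-separated field
    on every non-blank line.
    """
    hosts = []
    for line in pe_hostfile_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        host = fields[0]
        if host not in hosts:
            hosts.append(host)
    return hosts


# Hypothetical hostfile: two dual-CPU nodes, 2 slots granted on each.
sample = """node01 2 all.q@node01 UNDEFINED
node02 2 all.q@node02 UNDEFINED
"""

# One MPI task per host, even though two slots were granted on each.
print("\n".join(unique_hosts(sample)))
```

With allocation_rule 2 the job holds both slots on every host it gets, 
so starting one task per listed host leaves that task alone on the node.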

This should work with a single queue (all.q) for both parallel and 
serial jobs.  If you are using separate serial and parallel queues, 
you'll also need to set slots=2 as a consumable at the host level, so 
that the two queues together can never hand out more than 2 slots on 
one host.
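
To show why a host-wide limit of 2 slots keeps the two job types apart, 
here is a toy Python model of the slot accounting. It is only a sketch 
and not Grid Engine code. The names are hypothetical, and it assumes a 
serial job takes 1 slot and a parallel job takes 2 slots per host (the 
allocation_rule above).

```python
HOST_SLOTS = 2  # assumed host-level consumable (slots=2)

SERIAL = 1      # slots requested by one serial job
PARALLEL = 2    # slots per host requested through the PE (allocation_rule 2)


def fits(running, request):
    """Return True if a job asking for `request` slots still fits on a
    host whose running jobs already hold the slot counts in `running`."""
    return sum(running) + request <= HOST_SLOTS


print(fits([], PARALLEL))        # empty host, parallel job
print(fits([SERIAL], SERIAL))    # second serial job next to the first
print(fits([SERIAL], PARALLEL))  # parallel job next to a serial job
print(fits([PARALLEL], SERIAL))  # serial job next to a parallel job
```

In this model two serial jobs can share a host, but a parallel job only 
fits on an empty host and blocks everything else while it runs, with no 
subordination involved.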

I have used this for scheduling hybrid MPI+OpenMP jobs. 

Shannon

James Coomer wrote:

>The parallel jobs go via a pe (which uses SCore MPI) which is associated
>with parallel.q
>
>James
>
>  
>
>>James,
>>
>>Are you using the parallel queues as part of a Grid Engine parallel
>>environment (PE)?  Or are you just running jobs which need 2 CPUs in the
>>parallel queues?
>>
>>Shannon
>>
>>James Coomer wrote:
>>
>>    
>>
>>>Hi,
>>>
>>>I'm using SGE6u3 on a SuSE 9.0 cluster.
>>>
>>>We are using queues in a slightly unusual way, where we have a parallel.q
>>>and a serial.q, both spanning all hosts. Each queue has the other in its
>>>subordinate_list so that parallel and serial jobs don't mix on a host.
>>>
>>>I don't think I can force the exclusivity by using a per-host slot
>>>consumable because (for good reasons) we have 1 slot per host for the
>>>parallel queues and 2 for the serial queues (they are 2-CPU machines).
>>>
>>>My problem is that under certain circumstances both serial and parallel
>>>jobs can get scheduled in the same scheduling cycle, occupy both queues
>>>simultaneously, and suspend each other.
>>>
>>>I've found similar queries in the archives, but nothing quite the same.
>>>Any ideas?
>>>
>>>Many Thanks,
>>>James
>>>
>>>
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>
>>>
>>>
>>>      
>>>
>>--
>>____________________________________________
>>
>>Shannon V. Davidson <svdavidson at charter.net>
>>Senior Software Engineer            Raytheon
>>636-479-7465 office         443-383-0331 fax
>>____________________________________________
>>
>>
>>
>>
>>
>>    
>>
>
>
>  
>


-- 
____________________________________________

Shannon V. Davidson <svdavidson at charter.net>
Senior Software Engineer            Raytheon
636-479-7465 office         443-383-0331 fax
____________________________________________




