[GE users] how to use Python multiprocessing script with SGE

cguilbert69 cguilbert69 at gmail.com
Sat May 15 02:17:00 BST 2010

Hi Reuti,
Thanks for the answer.

On Fri, May 14, 2010 at 5:13 PM, reuti <reuti at staff.uni-marburg.de> wrote:

On 14.05.2010 at 20:59, cguilbert69 wrote:

> Thanks Sean for the answer,
> I don't really want to use mpi for python (yet) but python
> multiprocessing.
> The real question for me is how you pass the CPU you reserved with
> SGE queuing to python multiprocessing
> For instance if I run the python script:
> import multiprocessing as mp
> print mp.cpu_count()
> I get 4 because it's a quad core.
> Now if I run the same script inside a qsub script with the option:
> #$ -pe mpi 10

Usually MPI is set up to run between nodes. A PE smp with allocation
rule $pe_slots might be better suited for Python's threads.

How do you do that? Do you have an example?

> I still get 4, not the 10 I requested in my qsub script.

Having 10 slots on 4 cores won't improve speed. But anyway: is
mp.cpu_count() used to start processes? If it's just a limit which you
have to access when starting the threads in a loop, you can access the
environment variable $NSLOTS inside Python instead. It's set up by SGE
with the number of granted slots.
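
A minimal sketch of that suggestion, assuming the job is submitted to a single-host PE with something like "qsub -pe smp 4 job.py" (the PE name "smp" is only an example and has to exist on your cluster). The names granted_slots and work are made up for illustration. The pool is sized from $NSLOTS instead of mp.cpu_count(), and all worker processes still run on the one host where the job starts:

```python
import multiprocessing as mp
import os


def granted_slots(default=None):
    """Return the slot count granted by SGE, or a fallback outside SGE."""
    value = os.environ.get("NSLOTS")
    if value is None:
        # NSLOTS is not set, so we are probably not running under SGE:
        # fall back to the caller's default or the local core count.
        return default if default is not None else mp.cpu_count()
    return int(value)


def work(x):
    # Placeholder task standing in for one independent job.
    return x * x


if __name__ == "__main__":
    # Size the pool from the granted slots, not from the hardware.
    pool = mp.Pool(processes=granted_slots())
    try:
        results = pool.map(work, range(20))
    finally:
        pool.close()
        pool.join()
    print(results)
```

Run outside SGE, the same script falls back to the local core count, so it can be tested on a workstation first.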

Sorry, I was not clear. I am not a hard-core programmer myself and probably use the wrong terms. When I run a Python multiprocessing script on its own, Python auto-detects how many cores my CPU has. In this case I have a quad-core, so Python can distribute my jobs over the 4 cores using multiprocessing.

Maybe I am using the wrong syntax to reserve 10 cores with SGE, but when I write "#$ -pe mpi 10" I mean that I want to reserve 10 cores (as with mpirun -n 10), so SGE would, for example, reserve 3 client machines (2 quad-core and 1 dual-core = 10 cores).
My Python multiprocessing script would then distribute all the jobs over the 10 cores defined by the SGE qsub script.
My program is not doing strictly parallel computation; all I want is to send a series of jobs to different cores in parallel from my main Python script.

When I run my Python script directly (without SGE queuing), Python detects and uses the 4 cores of my quad-core (1 CPU). I want to do the same with 20, 100, ... cores: I want Python multiprocessing to see one gigantic 100-core CPU, with those cores defined by the SGE queuing system. There must be a mechanism to trick Python into believing that it is using 100 cores on 1 CPU.

Here is the idea of what I want to do, described in terms of ssh on a cluster of 100 nodes (1 core per node):
I run a Python script that sends 100 jobs in parallel to the 100 nodes using ssh. When all the calculations are done, the main Python script continues and may send further jobs to the nodes.
So instead of using ssh to submit my jobs, I want to use Pool() or Process() from the Python multiprocessing library. Somehow Python needs to know how to use the CPUs (or cores) reserved by the SGE qsub, right? How?

If anyone can give me a script that does this, it would be really nice.
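
One point worth keeping in mind: multiprocessing starts its worker processes on the local host only, so a Pool cannot by itself reach cores on other machines. When slots are spread over several hosts, SGE lists the granted hosts in the file named by $PE_HOSTFILE. What follows is a rough sketch rather than a tested recipe. It assumes the usual layout of that file (hostname and slot count as the first two whitespace-separated fields) and a tightly integrated PE in which "qrsh -inherit <host> <command>" is permitted. The function names are hypothetical, chosen for illustration:

```python
import os


def read_pe_hostfile(path=None):
    """Return [(hostname, slots), ...] parsed from an SGE PE host file."""
    if path is None:
        # Set by SGE inside a parallel job; raises KeyError elsewhere.
        path = os.environ["PE_HOSTFILE"]
    hosts = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            # Assumed layout: hostname, slot count, then other fields.
            hosts.append((fields[0], int(fields[1])))
    return hosts


def expand_slots(hosts):
    """Repeat each hostname once per granted slot."""
    return [name for name, slots in hosts for _ in range(slots)]


def remote_argv(host, command):
    """Build (but do not run) the argv that starts command on host."""
    return ["qrsh", "-inherit", host] + list(command)


if __name__ == "__main__":
    if "PE_HOSTFILE" in os.environ:
        for host in expand_slots(read_pe_hostfile()):
            # task.py is a placeholder for the real per-slot job.
            print(" ".join(remote_argv(host, ["python", "task.py"])))
    else:
        print("PE_HOSTFILE is not set; not running inside an SGE parallel job")
```

Each argv could then be handed to subprocess from the main script, one per slot, which plays the role ssh has in the description above.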



