[GE users] Intel MPI 3.1 tight integration

Reuti reuti at staff.uni-marburg.de
Tue Nov 4 16:19:55 GMT 2008


Hi Daniel,

On 03.11.2008 at 15:42, Daniel Templeton wrote:

> The mpd daemons do daemonize.  That means that the qrsh -inherit  
> returns before the actual work gets done, but shouldn't SGE pick up  
> the usage anyway using the GID?

You are right, the GID is indeed still attached to the mpd, and when
ENABLE_ADDGRP_KILL is set, the daemon is also gone after the job ends.

But there are some limitations:

a) besides ru_wallclock, all ru_* entries in the accounting records  
are missing, as the process was detached from the shepherd.

b) With two jobs of the same user on a node, the second job will
kill the first job's mpd.py instances but leave the binaries running,
if you start one mpd.py per job by mpdboot.

There is only one entry in /tmp for each user. In LAM/MPI, for example,
they added special strings containing "sge" and the $JOB_ID to get
dedicated directories per job when they detect that they are
running under SGE and need a daemon per job.

c) As a result of b), the first started job can only have one
mpirun/mpiexec, as the mpd.py for this one is gone afterwards. Further
tasks would be created as children of the second job's mpd.py, giving
wrong accounting. Furthermore, when the second job ends, it will also
remove the task of the second step of the first job.
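
To illustrate the per-job naming mentioned under b), here is a minimal
sketch. It is not what LAM/MPI or mpd actually do; the function name
job_scratch_dir and the "mpd-<user>-sge-<jobid>" pattern are assumptions
made up for this example. It only shows how a name containing "sge" and
$JOB_ID keeps two jobs of the same user apart on one node.

```shell
# Hypothetical helper (not part of mpd, LAM/MPI or SGE): build a
# directory name that is unique per job instead of only per user.
job_scratch_dir() {
    base=${1:-/tmp}
    user=${USER:-$(id -un)}
    if [ -n "${JOB_ID:-}" ]; then
        # Running under SGE: tag the name with "sge" and the job id.
        printf '%s/mpd-%s-sge-%s\n' "$base" "$user" "$JOB_ID"
    else
        # Outside SGE: fall back to one entry per user.
        printf '%s/mpd-%s\n' "$base" "$user"
    fi
}
```

With JOB_ID=4711 and USER=alice this prints /tmp/mpd-alice-sge-4711, so
a second job of the same user would get a different directory.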

What would be necessary is a dedicated port per mpd.py, so that one
connects to the right mpd.py of this job, and an mpd.py that does not
fork into daemon land.
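
As a rough sketch of the "dedicated port per job" idea: one could derive
a port number from the job id. Everything here is hypothetical: the
function job_port, the base port 20000 and the range of 10000 ports are
assumptions for illustration, neither mpd nor SGE provides such a
mapping, and two job ids that differ by a multiple of the range would
collide.

```shell
# Hypothetical mapping of an SGE job id onto a port in
# [base, base + span). Defaults are arbitrary example values.
job_port() {
    job_id=$1
    base=${2:-20000}
    span=${3:-10000}
    echo $(( base + job_id % span ))
}
```

For example, job_port 4711 prints 24711. Whether and how an mpd.py can
be told to listen on such a port depends on the MPI version in use.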

For me, these are still too many limitations to include this startup
method in the Howto. I can try to ask the MPICH(2) team and Intel
whether they could supply a solution.

-- Reuti


>   I readily admit that SGE PEs are not my strong suit.
>
> There is a switch to make the mpd daemons not daemonize, but then  
> you have to do some dancing around how to let mpdboot run multiple  
> qrsh -inherit calls in the background and still be able to read the  
> first line of input from them (the port number) without having  
> input buffering get in the way.
>
> Daniel
>
> Reuti wrote:
>> On 03.11.2008 at 14:54, Daniel Templeton wrote:
>>
>>> Actually, I've done a tight integration, and it's pretty easy.
>>> The mpdboot command takes a -r parameter that gives the name of
>>> the "rsh" to execute.  Just create a script that strips out the
>>> -x and -n from the arguments and runs qrsh -inherit instead of
>>> rsh, and pass that script to mpdboot with -r.  (You may also want
>>> to shortcut out the Python version check...)  You'll also need a
>>> PE starter that creates an appropriate machines file.
>>
>> In contrast to MPICH(2), the mpd daemons are not forking into
>> daemonland any longer? Besides this, I found the creation of more
>> and more process groups by the Python script in MPICH(2) to be the
>> handicap.
>>
>> Is it also working with two jobs of the same user on a node?
>>
>> No shutdown necessary?
>>
>> -- Reuti
>>
>>
>>> My scripts below should work with Intel MPI 3.1 or 3.2.
>>>
>>> Daniel
>>>
>>> % cat startpe.sh
>>> #!/bin/sh
>>>
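>>> # Build a machines file for mpdboot: one line per granted slot.
>>> hfile="$TMP/mpd.hosts"
>>> touch "$hfile"
>>>
>>> # Each $PE_HOSTFILE line starts with: <host> <slots> ...
>>> cat "$PE_HOSTFILE" | while read line; do
>>>  # Short hostname (domain stripped) and slot count
>>>  host=`echo $line | cut -d' ' -f1 | cut -d'.' -f1`
>>>  cores=`echo $line | cut -d' ' -f2`
>>>
>>>  while [ $cores -gt 0 ]; do
>>>    echo $host >> "$hfile"
>>>    cores=`expr $cores - 1`
>>>  done
>>> done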
>>>
>>> exit 0
>>> % cat qrsh-inherit.pl
>>> #!/usr/bin/perl
>>>
>>> # Shortcircuit python version check
>>> if (grep /^\s*-x\s*$/, @ARGV) {
>>>  print "2.4\n";
>>>  exit 0;
>>> }
>>>
>>> # Strip out -n and -x
>>> @ARGV = grep !/^\s*-[nx]\s*$/, @ARGV;
>>>
>>> exec "qrsh", "-inherit", @ARGV;
>>>
>>>
>>> Daniel De Marco wrote:
>>>> Hi,
>>>>
>>>> I'm trying to integrate Intel MPI with gridengine. From what I
>>>> found on the list archives it seems tight integration is
>>>> impossible. What about loose integration, did anyone try it?
>>>> Any comments/pointers?
>>>>
>>>> Thanks, Daniel.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
>>>> For additional commands, e-mail: users-help at gridengine.sunsource.net
>>>>
>>>>



