[GE users] worth a wiki entry for SGE with OpenMPI and Infiniband

John Leidel john.leidel at gmail.com
Sun Jul 20 23:39:50 BST 2008



I second Joe's motion.  I've done this for quite some time manually by
creating a set of startup/pre/post wrapper scripts such that...

for a in "$SGE_ROOT"/scripts/pre/*; do
    # run each script in turn; exec would replace this shell after the first one
    [ -x "$a" ] && "$a"
done

....blah blah blah
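
For the wiki entry, a slightly fuller sketch of that wrapper idea might look like the following. It is only an illustration, not what any SGE release ships: the function names run_hooks and raise_memlock are hypothetical, and it assumes the hook directory holds plain executable scripts that are safe to run in glob order.

```shell
#!/bin/sh
# Hypothetical helper: run every executable file in a hook directory,
# in glob (sorted) order, without replacing the current shell.
run_hooks() {
    hook_dir=$1
    for hook in "$hook_dir"/*; do
        # Skip anything that is not an executable regular file. This also
        # covers an empty directory, where the glob stays unexpanded.
        [ -f "$hook" ] && [ -x "$hook" ] || continue
        "$hook" || echo "warning: $hook exited with status $?" >&2
    done
}

# Hypothetical helper: try to raise the locked-memory limit, then print
# whatever limit is in effect afterwards.
raise_memlock() {
    ulimit -l unlimited 2>/dev/null ||
        echo "warning: could not raise memlock limit" >&2
    ulimit -l
}
```

Near the top of an execd startup script, one could call raise_memlock and then run_hooks "$SGE_ROOT/scripts/pre" before the daemon is launched. Whether the limit can actually be raised depends on the privileges the script runs with.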


cheers
john

On Sun, Jul 20, 2008 at 9:43 AM, Joe Landman
<landman at scalableinformatics.com> wrote:
> Hi folks
>
>  On a related note, for this same cluster, we were using Infiniband. One of
> the issues with OpenMPI and SGE is that the maximum locked memory (on Linux)
> is set far too low for Infiniband, so jobs can't lock enough memory.  You can
> "fix" this with settings in /etc/security/limits.conf; simply add these two
> lines to the file:
>
>        *               soft    memlock unlimited
>        *               hard    memlock unlimited
>
> However, it appears that this works when running OpenMPI apps over Infiniband
> by hand, but not through SGE.  I found that I needed to insert an
>
>        ulimit -l unlimited
>
> in the SGE execd run script, right near the top; otherwise
>
>        qrsh ulimit -l
>
> would always return 32 (kilobytes), and the Infiniband based job wouldn't
> run.
>
> I would like to suggest including a line like this in your execd startup
> script.
>
> For the SGE developers, if you could include an environment
> startup/scripting/tweaking section right before you fire off the main
> sgeexecd process, this could help with other (future) issues like this.
>  Might be worth creating an $SGE/execd_environment directory to contain the
> scripts/settings we need.
>
> Just a thought.
>
> Joe
>
> --
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics LLC,
> email: landman at scalableinformatics.com
> web  : http://www.scalableinformatics.com
>       http://jackrabbit.scalableinformatics.com
> phone: +1 734 786 8423
> fax  : +1 866 888 3112
> cell : +1 734 612 4615
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> For additional commands, e-mail: users-help at gridengine.sunsource.net
>
>




