[GE users] worth a wiki entry for SGE with OpenMPI and Infiniband

Joe Landman landman at scalableinformatics.com
Sun Jul 20 17:43:29 BST 2008



Hi folks

   On a related note, this same cluster uses Infiniband. One issue 
with OpenMPI under SGE is that the default maximum locked memory (on 
Linux) is set far too low for Infiniband, so OpenMPI can't lock enough 
memory.  You can "fix" this in /etc/security/limits.conf by adding 
these two lines to the file:

	*		soft	memlock	unlimited
	*		hard	memlock	unlimited
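
As a quick check, the soft and hard limits of the current shell can be 
printed as below. This is a minimal sketch, assuming bash and a fresh 
login (limits.conf is applied by pam_limits when a session starts); the 
function name show_memlock is made up for illustration.

```shell
# Print the locked-memory limits of the current shell.
# Each value is either "unlimited" or a size in kilobytes.
show_memlock() {
    echo "soft: $(ulimit -S -l)"
    echo "hard: $(ulimit -H -l)"
}

show_memlock
```

If both lines report "unlimited", the limits.conf change has reached 
login sessions.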

However, it appears that this only works when running OpenMPI apps over 
Infiniband by hand, not through SGE.  I found that I needed to insert a

	ulimit -l unlimited

in the SGE execd startup script, right near the top; otherwise

	qrsh ulimit -l

would always return 32 (kilobytes), and the Infiniband-based job 
wouldn't run.
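
A minimal sketch of the change and a way to check it follows. The script 
location is an assumption (on many installs it is 
$SGE_ROOT/$SGE_CELL/common/sgeexecd), execd presumably has to be 
restarted for the new limit to take effect, and the helper names and the 
1 GB threshold are hypothetical. A likely reason limits.conf alone is not 
enough: it is applied by PAM at login, while SGE jobs inherit their 
limits from the execd daemon, which is usually started at boot outside 
any login session.

```shell
# In the execd startup script, near the top (location is an assumption):
#
#     ulimit -l unlimited
#
# The helpers below only check the result; they change nothing.

# Classify a value printed by `ulimit -l`.
# $1: reported limit ("unlimited" or a number of kilobytes)
# $2: smallest acceptable limit in kilobytes
memlock_ok() {
    case "$1" in
        unlimited) return 0 ;;
        ''|*[!0-9]*) return 1 ;;    # empty or not a number: treat as failure
        *) [ "$1" -ge "$2" ] ;;
    esac
}

# Print one status line for a labelled limit value.
# $1: label, $2: reported limit, $3: minimum in kilobytes
report_memlock() {
    if memlock_ok "$2" "$3"; then
        echo "$1: ok ($2)"
    else
        echo "$1: too low ($2)"
    fi
}

# Compare what this shell sees with what an SGE job sees.
# qrsh is only invoked if it is on PATH (i.e. on a submit host).
check_memlock() {
    min_kb=${1:-1048576}    # 1 GB, an arbitrary illustrative threshold
    report_memlock "login shell" "$(ulimit -l)" "$min_kb"
    if command -v qrsh >/dev/null 2>&1; then
        report_memlock "via qrsh" "$(qrsh ulimit -l)" "$min_kb"
    fi
}
```

Running check_memlock on a submit host before the fix should show the 
qrsh line as too low; after the fix and an execd restart, both lines 
should report ok.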

I would like to suggest including a line like this in your execd startup 
script.

For the SGE developers: if you could include an environment 
startup/scripting/tweaking section right before you fire off the main 
sgeexecd process, it could help with other (future) issues like this one. 
It might be worth creating an $SGE/execd_environment directory to contain 
the scripts/settings we need.
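
To make the suggestion concrete, here is a sketch of what such a hook 
could look like. This is not an existing SGE feature; the directory 
layout, the *.sh naming convention, and the function name are all 
hypothetical.

```shell
# Source every readable *.sh file in a directory, in sorted glob order,
# so that settings such as ulimit calls apply to the calling shell.
# $1: directory holding environment snippets (hypothetical layout)
source_execd_environment() {
    env_dir=$1
    [ -d "$env_dir" ] || return 0        # no directory: nothing to do
    for snippet in "$env_dir"/*.sh; do
        [ -r "$snippet" ] || continue    # also skips an unmatched glob
        . "$snippet"
    done
}

# Hypothetical call site in the startup script, just before the daemon
# is launched:
#
#     source_execd_environment "$SGE_ROOT/execd_environment"
```

With something like this in place, the memlock fix would become a 
one-line file (for example 10-memlock.sh containing the ulimit call) 
rather than an edit to the startup script itself.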

Just a thought.

Joe

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
