[GE users] Setting memlock limit with SGE 6.2; was: Re: [GE users] worth a wiki entry for SGE with OpenMPI and Infiniband
andy.schwierskott at sun.com
Mon Jul 21 11:17:02 BST 2008
We don't have this, just as there is no means to set a default $PATH other
than the compiled-in path. The umask is hardcoded to 022.
On Mon, 21 Jul 2008, Erik Soyez wrote:
> Thanks Andy, cool! What about the umask - any similar hack available?
> :-)Erik Soyez.
> On Mon, 21 Jul 2008, Andy Schwierskott wrote:
>> Just a side note regarding the 'memlock' resource limit issues which are
>> reported here sometimes: for SGE 6.2 (it's not part of the Beta and Beta
>> refresh, however) we added, as a last-minute feature, the ability to configure
>> the memlock limit and a few others on the execd level, i.e. via the
>> 'execd_params' cluster config setting.
>> Full background: in the SGE queue config you can configure most but not all
>> Unix resource limits, like CPU time, max. virtual memory and so on. There
>> are a few others, like the maximum file descriptor limit, which exist on
>> all OS'es, and some which exist on just one or a few OS'es (like the memlock
>> limit). It was too late to extend the queue configuration, and we found the
>> workaround of configuring these limits indirectly by hacking the SGE execd
>> startup scripts (this is the chain through which the job inherits such
>> limits if they are not set) too implicit and error-prone; therefore we
>> decided to enable an admin to set such limits via the execd_params setting.
>> It's not a 100% perfect solution: it's an execd setting valid for all jobs
>> running in all queues on that host, and it does not provide a solution to
>> the configured system-wide limits set e.g. in /etc/security/limits.conf on
>> Linux. Nevertheless it's much better than requiring edits to the job scripts
>> or the execd startup scripts, which could get overwritten with an update and
>> would not work if, for testing purposes, the execd is started directly, e.g.
>> in debug mode.
>> For the interested reader, here's an excerpt from the SGE 6.2 sge_conf(5)
>> page which describes the syntax and semantics of these settings:
>> S_DESCRIPTORS, H_DESCRIPTORS, S_MAXPROC, H_MAXPROC,
>> S_MEMORYLOCKED, H_MEMORYLOCKED, S_LOCKS, H_LOCKS
>> Specifies soft and hard resource limits as implemented
>> by the setrlimit(2) system call. See this manual page
>> on your system for more information. These parameters
>> complete the list of limits set by the RESOURCE LIMITS
>> parameter of the queue configuration as described in
>> queue_conf(5). Unlike the resource limits in the queue
>> configuration, these resource limits are set for every
>> job on this execution host. If a value is not specified,
>> the resource limit is inherited from the execution
>> daemon process. Because this would lead to
>> unpredicted results, if only one limit of a resource is
>> set (soft or hard), the corresponding other limit is
>> set to the same value.
>> S_DESCRIPTORS and H_DESCRIPTORS specify a value one
>> greater than the maximum file descriptor number that
>> can be opened by any process of a job.
>> S_MAXPROC and H_MAXPROC specify the maximum number of
>> processes that can be created by the job user on this
>> execution host
>> S_MEMORYLOCKED and H_MEMORYLOCKED specify the maximum
>> number of bytes of virtual memory that may be locked
>> into RAM.
>> S_LOCKS and H_LOCKS specify the maximum number of file
>> locks any process of a job may establish.
>> All of these values can be specified using the multiplier
>> letters k, K, m, M, g and G, see sge_types(1) for
>> So you would simply set
>> execd_params H_MEMORYLOCKED=unlimited
>> to set the soft and hard Linux "memlock" limit to unlimited.
>> On OS'es which do not support one of these limits the setting will be
>> silently ignored.
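
One way to confirm that such a setting actually reached a job is to print both limits from inside the job itself. The following is a minimal sketch, not part of SGE: the function name report_memlock is made up for illustration, and it assumes a shell whose ulimit builtin accepts -l (bash does).

```shell
#!/bin/bash
# report_memlock: print the soft and hard locked-memory limits of the
# current shell (hypothetical helper name, not an SGE command).
# bash reports these values in kilobytes, or as the word "unlimited".
report_memlock() {
    printf 'soft memlock: %s\n' "$(ulimit -S -l)"
    printf 'hard memlock: %s\n' "$(ulimit -H -l)"
}

report_memlock
```

Run from a batch job script, this should show the values the job inherited; the "qrsh ulimit -l" check quoted further down in this thread is the interactive equivalent.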
>> There's still a gotcha: if you use the old interactive job support rather
>> than the new builtin default (qrsh without a command, which calls rlogind;
>> qlogin, which uses the system telnetd; and likely ssh(d)), the SGE
>> setting would get overridden, since those daemons adhere to
>> /etc/security/limits.conf on Linux. They are started after the shepherd
>> has set those limits.
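
To see which system-wide memlock entries those daemons would pick up, the limits file can be scanned directly. This is only a sketch: memlock_entries is a made-up helper name, and it assumes the usual four-column layout of limits.conf (domain, type, item, value) and does not look at any additional files the PAM configuration may include.

```shell
#!/bin/bash
# memlock_entries FILE: print the non-comment lines of a limits.conf-style
# file whose third field (the item) is "memlock".
memlock_entries() {
    awk '$1 !~ /^#/ && $3 == "memlock"' "$1"
}

# Path as named in the thread; do nothing if the file is absent.
conf=/etc/security/limits.conf
if [ -r "$conf" ]; then
    memlock_entries "$conf"
fi
```

No output means no explicit memlock entry was found in that file.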
>> For SGE 6.1 and earlier the best workaround, in my opinion, is to set those
>> limits in the execd startup script. At least this eliminates differing
>> behavior depending on whether the execd is started at system boot time or
>> later by an interactively logged-in root user. As stated above, care has to
>> be taken when the execd startup script is changed, a new execd is installed
>> or the execd is started directly without using the startup script.
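
As a rough sketch of that workaround, the lines below could sit near the top of the execd startup script. The warning text and the choice to continue rather than abort are assumptions of this example, not something the SGE scripts do themselves.

```shell
#!/bin/bash
# Try to raise the locked-memory limit before the daemon is launched.
# Raising the hard limit normally requires root, so warn instead of
# aborting when the call is refused.
if ! ulimit -l unlimited 2>/dev/null; then
    echo "warning: could not raise memlock limit (now: $(ulimit -l))" >&2
fi

# ... the unchanged remainder of the startup script would follow here ...
```

Because the limit is changed in the script's own shell, every process started afterwards from that script inherits it.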
>> On Sun, 20 Jul 2008, John Leidel wrote:
>>> I second Joe's motion. I've done this for quite some time manually by
>>> creating a set of startup/pre/post wrapper scripts such that...
>>> for a in "$SGE_ROOT"/scripts/pre/*; do
>>>     # run each pre-script; 'exec' would replace the shell after the first one
>>>     [ -x "$a" ] && "$a"
>>> done
>>> ....blah blah blah
>>> On Sun, Jul 20, 2008 at 9:43 AM, Joe Landman
>>> <landman at scalableinformatics.com> wrote:
>>>> Hi folks
>>>> On a related note, for this same cluster, we were using Infiniband. One of
>>>> the issues with OpenMPI and SGE is that the maximum locked memory (on
>>>> is set way too low for Infiniband, and it can't lock enough memory. You can
>>>> "fix" this with settings in /etc/security/limits.conf; simply add these
>>>> lines to the file
>>>> * soft memlock unlimited
>>>> * hard memlock unlimited
>>>> However, it appears that this works for running OpenMPI over Infiniband
>>>> by hand, but not through SGE. I found that I needed to insert a
>>>> ulimit -l unlimited
>>>> in the SGE execd run script, right near the top, or
>>>> qrsh ulimit -l
>>>> would always return 32 (kilobytes), and the Infiniband based job wouldn't
>>>> run. I would like to suggest including a line like this in your execd
>>>> startup script.
>>>> For the SGE developers, if you could include an environment
>>>> startup/scripting/tweaking section right before you fire off the main
>>>> sgeexecd process, this could help with other (future) issues like this.
>>>> Might be worth creating an $SGE/execd_environment directory to contain
>>>> scripts/settings we need.
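
A hook of the kind suggested above might look like the following sketch. Nothing like this ships with SGE as far as this thread shows: the function name load_env_dir, the *.sh naming convention and the use of $SGE_ROOT as the base directory are all assumptions made for illustration.

```shell
#!/bin/bash
# load_env_dir DIR: source every readable *.sh file in DIR, in glob
# (alphabetical) order, so that ulimit calls or exported variables in
# those files take effect in the current shell.
load_env_dir() {
    local dir=$1 f
    [ -d "$dir" ] || return 0        # nothing to do if the directory is absent
    for f in "$dir"/*.sh; do
        [ -r "$f" ] || continue      # also skips the literal pattern when nothing matches
        . "$f"
    done
}

# Hypothetical location, following the suggestion in the thread.
load_env_dir "${SGE_ROOT:-/nonexistent}/execd_environment"
```

Sourcing rather than executing the files is deliberate: a ulimit call made in a child process would not affect the shell that later starts the daemon.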