[GE users] OpenMPI RLIMIT_MEMLOCK problem

Prentice Bisbal prentice at ias.edu
Wed Dec 3 22:49:53 GMT 2008



I think I found a bug in 6.2. When I set

execd_params H_MEMORYLOCKED=unlimited

and restart sge_execd on all my nodes, I get this:

$ qrsh ulimit -l
4

When I set

execd_params H_MEMORYLOCKED=UNLIMITED,

I get this:

$ qrsh ulimit -l
5

If I set H_MEMORYLOCKED to a numerical value, like this

execd_params H_MEMORYLOCKED=32M

I get the correct result:

$ qrsh ulimit -l
32768
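
For reference, 32768 is the value to expect here, assuming the shell reports
"ulimit -l" in units of 1024 bytes (bash does) and that SGE reads an
uppercase M suffix as 1024*1024 bytes. A minimal sketch of that arithmetic;
to_kib is a hypothetical helper for illustration, not part of SGE:

```shell
# to_kib: hypothetical helper, not part of SGE.
# Converts a size such as 32M (K, M, G read as powers of 1024) or a plain
# byte count into the 1024-byte units that bash's `ulimit -l` prints.
to_kib() {
    case "$1" in
        unlimited|UNLIMITED) echo unlimited ;;
        *K) echo "$(( ${1%K} ))" ;;
        *M) echo "$(( ${1%M} * 1024 ))" ;;
        *G) echo "$(( ${1%G} * 1024 * 1024 ))" ;;
        *)  echo "$(( $1 / 1024 ))" ;;
    esac
}

to_kib 32M   # prints 32768
```

By the same arithmetic, the 4 and 5 returned for the unlimited keyword are
only a few KB, nowhere near an unlimited setting.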

Looks like the unlimited keyword is not defined properly in the code
somewhere. Also, I have to restart sge_execd to get these changes to
take effect. Is that the proper behavior? The documentation I linked to
in my original post didn't explicitly say anything about restarting
sge_execd after tinkering with these settings.
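
Until the unlimited keyword behaves, one way to catch a low limit before
libibverbs complains would be a guard near the top of the job script. This
is only a rough sketch: memlock_ok is a made-up helper, and the 32768
threshold is an arbitrary example, not a recommended value.

```shell
# memlock_ok: hypothetical helper. Succeeds if the locked-memory limit
# (as printed by `ulimit -l`, in 1024-byte units) is "unlimited" or at
# least the given minimum; fails otherwise, including on non-numeric input.
memlock_ok() {
    limit=$1
    min_kib=$2
    [ "$limit" = "unlimited" ] && return 0
    [ "$limit" -ge "$min_kib" ] 2>/dev/null
}

# In a job script, before mpirun (32768 is an arbitrary example threshold):
if ! memlock_ok "$(ulimit -l)" 32768; then
    echo "memlock limit on $(uname -n) is $(ulimit -l) KB, too low" >&2
fi
```

Note that this only checks the node where the job script itself runs, not
the other nodes in the parallel job.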

--
Prentice



Prentice Bisbal wrote:
> I posted this earlier, but it never showed up on the list, probably
> because I unsubscribed before SC08 and forgot to resubscribe. If the first
> attempt shows up, I apologize for posting twice.
> 
> I'm using SGE 6.2 with OpenMPI 1.2.8.
> 
> I just set up OpenMPI tight integration following the instructions here:
> 
> http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
> 
> I then defined my execd_params to set H_MEMORYLOCKED=unlimited as described here:
> 
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=72405
> 
> Unfortunately, when I submit an MPI job, I still get MEMLOCK errors:
> 
> libibverbs: Warning: RLIMIT_MEMLOCK is 11162 bytes.
>     This will severely limit memory registrations.
> libibverbs: Warning: RLIMIT_MEMLOCK is 11199 bytes.
>     This will severely limit memory registrations.
> libibverbs: Warning: RLIMIT_MEMLOCK is 11181 bytes.
>     This will severely limit memory registrations.
> 
> To make sure my parameters took effect, I even stopped and
> restarted sge_execd on every compute node, with no luck.
> 
> Any ideas? Relevant configuration information is below. Please note that
> in qconf I've tried
> 
> 1. execd_params H_MEMORYLOCKED=unlimited
> 2. execd_params S_MEMORYLOCKED=unlimited
> 3. execd_params S_MEMORYLOCKED=unlimited H_MEMORYLOCKED=unlimited
> 
> All 3 had the same result.
> 
> Let me know if additional/complete config information would be helpful.
> I don't want to flood the list with unnecessary config information.
> 
> # qconf -sp orte
> pe_name orte
> slots 512
> user_lists NONE
> xuser_lists NONE
> start_proc_args /bin/true
> stop_proc_args /bin/true
> allocation_rule $fill_up
> control_slaves TRUE
> job_is_first_task FALSE
> urgency_slots min
> accounting_summary FALSE
> 
> # qconf -sq all.q | grep pe_list
> pe_list make orte
> 
> # qconf -sconf | grep execd_params
> execd_params H_MEMORYLOCKED=unlimited
> S_MEMORYLOCKED=unlimited
> 
> $ more xhpl.sh
> #$ -S /bin/bash
> #$ -N xhpl
> #$ -pe orte 512
> #$ -cwd
> #$ -V
> 
> MPI=/usr/local/openmpi/gcc-4.1.2/x86_64/
> PATH=${MPI}/bin:${PATH}
> LD_LIBRARY_PATH=${MPI}/lib
> 
> mpirun ./xhpl
> 


-- 
Prentice

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=90969
