[GE users] OpenMPI and memory limits

reuti reuti at staff.uni-marburg.de
Tue Nov 2 10:43:25 GMT 2010


Hi,

Am 28.10.2010 um 15:11 schrieb fabiomartinelli:

> my OpenMPI 1.4 + SGE 6.2u3 setup works at the basic level: I can submit jobs
> and retrieve results fine.
> 
> Now I'd like to implement this policy: I have an MPI job requesting 16 cores and
> 16 GB of RAM, so I created a queue mpi.q.16 with just 1 slot on each of 16 servers.
> Which h_vmem limit should I enforce on the queue mpi.q.16, or what else should I do?
> I don't want one piece of the MPI computation to use more than 16 GB of RAM
> on a server.

are you running your job on only one machine, using threads, then? Usually you would use a PE with "allocation_rule $pe_slots", which keeps all slots on a single host; SGE then multiplies the per-slot memory request by the number of slots, so you would request "-l h_vmem=1G" (16 slots x 1 GB = 16 GB total on the host).
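For illustration, a minimal sketch of such a setup (the PE name "smp" and the job script name are assumptions, not taken from this thread):

```shell
# Hypothetical PE definition (created with "qconf -ap smp") that keeps
# all slots of a job on one host:
#
#   pe_name            smp
#   slots              999
#   allocation_rule    $pe_slots
#   control_slaves     TRUE

# Request 16 slots and 1 GB per slot. With $pe_slots, all 16 slots land
# on the same host, and SGE multiplies the h_vmem request by the slot
# count: 16 x 1 GB = 16 GB enforced on that host.
qsub -pe smp 16 -l h_vmem=1G job.sh
```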

-- Reuti


> 
> also, during and after the MPI computation, can I retrieve the memory that
> was used, server by server?
> 
> this is the actual conf:
> 
> [root at scigrid ~]# qconf  -sq mpi.q.16
> qname                 mpi.q.16
> hostlist              @infiniband
> seq_no                0
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:05:00
> priority              19
> min_cpu_interval      00:05:00
> processors            UNDEFINED
> qtype                 BATCH INTERACTIVE
> ckpt_list             NONE
> pe_list               mpi mpich2_mpd mpich2_smpd_rsh mvapich2 openmp openmpi pvm
> rerun                 FALSE
> slots                 1
> tmpdir                /tmp
> shell                 /bin/bash
> prolog                NONE
> epilog                NONE
> shell_start_mode      posix_compliant
> starter_method        NONE
> suspend_method        NONE
> resume_method         NONE
> terminate_method      NONE
> notify                00:00:60
> owner_list            NONE
> user_lists            mpi.q.16
> xuser_lists           NONE
> subordinate_list      NONE
> complex_values        NONE
> projects              NONE
> xprojects             NONE
> calendar              NONE
> initial_state         default
> s_rt                  INFINITY
> h_rt                  1080:00:00
> s_cpu                 INFINITY
> h_cpu                 360:00:00
> s_fsize               INFINITY
> h_fsize               INFINITY
> s_data                INFINITY
> h_data                INFINITY
> s_stack               INFINITY
> h_stack               INFINITY
> s_core                INFINITY
> h_core                INFINITY
> s_rss                 INFINITY
> h_rss                 INFINITY
> s_vmem                INFINITY
> h_vmem                16G
> 
> thanks a lot
> Fabio
> 
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=290813
> 
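(On the per-server memory question quoted above: with a tightly integrated PE, SGE writes one accounting record per slave task, so usage can be broken down by execution host. A hedged sketch; the job id 4711 is made up for illustration:)

```shell
# After the job has finished, qacct reads the accounting file. With tight
# PE integration (control_slaves TRUE) each slave task has its own record,
# so hostname and maxvmem can be listed per host:
qacct -j 4711 | egrep 'hostname|maxvmem'

# While the job is still running, "qstat -j" shows the current usage
# (cpu, mem, vmem, maxvmem) reported by the execution daemons:
qstat -j 4711
```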

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=292069
