[GE users] nice value, virtual memory limit and cpu time limit not taken into account

fboucher Florent.Boucher at cnrs-imn.fr
Mon Mar 23 07:38:41 GMT 2009



mhanby wrote:
> Could you copy and paste your two queue configurations (using long.q and
> short.q as examples since I don't know your queue names):
>
> qconf -sq long.q
> qconf -sq short.q
>   
Please find below the output for the two queues:
> qname                 long.q
> hostlist              @allhosts
> seq_no                0
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:01:00
> priority              15
> min_cpu_interval      00:05:00
> processors            UNDEFINED
> qtype                 BATCH INTERACTIVE
> ckpt_list             NONE
> pe_list               bigmem hugemem make mpich mpich-1 mpich-2 openmp
> rerun                 FALSE
> slots                 1,[clustus.cluster=0],[n001.cluster=4],[n002.cluster=4], \
>                       [n003.cluster=4],[n004.cluster=4],[n018.cluster=4], \
>                       [n016.cluster=4],[n009.cluster=4],[n005.cluster=4], \
>                       [n012.cluster=4],[n020.cluster=4],[n019.cluster=4], \
>                       [n007.cluster=4],[n008.cluster=4],[n011.cluster=4], \
>                       [n015.cluster=4],[n010.cluster=4],[n013.cluster=4], \
>                       [n014.cluster=4],[n017.cluster=4],[n006.cluster=4]
> tmpdir                /tmp
> shell                 /bin/bash
> prolog                NONE
> epilog                NONE
> shell_start_mode      unix_behavior,[clustus.cluster=unix_behavior], \
>                       [n001.cluster=unix_behavior], \
>                       [n002.cluster=unix_behavior], \
>                       [n003.cluster=unix_behavior], \
>                       [n004.cluster=unix_behavior], \
>                       [n018.cluster=unix_behavior], \
>                       [n016.cluster=unix_behavior], \
>                       [n009.cluster=unix_behavior], \
>                       [n005.cluster=unix_behavior], \
>                       [n012.cluster=unix_behavior], \
>                       [n020.cluster=unix_behavior], \
>                       [n019.cluster=unix_behavior], \
>                       [n007.cluster=unix_behavior], \
>                       [n008.cluster=unix_behavior], \
>                       [n011.cluster=unix_behavior], \
>                       [n015.cluster=unix_behavior], \
>                       [n010.cluster=unix_behavior], \
>                       [n013.cluster=unix_behavior], \
>                       [n014.cluster=unix_behavior], \
>                       [n017.cluster=unix_behavior],[n006.cluster=unix_behavior]
> starter_method        NONE
> suspend_method        SIGTSTP
> resume_method         NONE
> terminate_method      NONE
> notify                00:00:60
> owner_list            NONE
> user_lists            NONE
> xuser_lists           NONE
> subordinate_list      verylong.q
> complex_values        NONE
> projects              NONE
> xprojects             NONE
> calendar              NONE
> initial_state         default
> s_rt                  INFINITY
> h_rt                  INFINITY
> s_cpu                 INFINITY
> h_cpu                 72:00:00
> s_fsize               INFINITY
> h_fsize               INFINITY
> s_data                INFINITY
> h_data                INFINITY
> s_stack               INFINITY
> h_stack               INFINITY
> s_core                INFINITY
> h_core                INFINITY
> s_rss                 INFINITY
> h_rss                 INFINITY
> s_vmem                INFINITY
> h_vmem                3072.00M
> qname                 short.q
> hostlist              @allhosts
> seq_no                2
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:01:00
> priority              0
> min_cpu_interval      00:05:00
> processors            UNDEFINED
> qtype                 BATCH INTERACTIVE
> ckpt_list             NONE
> pe_list               mpich-1
> rerun                 FALSE
> slots                 1,[clustus.cluster=0],[n001.cluster=1],[n002.cluster=1], \
>                       [n003.cluster=1],[n004.cluster=1],[n018.cluster=1], \
>                       [n016.cluster=1],[n009.cluster=1],[n005.cluster=1], \
>                       [n012.cluster=1],[n020.cluster=1],[n019.cluster=1], \
>                       [n007.cluster=1],[n008.cluster=1],[n011.cluster=1], \
>                       [n015.cluster=1],[n010.cluster=1],[n013.cluster=1], \
>                       [n014.cluster=1],[n017.cluster=1],[n006.cluster=1]
> tmpdir                /tmp
> shell                 /bin/bash
> prolog                NONE
> epilog                NONE
> shell_start_mode      unix_behavior,[clustus.cluster=unix_behavior], \
>                       [n001.cluster=unix_behavior], \
>                       [n002.cluster=unix_behavior], \
>                       [n003.cluster=unix_behavior], \
>                       [n004.cluster=unix_behavior], \
>                       [n018.cluster=unix_behavior], \
>                       [n016.cluster=unix_behavior], \
>                       [n009.cluster=unix_behavior], \
>                       [n005.cluster=unix_behavior], \
>                       [n012.cluster=unix_behavior], \
>                       [n020.cluster=unix_behavior], \
>                       [n019.cluster=unix_behavior], \
>                       [n007.cluster=unix_behavior], \
>                       [n008.cluster=unix_behavior], \
>                       [n011.cluster=unix_behavior], \
>                       [n015.cluster=unix_behavior], \
>                       [n010.cluster=unix_behavior], \
>                       [n013.cluster=unix_behavior], \
>                       [n014.cluster=unix_behavior], \
>                       [n017.cluster=unix_behavior],[n006.cluster=unix_behavior]
> starter_method        NONE
> suspend_method        SIGTSTP
> resume_method         NONE
> terminate_method      NONE
> notify                00:00:60
> owner_list            NONE
> user_lists            NONE
> xuser_lists           NONE
> subordinate_list      NONE
> complex_values        NONE
> projects              NONE
> xprojects             NONE
> calendar              NONE
> initial_state         default
> s_rt                  INFINITY
> h_rt                  INFINITY
> s_cpu                 INFINITY
> h_cpu                 01:00:00
> s_fsize               INFINITY
> h_fsize               INFINITY
> s_data                INFINITY
> h_data                INFINITY
> s_stack               INFINITY
> h_stack               INFINITY
> s_core                INFINITY
> h_core                INFINITY
> s_rss                 INFINITY
> h_rss                 INFINITY
> s_vmem                INFINITY
> h_vmem                512M
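As a quick sanity check (just a sketch; the queue name `short.q` and PE name `mpich-1` are the ones shown above, and the script name is made up), the same small job script can be submitted serially and through the PE to compare which limits the job shell actually inherits:

```shell
#!/bin/sh
# check_limits.sh -- print the limits and nice value this job shell sees.
# Serial:   qsub -q short.q check_limits.sh
# Parallel: qsub -q short.q -pe mpich-1 2 check_limits.sh
echo "host: $(hostname)"
echo "nice: $(nice)"          # current scheduling priority of this shell
echo "vmem: $(ulimit -v)"     # virtual memory limit in kB (set from h_vmem)
echo "cpu:  $(ulimit -t)"     # CPU time limit in seconds (set from h_cpu)
```

If the parallel run reports `unlimited` and nice 0 on the slave nodes while the serial run shows the queue's limits, the slave tasks are probably being started outside the SGE shepherd (for example via plain rsh/ssh in a loosely integrated PE), so the queue limits never apply to them.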

> -----Original Message-----
> From: fboucher [mailto:Florent.Boucher at cnrs-imn.fr] 
> Sent: Friday, March 20, 2009 1:18 PM
> To: users at gridengine.sunsource.net
> Subject: [GE users] nice value, virtual memory limit and cpu time limit
> not taken into account
>
> Dear SGE users,
> following the suggestion of Reuti to use queues with higher priority
> instead of subordinated ones, I have created two types of queues.
> One is for long jobs: no CPU limit, can run on the whole cluster, nice
> value is high (low priority), virtual memory limit is 3 GB.
>
> The other is for short jobs: a 1-hour CPU limit, only 12 slots across the
> cluster, nice is 0 (highest priority), virtual memory limit is 500 MB.
>
> This short queue can run together with the long one even if the cluster
> is fully loaded (I accept a maximum of 5 processes per 4 cores on each node).
>
> SGE version is 6.1u3 (not yet updated)
>
> Actually, if I submit a parallel MPI job, the memory limit, nice value,
> and CPU time limit are completely ignored.
> For serial jobs, the CPU and memory limits are enforced correctly.
>
> Any idea where this could come from?
> Regards
> Florent
>
> --
>  -------------------------------------------------------------------------
> | Florent BOUCHER                    |                                    |
> | Institut des Matériaux Jean Rouxel | Mailto:Florent.Boucher at cnrs-imn.fr |
> | 2, rue de la Houssinière           | Phone: (33) 2 40 37 39 24          |
> | BP 32229                           | Fax:   (33) 2 40 37 39 95          |
> | 44322 NANTES CEDEX 3 (FRANCE)      | http://www.cnrs-imn.fr             |
>  -------------------------------------------------------------------------
>
>   


-- 
 -------------------------------------------------------------------------
| Florent BOUCHER                    |                                    |
| Institut des Matériaux Jean Rouxel | Mailto:Florent.Boucher at cnrs-imn.fr |
| 2, rue de la Houssinière           | Phone: (33) 2 40 37 39 24          |
| BP 32229                           | Fax:   (33) 2 40 37 39 95          |
| 44322 NANTES CEDEX 3 (FRANCE)      | http://www.cnrs-imn.fr             |
 -------------------------------------------------------------------------

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=140200

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].



