[GE users] Queue priority strangeness.

udowaechter udo.waechter at uni-osnabrueck.de
Tue Aug 18 12:56:47 BST 2009


The only difference between the queues is:

ikw_longrun: priority 10
ikw:         priority 7

And:
ikw_longrun consists of a subset of the machines; the machines in
ikw_longrun are also in ikw.

Anyway, we have more queues that have a similar setup (subsets of 
machines of other queues).
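A comparison like the one above can be done mechanically rather than by eye. The following is only a minimal sketch: it assumes the text comes from `qconf -sq <queue>` in the same "attribute value" layout as the configs quoted below, with a trailing backslash marking a continued line. The function names are made up for illustration and are not part of Grid Engine.

```python
def parse_queue_config(text):
    """Parse `qconf -sq <queue>`-style text into {attribute: value}.

    Assumption: one attribute per logical line, and a trailing
    backslash continues the value on the next line.
    """
    logical_lines = []
    pending = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("\\"):
            # Continued line: keep accumulating the value.
            pending += line[:-1].strip() + " "
            continue
        logical_lines.append(pending + line)
        pending = ""
    if pending:
        logical_lines.append(pending.strip())

    config = {}
    for line in logical_lines:
        if not line:
            continue
        parts = line.split(None, 1)
        value = parts[1] if len(parts) > 1 else ""
        # Normalise internal whitespace so layout changes are not reported.
        config[parts[0]] = " ".join(value.split())
    return config


def diff_queue_configs(a, b, ignore=("qname",)):
    """Return {attribute: (value_in_a, value_in_b)} for differing attributes."""
    keys = sorted(set(a) | set(b))
    return {
        k: (a.get(k), b.get(k))
        for k in keys
        if k not in ignore and a.get(k) != b.get(k)
    }
```

Fed the two configs quoted below, this should report hostlist, slots and priority as the differing attributes, which matches the description above (a subset of the hosts and a different priority).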

Nice values are honored on all of these queues except ikw_longrun.

Our users do not have shell access to these machines; they are used by
Grid Engine exclusively. None of the job scripts does any renicing; we have
checked that.
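For anyone wanting to verify the nice values on an execution host, here is a minimal sketch. It assumes a Linux `ps` that accepts `-eo pid,ni,user,comm`, and that the job processes can be recognised by their owning user. The helper names and the sample user name are hypothetical.

```python
import subprocess


def list_processes():
    """Return raw `ps` output (assumes a procps-style ps on Linux)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ni,user,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_ps_nice(ps_output):
    """Parse `ps -eo pid,ni,user,comm` text into (pid, nice, user, command) tuples."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        pid, ni, user, comm = fields
        try:
            rows.append((int(pid), int(ni), user, comm.strip()))
        except ValueError:
            # NI may be shown as "-" for some scheduling classes; skip those.
            continue
    return rows


def find_unexpected_nice(rows, expected_nice, users):
    """Return the rows owned by `users` whose nice value differs from expected_nice."""
    return [r for r in rows if r[2] in users and r[1] != expected_nice]


# Example use on an ikw_longrun host (queue priority 10), with a
# hypothetical job owner "someuser":
#   rows = parse_ps_nice(list_processes())
#   print(find_unexpected_nice(rows, 10, {"someuser"}))
```

Any row printed is a process that is not running at the nice value configured as the queue's priority.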

Thanks,
udo.

On 08/18/2009 11:56 AM, adary wrote:
> Is there a difference between the long job and a regular job?
>
> Basically, it's very easy to write a script that will find the process ID of your job and renice it to 0 (it can't go lower if you are not root).
>
> We had issues with our users being clever and doing a similar thing and the only solution was to yell^H^H^H^Htalk to them and ask nicely not to do something similar.
>
>
>
> -----Original Message-----
> From: udowaechter [mailto:udo.waechter at uni-osnabrueck.de]
> Sent: Tuesday, August 18, 2009 12:51 PM
> To: users at gridengine.sunsource.net
> Subject: [GE users] Queue priority strangeness.
>
> Hello,
> I have a strange problem with one of our queues on GE 6.2u3.
>
> We have recently defined a queue containing a subset of our machines
> that are in the main queue. This should be the longrun queue containing
> those machines that are guaranteed to run for long times.
> Anyway, the problem is that the jobs in this queue all run with "nice
> 0", although they should have "nice 10".
>
> All other queues' priority is honored.
>
> How could I further debug this problem? Did anyone else experience this
> problem?
>
>
> Thanks,
> udo.
>
> Here is the config of the two queues:
>
> 1st, working priorities.
>
>
> qname                 ikw
> hostlist              @allhosts_ikw-slots_1 @allhosts_ikw-slots_2 \
>                         @allhosts_ikw-slots_4 @allhosts_ikw-slots_8
> seq_no                0
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:05:00
> priority              7
> min_cpu_interval      00:05:00
> processors            UNDEFINED
> qtype                 BATCH
> ckpt_list             NONE
> pe_list               make
> rerun                 FALSE
> slots                 1,[@allhosts_ikw-slots_1=1],[@allhosts_ikw-slots_2=2], \
>                         [@allhosts_ikw-slots_4=4],[@allhosts_ikw-slots_8=8]
> tmpdir                /work/tmp
> shell                 /bin/bash
> prolog                NONE
> epilog                NONE
> shell_start_mode      posix_compliant
> starter_method        NONE
> suspend_method        NONE
> resume_method         NONE
> terminate_method      NONE
> notify                00:00:60
> owner_list            NONE
> user_lists            GE-users
> xuser_lists           ikw nkg www-data
> subordinate_list      NONE
> complex_values        NONE
> projects              NONE
> xprojects             NONE
> calendar              NONE
> initial_state         enabled
> s_rt                  INFINITY
> h_rt                  INFINITY
> s_cpu                 INFINITY
> h_cpu                 INFINITY
> s_fsize               INFINITY
> h_fsize               INFINITY
> s_data                INFINITY
> h_data                INFINITY
> s_stack               INFINITY
> h_stack               INFINITY
> s_core                INFINITY
> h_core                INFINITY
> s_rss                 INFINITY
> h_rss                 INFINITY
> s_vmem                INFINITY
> h_vmem                INFINITY
>
>
> 2nd queue, nice value not working:
>
>
> qname                 ikw_longrun
> hostlist              @allhosts_ikw_longrun-slots_2 \
>                         @allhosts_ikw_longrun-slots_4 \
>                         @allhosts_ikw_longrun-slots_8
> seq_no                0
> load_thresholds       np_load_avg=1.75
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:05:00
> priority              10
> min_cpu_interval      00:05:00
> processors            UNDEFINED
> qtype                 BATCH
> ckpt_list             NONE
> pe_list               make
> rerun                 FALSE
> slots                 1,[@allhosts_ikw_longrun-slots_2=2], \
>                         [@allhosts_ikw_longrun-slots_4=4], \
>                         [@allhosts_ikw_longrun-slots_8=8]
> tmpdir                /work/tmp
> shell                 /bin/bash
> prolog                NONE
> epilog                NONE
> shell_start_mode      posix_compliant
> starter_method        NONE
> suspend_method        NONE
> resume_method         NONE
> terminate_method      NONE
> notify                00:00:60
> owner_list            NONE
> user_lists            GE-users
> xuser_lists           ikw nkg www-data
> subordinate_list      NONE
> complex_values        NONE
> projects              NONE
> xprojects             NONE
> calendar              NONE
> initial_state         enabled
> s_rt                  INFINITY
> h_rt                  INFINITY
> s_cpu                 INFINITY
> h_cpu                 INFINITY
> s_fsize               INFINITY
> h_fsize               INFINITY
> s_data                INFINITY
> h_data                INFINITY
> s_stack               INFINITY
> h_stack               INFINITY
> s_core                INFINITY
> h_core                INFINITY
> s_rss                 INFINITY
> h_rss                 INFINITY
> s_vmem                INFINITY
> h_vmem                INFINITY
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=212810
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=212828

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
