[GE users] invalid pe job range setting for job

reuti reuti at staff.uni-marburg.de
Fri Nov 6 17:54:06 GMT 2009


Hi,

On 06.11.2009 at 12:05, buudo wrote:

> Dear all,
> my new 6.2u4 is running, but approx. once a day the sge_qmaster
> daemon crashes. The only strange messages I found are in the
> messages file of the qmaster, like this:
> -----------
> 11/06/2009 11:31:30|schedu|t2f|E|invalid pe job range setting for  
> job 166

What is your setting of "slots" in the PE? This is the total number of
slots that can be in use at the same time in this PE, across all
running jobs.

You can get this error when the PE defines "slots 4" but you submit
jobs with "qsub -pe ... 8 ...". This shouldn't crash the qmaster, though.
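
The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea and not Grid Engine's actual code: the helper names parse_pe_range and fits_pe are made up for this example, and only a single range such as "8", "4-8", "4-" or "-8" is handled (no comma-separated lists or step sizes).

```python
def parse_pe_range(spec):
    """Parse a single 'qsub -pe <name> <range>' slot range into (low, high).

    Simplified, hypothetical helper: "8" -> (8, 8), "4-8" -> (4, 8),
    "4-" -> (4, None) meaning no upper bound, "-8" -> (1, 8).
    """
    if "-" not in spec:
        n = int(spec)
        return n, n
    low, high = spec.split("-", 1)
    return (int(low) if low else 1), (int(high) if high else None)


def fits_pe(spec, pe_slots):
    """Return True if the smallest request in the range fits into the
    PE's "slots" value (the PE-wide limit across all running jobs)."""
    low, _ = parse_pe_range(spec)
    return low <= pe_slots


if __name__ == "__main__":
    # PE defined with "slots 4", job submitted with "-pe ... 8"
    print(fits_pe("8", 4))
    # Same PE, but the job accepts anything from 2 to 8 slots
    print(fits_pe("2-8", 4))
```

If the minimum of the requested range is larger than the PE's "slots" value, the job can never be scheduled in that PE, which is the situation meant above.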

-- Reuti


>
> 11/06/2009 11:31:30|schedu|t2f|E|invalid pe job range setting for  
> job 234
> -----------
> parallel jobs are running.
> Does anyone know the reason?
> here are my settings:
> pe_name            openmpi_12
> slots              277
> user_lists         defaultdepartment standard
> xuser_lists        NONE
> start_proc_args    /bin/true
> stop_proc_args     /bin/true
> allocation_rule    $fill_up
> control_slaves     TRUE
> job_is_first_task  FALSE
> urgency_slots      min
> accounting_summary FALSE
> ----------
> qname                 par
> hostlist              @c_12 @c_8
> seq_no                100
> load_thresholds       np_load_avg=1.5
> suspend_thresholds    NONE
> nsuspend              1
> suspend_interval      00:05:00
> priority              0
> min_cpu_interval      00:05:00
> processors            UNDEFINED
> qtype                 NONE
> ckpt_list             NONE
> pe_list               make openmpi_12 vasp_12 vasp_8
> rerun                 FALSE
> slots                 12,[@c_8=8]
> tmpdir                /scratch
> shell                 /bin/bash
> prolog                NONE
> epilog                NONE
> shell_start_mode      unix_behavior
> starter_method        NONE
> suspend_method        NONE
> resume_method         NONE
> terminate_method      NONE
> notify                00:00:60
> owner_list            NONE
> user_lists            defaultdepartment standard
> xuser_lists           NONE
> subordinate_list      NONE
> complex_values        virtual_free=32.0G
> projects              NONE
> xprojects             NONE
> calendar              NONE
> initial_state         default
> s_rt                  INFINITY
> h_rt                  INFINITY
> s_cpu                 INFINITY
> h_cpu                 INFINITY
> s_fsize               INFINITY
> h_fsize               INFINITY
> s_data                INFINITY
> h_data                INFINITY
> s_stack               INFINITY
> h_stack               INFINITY
> s_core                INFINITY
> h_core                INFINITY
> s_rss                 INFINITY
> h_rss                 INFINITY
> s_vmem                INFINITY
> h_vmem                INFINITY
> ---------------
>
> ------------------------------------------------------
> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=225338
>
> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=225425
