[GE users] cannot run in PE "smp" because it only offers -2147483648 slots

reuti reuti at staff.uni-marburg.de
Tue Feb 24 21:49:23 GMT 2009


On 24.02.2009 at 22:06, kdoman wrote:

> Sorry, I ran a test case on the development cluster, where each node has
> only two cores. That's why all.q has slots set to 2. This cluster has 64
> nodes and is mostly idle, so I can do whatever I want. The slot count of
> 2 is correct.

Aha. Do "qhost" and "qstat -f" show all nodes online and no queue
disabled? Are any RQS (resource quota sets) in place, i.e. is the
"qquota" output empty?
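
A minimal sketch of the checks asked about above, assuming bash and an SGE environment that has already been sourced. The helper name run_check is made up for illustration. All four commands are read-only, and any command that is not on PATH is skipped instead of failing.

```shell
#!/bin/bash
# Run one read-only check and label its output.
# run_check is a hypothetical helper, not part of Grid Engine.
run_check() {
    local label="$1"; shift
    echo "== ${label}: $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" || echo "   ($1 exited with status $?)"
    else
        echo "   (skipped: $1 not found; source the SGE settings file first)"
    fi
}

run_check "execution hosts reachable" qhost
run_check "queue instance states" qstat -f
run_check "names of defined resource quota sets" qconf -srqsl
run_check "quota usage for the current user" qquota
```

In the "qstat -f" output, the states column is the place to look for disabled or unreachable queue instances.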

-- Reuti

> I'm running GE 6.1u4 on CentOS 5.2.
>
> # qconf -sp smp
> pe_name           smp
> slots             128
> user_lists        NONE
> xuser_lists       NONE
> start_proc_args   /bin/true
> stop_proc_args    /bin/true
> allocation_rule   $pe_slots
> control_slaves    FALSE
> job_is_first_task TRUE
> urgency_slots     min
>
> Simple sleep job (supposedly):
> =====================
> # cat sleep.sh
> #!/bin/bash
>
> #$ -pe smp 2
> #$ -cwd
> #$ -q long.q
> #$ -R y
> sleep 60
>
> I can run a one-liner qsub and still get the same error:
> qsub -cwd -b y -pe smp 2 sleep 60
>
> Thanks!
> K.
>
>
> On Tue, Feb 24, 2009 at 2:49 PM, reuti <reuti at staff.uni-marburg.de> wrote:
>> Hiho,
>>
>> On 24.02.2009 at 21:10, kdoman wrote:
>>
>>> hello list -
>>> I need to submit only one job to one machine even though the machine
>>> has four cores. So I ran the command "qconf -ap smp", set slots
>>> to 1000, saved, and added smp to the queue (via qconf -mq):
>>>
>>> qconf -sp smp:
>>> ==============
>>> pe_name           smp
>>> slots             1000
>>
>> 1000 is of course safe, although (number of nodes) x 4 would do.
>>
>>> user_lists        NONE
>>> xuser_lists       NONE
>>> start_proc_args   /bin/true
>>> stop_proc_args    /bin/true
>>> allocation_rule   $pe_slots
>>> control_slaves    FALSE
>>> job_is_first_task TRUE
>>> urgency_slots     min
>>>
>>> qconf -sq all.q
>>> ==============
>>> qname                 all.q
>>> hostlist              @allhosts
>>> seq_no                0
>>> load_thresholds       np_load_avg=1.75
>>> suspend_thresholds    NONE
>>> nsuspend              1
>>> suspend_interval      00:05:00
>>> priority              0
>>> min_cpu_interval      00:05:00
>>> processors            UNDEFINED
>>> qtype                 BATCH INTERACTIVE
>>> ckpt_list             NONE
>>> pe_list               make mpich mpi orte smp
>>> rerun                 FALSE
>>> slots                 2
>>
>> If all machines have 4 cores, you can just put 4 here. Otherwise, in
>> a heterogeneous cluster, you would need to specify this per node or
>> per hostgroup.
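
As a rough illustration of the per-node or per-hostgroup form, the sketch below composes a value for the "slots" line of a cluster queue: a default followed by bracketed overrides. The helper slots_value, the hostgroup @quadcore and the host node01 are hypothetical names used only for this example.

```shell
#!/bin/bash
# Compose a "slots" value: a default count plus optional
# host or hostgroup overrides, each wrapped in square brackets.
# slots_value is a hypothetical helper, not part of Grid Engine.
slots_value() {
    local value="$1"; shift
    local override
    for override in "$@"; do
        value="${value},[${override}]"
    done
    echo "$value"
}

# Homogeneous cluster where every node has 4 cores.
slots_value 4

# Mixed cluster: 2 slots by default, 4 on the (hypothetical) hostgroup @quadcore.
slots_value 2 "@quadcore=4"
```

The resulting string would go on the "slots" line when editing the queue with "qconf -mq all.q".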
>>
>>> .
>>> .
>>> etc...
>>>
>>> After I submitted the jobs, they all stayed in the 'qw' state.
>>> "qstat -j <job-id>" gave me this:
>>> cannot run in PE "smp" because it only offers -2147483648 slots
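
A side note on the number itself: -2147483648 is the smallest value a signed 32-bit integer can hold. That makes it look like a placeholder or overflowed value rather than a real slot count, although this reading is an assumption and is not confirmed anywhere in this thread. The arithmetic can be checked in the shell:

```shell
#!/bin/bash
# -(2^31) is the minimum of a signed 32-bit integer.
int32_min=$(( -(1 << 31) ))
echo "$int32_min"
```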
>>
>> On which platform / OS / SGE version do you observe this?
>>
>> -- Reuti
>>
>>>
>>> Thanks all.
>>>
>>> ------------------------------------------------------
>>> http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=113694
>>>
>>> To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].
>>

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=113753

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].


