[GE users] PE only offers 0 slots?

Ross Dickson Ross.Dickson at dal.ca
Fri Dec 14 17:16:39 GMT 2007


Hi there.

We've added twenty 4-slot nodes to our production cluster, which runs 
Grid Engine 6.0u9.  To validate the new nodes we submitted an 80-slot 
mpich job to techTeam.q, the only queue enabled on those nodes.  The job 
is not getting scheduled and we can't figure out why.  We have 
successfully run smaller mpich jobs that, between them, exercised all 
the new nodes, and qstat -f shows every node available, with the right 
slot counts and no error states.

The only hint we can see is some nonsense from the scheduler.  Here's an 
excerpt from /opt/n1ge6u9/default/common/schedd_runlog:

Fri Dec 14 12:51:55 2007|Job 5625 cannot run in PE "mpich" because it only offers 0 slots
Fri Dec 14 12:51:55 2007|Job 5626 cannot run in PE "mpich" because it only offers 0 slots
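
For anyone who wants to pull these messages out of a longer runlog, here is a minimal Python sketch. It assumes each message sits on a single line in the actual file (the mail client wrapped them above) and that the format is exactly as in the two lines quoted; the names PE_LINE and parse_pe_rejections are made up for this illustration and are not part of Grid Engine.

```python
import re
from collections import Counter

# Matches only the "cannot run in PE" message quoted above.
# Any other kind of runlog line is silently skipped.
PE_LINE = re.compile(
    r'^(?P<when>[^|]+)\|Job (?P<job>\d+) cannot run in PE "(?P<pe>[^"]+)" '
    r'because it only offers (?P<slots>\d+) slots$'
)

def parse_pe_rejections(lines):
    """Return (timestamp, job_id, pe_name, offered_slots) for each matching line."""
    hits = []
    for line in lines:
        m = PE_LINE.match(line.strip())
        if m:
            hits.append((m["when"], int(m["job"]), m["pe"], int(m["slots"])))
    return hits

# The two lines from the excerpt, rejoined onto one line each.
sample = [
    'Fri Dec 14 12:51:55 2007|Job 5625 cannot run in PE "mpich" because it only offers 0 slots',
    'Fri Dec 14 12:51:55 2007|Job 5626 cannot run in PE "mpich" because it only offers 0 slots',
]

rejections = parse_pe_rejections(sample)

# Count how many times each job was turned away by the PE check.
per_job = Counter(job for _, job, _, _ in rejections)
print(per_job)
```

To run it against the real file, pass an open file handle for schedd_runlog in place of sample.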

I get the same from qstat -j on the wedged jobs. 

(1) Why is our scheduler emitting this nonsense about 0 slots?
(2) What do I need to do to figure out why our test job won't run?

Thanks,

-- 
Ross Dickson         Computational Research Consultant
ACEnet               http://www.ace-net.ca
+1 902 494 6710      Skype: ross.m.dickson
