[GE users] Why is this host selected?

igardais igardais at yahoo.fr
Fri May 29 15:48:44 BST 2009


Hi,

I'm seeing strange behavior from SGE.
A user submitted a job to all.q@@first (and got job ID 30657).
As you can see in the qstat listing below, job 30657 runs on queue instance all.q at br0st136.beicip.fr even though another job (30621) was already running at 100% on it *AND* other, less-loaded nodes were available.

If I display the job's details, it shows that the queue all.q at br0st136.beicip.fr was *not* selected because it was considered overloaded.

Any advice or clues about what is happening?


Thanks,
Ionel


[gardais at br40pc11 ~]$ qstat -f -q all.q@@first
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
all.q at br0st110.beicip.fr       BIP   0/0/2          -NA-     -NA-          au
---------------------------------------------------------------------------------
all.q at br0st118.beicip.fr       BIP   0/0/2          -NA-     lx26-amd64    au
---------------------------------------------------------------------------------
all.q at br0st135.beicip.fr       BIP   0/0/2          1.00     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st136.beicip.fr       BIP   0/2/2          2.17     lx26-amd64    a
  30621 0.50500 MM_NEWH5G  chautru      r     05/28/2009 17:33:28 MASTER        
  30657 0.50500 shell8     molinari     r     05/29/2009 16:14:53 MASTER        
                                                                  SLAVE         
---------------------------------------------------------------------------------
all.q at br0st137.beicip.fr       BIP   0/0/2          0.00     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st138.beicip.fr       BIP   0/0/2          0.08     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st139.beicip.fr       BIP   0/0/2          -NA-     lx26-amd64    au
---------------------------------------------------------------------------------
all.q at br0st140.beicip.fr       BIP   0/0/2          2.02     lx26-amd64    a
---------------------------------------------------------------------------------
all.q at br0st141.beicip.fr       BIP   0/0/2          -NA-     lx26-amd64    au
---------------------------------------------------------------------------------
all.q at br0st148.beicip.fr       BIP   0/0/2          0.44     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st149.beicip.fr       BIP   0/0/2          0.00     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st151.beicip.fr       BIP   0/0/2          0.01     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st152.beicip.fr       BIP   0/0/2          0.01     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st153.beicip.fr       BIP   0/0/2          0.02     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st154.beicip.fr       BIP   0/0/2          -NA-     lx26-amd64    au
---------------------------------------------------------------------------------
all.q at br0st155.beicip.fr       BIP   0/0/2          0.00     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st172.beicip.fr       BIP   0/0/2          0.00     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st173.beicip.fr       BIP   0/0/2          0.03     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br0st174.beicip.fr       BIP   0/0/2          -NA-     lx26-amd64    au
---------------------------------------------------------------------------------
all.q at br0st199.beicip.fr       BIP   0/0/2          -NA-     lx26-amd64    au
---------------------------------------------------------------------------------
all.q at br1st10.beicip.fr        BIP   0/0/2          0.06     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br1st3.beicip.fr         BIP   0/0/2          0.01     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br1st4.beicip.fr         BIP   0/0/2          0.00     lx26-amd64    
---------------------------------------------------------------------------------
all.q at br1st5.beicip.fr         BIP   0/0/2          0.00     lx26-amd64   
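
To make the "less-loaded nodes were available" point easier to check, the listing above can be filtered with a small script. This is only a minimal sketch in Python, assuming the column layout shown above (queue instance, qtype, resv/used/tot., load_avg, arch, optional states) and the raw "queue@host" form that qstat prints (the archive shows "@" as " at "). The names parse_qstat_f and usable_by_load are hypothetical helpers, not part of SGE.

```python
import re

# Matches a queue-instance line of `qstat -f`, for example:
# "all.q@br0st136.beicip.fr  BIP  0/2/2  2.17  lx26-amd64  a"
LINE_RE = re.compile(
    r"^(?P<queue>\S+@\S+)\s+(?P<qtype>\S+)\s+"
    r"(?P<resv>\d+)/(?P<used>\d+)/(?P<total>\d+)\s+"
    r"(?P<load>\S+)\s+(?P<arch>\S+)\s*(?P<states>\S*)\s*$"
)


def parse_qstat_f(text):
    """Return one dict per queue-instance line; other lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # header, separator, and job lines do not match
        load = m.group("load")
        rows.append({
            "queue": m.group("queue"),
            "used": int(m.group("used")),
            "total": int(m.group("total")),
            # "-NA-" means no load value was reported for this host
            "load_avg": None if load == "-NA-" else float(load),
            "states": m.group("states"),
        })
    return rows


def usable_by_load(rows):
    """Instances with free slots, a known load, and no state flags,
    sorted from least to most loaded."""
    ok = [
        r for r in rows
        if r["states"] == ""
        and r["used"] < r["total"]
        and r["load_avg"] is not None
    ]
    return sorted(ok, key=lambda r: r["load_avg"])
```

Feeding it the raw output of "qstat -f -q all.q@@first" and printing usable_by_load(parse_qstat_f(output)) would list the instances without alarm or unreachable flags, least loaded first.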

[gardais at br40pc11 ~]$ qstat -f -j 30657
[...]
scheduling info:            
[...]
                            queue instance "all.q at br0st136.beicip.fr" dropped because it is overloaded: np_load_avg=1.085000 (no load adjustment) >= 0.95
[...]
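
For what it's worth, the np_load_avg value in that message is consistent with the load_avg of 2.17 shown in the listing divided by 2. A minimal sketch of that arithmetic in Python, assuming the host reports 2 processors (an assumption inferred from the 2 slots and from 2.17 / 2 = 1.085, not something the output states directly). The 0.95 threshold is taken from the scheduling info above, and the function names are hypothetical.

```python
def np_load_avg(load_avg, num_proc):
    """Load average normalized by the number of processors."""
    return load_avg / num_proc


def is_overloaded(load_avg, num_proc, threshold):
    """True when the normalized load reaches or exceeds the threshold."""
    return np_load_avg(load_avg, num_proc) >= threshold


# Values from the listing and the scheduling info; num_proc=2 is an assumption.
value = np_load_avg(2.17, 2)
print(f"np_load_avg={value:.6f}")         # np_load_avg=1.085000
print(is_overloaded(2.17, 2, 0.95))       # True
```

Under the same assumption, br0st135 (load_avg 1.00) would come out at 0.5, below the 0.95 threshold.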

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=199697
