[GE users] Transfer queue(s)

Marcel Turcotte turcotte at site.uottawa.ca
Thu Oct 7 21:16:50 BST 2004


Charu,

> The load_avg 99.99 and the state "au" indicates that the sge_execd daemon is
> either down, or else unable to connect to the qmaster (assuming the system
> itself is not having problems).  Please check why this is the case first.

An sge_execd daemon is running on simorgh, with its SGE_CELL pointing
at clicc_cell.
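
For reference, a minimal sketch of how the cell in use and its master host could be checked from a Bourne-style shell. It assumes the usual layout in which $SGE_ROOT/$SGE_CELL/common/act_qmaster names the active qmaster host; the helper name show_cell is made up for illustration, and under csh/tcsh the function would have to live in a small sh script.

```shell
# Hypothetical helper: print the cell in use and the qmaster host it names.
# Assumes the usual layout $SGE_ROOT/<cell>/common/act_qmaster.
# Arguments (both optional): SGE root directory, cell name.
show_cell() {
    root="${1:-$SGE_ROOT}"
    cell="${2:-${SGE_CELL:-default}}"   # "default" is the stock cell name
    file="$root/$cell/common/act_qmaster"
    if [ -r "$file" ]; then
        echo "cell=$cell qmaster=$(cat "$file")"
    else
        echo "cell=$cell qmaster=unknown (cannot read $file)" >&2
        return 1
    fi
}
```

Run with no arguments in the environment that started sge_execd on simorgh, it should show which qmaster that daemon is expected to report to.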

simorgh(599) % qhost
HOSTNAME             ARCH       NPROC  LOAD   MEMTOT   MEMUSE   SWAPTO   SWAPUS
-------------------------------------------------------------------------------
global               -              -     -        -        -        -        -
arash                solaris64      1  0.02   640.0M   179.0M   960.0M      0.0
bijan                solaris64      1     -   640.0M        -   960.0M        -
rostam               solaris64      1  0.05   640.0M   372.0M   960.0M      0.0
simorgh              solaris64      4  0.02    24.0G     3.4G    15.6G      0.0
sohrab               solaris64      1  0.11   640.0M   351.0M   960.0M      0.0
zal                  solaris64      1  0.04   640.0M   195.0M   960.0M      0.0

On homa, SGE_CELL points at homa_cell:

homa(519) % qhost
HOSTNAME             ARCH       NPROC  LOAD   MEMTOT   MEMUSE   SWAPTO   SWAPUS
-------------------------------------------------------------------------------
global               -              -     -        -        -        -        -
amylase              solaris64      1  0.04   640.0M   186.0M   960.0M    27.0M
bossa                solaris64      1  0.03   640.0M   153.0M   960.0M      0.0
enolase              solaris64      1  0.04   640.0M   189.0M   960.0M    20.0M
frevo                solaris64      1  1.06   640.0M   363.0M   960.0M     1.0M
homa                 solaris64      8  0.07    32.0G     3.1G    15.6G      0.0
insulin              solaris64      1  0.24   640.0M   577.0M   960.0M     2.0M
rubisco              solaris64      1  0.05   640.0M   164.0M   960.0M      0.0
salsa                solaris64      1  0.02   640.0M   213.0M   960.0M    19.0M
samba                solaris64      1  0.04   640.0M   170.0M   960.0M    40.0M
simorgh              -              -     -        -        -        -        -
tango                solaris64      1  0.02   640.0M   219.0M   960.0M      0.0
trypsin              solaris64      1  0.04   640.0M   157.0M   960.0M    25.0M
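
To pick out the hosts that report no load in a listing like the one above, a small filter can help. This is only a sketch: it assumes qhost prints the host name in the first column and LOAD in the fourth, as it does here, and the helper name no_load_hosts is made up for illustration (Bourne-style shell).

```shell
# Hypothetical helper: read qhost output on stdin and print the names of
# hosts whose LOAD column is "-". The "global" pseudo-host is skipped;
# header, separator, and short wrapped lines never match the condition.
no_load_hosts() {
    awk 'NF >= 4 && $1 != "global" && $4 == "-" { print $1 }'
}
```

Usage would be `qhost | no_load_hosts`. In the homa_cell listing, simorgh is the only host without values.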

Can this be the source of the problem?

Cheers,

M.


