[GE users] Help: Load Average Problem

Lee Amy openlinuxsource at gmail.com
Mon Jul 28 15:40:55 BST 2008


Hello,

My cluster has five nodes, each with two dual-core Opteron 270 HE
processors. I configured 19 slots when I set up SGE, and I have run into
two problems that puzzle me.

1. The first node has 3 slots because I use it as the master host. But when
I submit a parallel job with 19 slots through the tightly integrated
parallel environment, the first node runs 4 parallel processes and the last
node runs only 3. I expected the master node to run 3 slave processes and
the other hosts to be full. I use the Open MPI parallel environment.
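
To make the slot arithmetic in item 1 concrete, here is a minimal Python
sketch. The host names node1..node5 are hypothetical. The master's 3 slots
are as described above; 4 slots on each of the other four nodes is what a
total of 19 implies. The observed layout uses the reported counts for the
first and last node and assumes the three middle nodes run 4 processes each.
It is only an illustration, not output from SGE.

```python
# Hypothetical host names. Slot counts: 3 on the master (as described),
# 4 on each other node (inferred from the total of 19).
configured = {"node1": 3, "node2": 4, "node3": 4, "node4": 4, "node5": 4}

# Reported: 4 processes on the first node, 3 on the last.
# Assumed: the three middle nodes run 4 processes each.
observed = {"node1": 4, "node2": 4, "node3": 4, "node4": 4, "node5": 3}


def oversubscribed(configured, observed):
    """Return {host: extra} for hosts running more processes than slots."""
    return {
        host: observed[host] - configured[host]
        for host in configured
        if observed[host] > configured[host]
    }


total_slots = sum(configured.values())
total_procs = sum(observed.values())
extra = oversubscribed(configured, observed)

print("configured slots:", total_slots)
print("observed processes:", total_procs)
print("hosts over their slot count:", extra)
```

Under these assumptions the totals match, so the job got the 19 slots it
asked for; only the placement differs from the per-host slot counts, with
one process too many on the master and one too few on the last node.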

2. Sometimes when I submit a job, it is dispatched to node2, for example.
If I submit another job several minutes later, it is dispatched to node2
again. The qhost command shows that the load average there has reached
2.30. This does not seem normal to me, because I thought SGE should pick a
low-load host when dispatching jobs. By the way, could you tell me which
factors SGE takes into account when it dispatches jobs?
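
As a rough illustration of item 2, a load average of 2.30 on a node with
four cores is about 0.58 per core. The Python sketch below assumes hosts
are ranked by load average divided by core count (SGE reports such a value
as np_load_avg). Whether that is the criterion actually in use on this
cluster depends on the scheduler configuration, which is not shown here.
The host names and every load value except node2's 2.30 are made up.

```python
CORES_PER_NODE = 2 * 2  # two dual-core processors per node, as described


def per_core_load(load_avg, cores=CORES_PER_NODE):
    """Load average normalised by the number of cores."""
    return load_avg / cores


def least_loaded(load_by_host, cores=CORES_PER_NODE):
    """Return the host with the lowest per-core load (assumed ranking rule)."""
    return min(load_by_host, key=lambda host: per_core_load(load_by_host[host], cores))


# node2's value is from the qhost observation; the others are hypothetical.
loads = {"node2": 2.30, "node3": 0.10, "node4": 0.05, "node5": 0.20}

print("node2 per-core load:", round(per_core_load(loads["node2"]), 3))
print("least loaded host under this rule:", least_loaded(loads))
```

With these made-up numbers a pure lowest-load rule would not choose node2,
which is why the repeated dispatch to node2 looks surprising.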

Thank you very much~


Best Regards,

Amy Lee


