[GE users] 6.2 sgeexecd fails to keep running: can't connect to service

Harry Mangalam harry.mangalam at uci.edu
Tue Nov 18 20:06:48 GMT 2008


I have 2 subclusters (different archs) running under 6.2.  When I try 
to start sgeexecd on subcluster bduc-i32, sgeexecd starts and then 
fails after a minute or so.  The only message I can see is 
in /tmp/execd_messages.nnnnn:

11/18/2008 11:51:32|  main|bduc-i32-16|E|can't connect to service
11/18/2008 11:51:32|  main|bduc-i32-16|E|can't get configuration from qmaster -- backgrounding

The bduc-amd64 subcluster (oddly, the one 'further away' on a public 
IP net) works fine, and the output of qhost shows:

HOSTNAME      ARCH       NCPU  LOAD MEMTOT  MEMUSE SWAPTO SWAPUS
----------------------------------------------------------------
global        -             -     -      -       -      -      -
bduc-amd64-1  lx24-amd64    2  0.00   3.9G  152.0M   1.0G    0.0
bduc-amd64-10 lx24-amd64    2  0.00   2.0G  148.1M   1.0G    0.0
bduc-amd64-11 lx24-amd64    2  0.00   3.9G  149.3M   1.0G    0.0
bduc-amd64-12 lx24-amd64    2  0.00   3.9G  148.6M   1.0G    0.0
bduc-amd64-13 lx24-amd64    2  0.00   3.9G  148.4M   1.0G    0.0
 ...
bduc-i32-10   lx24-x86      2     -   2.0G       -   3.9G      -
bduc-i32-11   lx24-x86      2     -   4.0G       -   3.9G      -
bduc-i32-12   lx24-x86      2     -   4.0G       -   3.9G      -
bduc-i32-13   lx24-x86      2     -   4.0G       -   3.9G      -
bduc-i32-14   lx24-x86      2     -   4.0G       -   3.9G      -

indicating the failure of sgeexecd to run on the i32 nodes.
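
For a longer host list, the same check can be scripted. This is only a rough sketch: it assumes the 8-column qhost layout shown above, where a host whose execd is not reporting shows "-" in the LOAD column. The function name hosts_without_load is made up for illustration.

```python
def hosts_without_load(qhost_output):
    """Return hostnames whose LOAD column is '-' in qhost-style text.

    Assumes the 8-column layout shown above:
    HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS
    """
    silent = []
    for line in qhost_output.splitlines():
        fields = line.split()
        # Skip the header, the dashed separator, elisions, and the
        # 'global' pseudo-host, which always shows '-' everywhere.
        if len(fields) != 8 or fields[0] in ("HOSTNAME", "global"):
            continue
        if fields[3] == "-":
            silent.append(fields[0])
    return silent


if __name__ == "__main__":
    import subprocess
    # Assumes qhost is on PATH on the machine where this runs.
    out = subprocess.run(["qhost"], capture_output=True, text=True).stdout
    print("\n".join(hosts_without_load(out)))
```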

Does this sound like a name resolution problem?  Or something else?  No 
firewalls are involved AFAIK.
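
One way to tell those two cases apart from an i32 node is to test name resolution and the TCP connection separately. The sketch below is not an SGE tool. The function name check_qmaster is made up, "qmaster-host" is a placeholder for whatever name the node uses for the qmaster, and port 6444 is only the commonly registered sge_qmaster port, so the real value should be taken from SGE_QMASTER_PORT or /etc/services.

```python
import socket


def check_qmaster(host, port=6444, timeout=5.0):
    """Return (ip, connected) for the given qmaster host and port.

    ip is None if the name does not resolve; connected is True only
    if a TCP connection to (ip, port) succeeds within the timeout.
    Port 6444 is an assumed default, not taken from this cluster.
    """
    try:
        ip = socket.gethostbyname(host)
    except socket.gaierror:
        # The name did not resolve: points at DNS or /etc/hosts.
        return None, False
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        # Resolved but unreachable: wrong port, routing, or a filter.
        return ip, False


if __name__ == "__main__":
    # "qmaster-host" is a placeholder hostname.
    print(check_qmaster("qmaster-host"))
```

A result of (None, False) would suggest a resolution problem on the node. An address with False would suggest the name resolves but the port cannot be reached from there.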

-- 
Harry Mangalam - Research Computing, NACS, E2148, Engineering Gateway, 
UC Irvine 92697  949 824-0084(o), 949 285-4487(c)
---
Good judgment comes from experience; 
Experience comes from bad judgment. [F. Brooks.]

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=88996
