[GE users] More startup oddness

Michal Bachorik Michal.Bachorik at Sun.COM
Mon Sep 1 16:09:13 BST 2008



Hi James,

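I think "linux6" should not be an alias for 127.0.0.1; it should map to
the machine's real IP address. With both names on the 127.0.0.1 line, the
local host name resolves to "localhost", which is what the commlib error
is complaining about. Try changing your /etc/hosts to: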

127.0.0.1 localhost 
192.168.1.2 linux6

Of course, replace 192.168.1.2 with the real IP address of your NIC.
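
To check a hosts file for this kind of entry, a small Python sketch along
these lines might help. It is only an illustration: the function name
loopback_aliases is made up here and is not part of Grid Engine, and it
assumes the usual "address name [alias ...]" layout of /etc/hosts.

```python
import ipaddress


def loopback_aliases(hosts_text):
    """Return names (other than localhost/loopback ones) mapped to a loopback address.

    Hypothetical helper for illustration only; not part of Grid Engine.
    """
    aliases = []
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        entry = line.split("#", 1)[0].strip()
        if not entry:
            continue
        address, *names = entry.split()
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # first field is not a plain IP address; skip the line
        if is_loopback:
            # Names such as localhost or ip6-loopback are expected here;
            # anything else is a candidate for the problem described above.
            aliases.extend(
                name for name in names
                if "localhost" not in name and "loopback" not in name
            )
    return aliases


if __name__ == "__main__":
    with open("/etc/hosts") as hosts_file:
        print(loopback_aliases(hosts_file.read()))
```

If "linux6" shows up in the printed list, the host name is still attached
to a loopback address and should be moved to the line with the real IP.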


Regards,
Michal 


James Gibbon wrote:
> Hi,
>
> I'm getting an unusual error on starting the grid engine service
> on the qmaster:
>
>  root@linux6:~# /etc/init.d/sgemaster start
>
>  sge_qmaster didn't start!
>  This is not a qmaster host!
>  Please, check your act_qmaster file!
>
>
> .. so to get a bit more information, 
>
>  root@linux6:/home/pipelines/SunGE# export SGE_ND=1
>  root@linux6:/home/pipelines/SunGE# ./bin/lx24-amd64/sge_qmaster
>  Reading in complex attributes.
>  Reading in execution hosts.
>  Reading in administrative hosts.
>  Reading in submit hosts.
>  Reading in host group entries:
>          Host group entries for group "@allhosts".
>  Reading in usersets:
>          Userset "defaultdepartment".
>          Userset "deadlineusers".
>  Reading in queues:
>          Queue "all.q".
>  error: cannot recreate queue all.q from disk because of unknown host linux6
>  read job database with 0 entries in 0 seconds
>  Reading in users:
>          User "dan".
>  qmaster hard descriptor limit is set to 8192
>  qmaster soft descriptor limit is set to 8192
>  qmaster will use max. 8172 file descriptors for communication
>  qmaster will accept max. 99 dynamic event clients
>  starting up 6.0u8
>  error: commlib error: local host name error (remote destination host name "linux6" is not equal to local resolved host name "localhost")
>  error: can't create job sequence number file "jobseqnum": Permission denied - delaying until next job
>
> .. it hangs at this point. The host is known as 'linux6', and this is
> what's returned by the 'hostname' command. The act_qmaster file contains
> simply 'linux6'.
>
> Why is the startup resolving the hostname as 'localhost'?
> First line of /etc/hosts is:
>
> 127.0.0.1 localhost linux6
>
> .. any suggestions?
>
> Thanks,
> James
>
>





