[GE users] qlogin put node in Error state

gg3796 gg3796 at yahoo.com
Mon Oct 11 19:04:57 BST 2010



Thanks Reuti:

I am using builtin:

qlogin_command               builtin
qlogin_daemon                builtin
rlogin_command               builtin
rlogin_daemon                builtin
rsh_command                  builtin
rsh_daemon                   builtin

The local configuration for the hosts doesn't contain anything related, only the following 3 lines:
mailer                       /bin/mail
xterm                        /usr/bin/xterm
execd_spool_dir              /var/sge/6.2u3/california/spool/
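
For anyone wanting to repeat this check, here is a minimal sketch of how the settings listed above could be pulled out of the `qconf -sconf` output. The function name `remote_startup_settings` is made up for illustration, and the host name `cluster-1` is taken from the spool message quoted further down.

```shell
# Hypothetical helper (not part of SGE): keep only the interactive
# startup settings from `qconf -sconf` output read on stdin.
remote_startup_settings() {
    grep -E '^(qlogin|rlogin|rsh)_(command|daemon)[[:space:]]'
}

# Usage on a host with the SGE admin tools in PATH (not run here):
#   qconf -sconf           | remote_startup_settings   # global configuration
#   qconf -sconf cluster-1 | remote_startup_settings   # local configuration of one exec host
```

If the second call prints nothing, the host has no local override and the global values apply.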

I can ssh to the hosts without any problem. It was all working well until I upgraded all submit and execution hosts to RHEL 5.4. One thing I would like to mention is that the SGE master is still running RHEL 4.x. Do you think that may be the problem?

Regards,
Babar


________________________________
From: reuti <reuti at staff.uni-marburg.de>
To: users at gridengine.sunsource.net
Sent: Mon, October 11, 2010 2:19:50 AM
Subject: Re: [GE users] qlogin put node in Error state

Hi,

Am 09.10.2010 um 05:02 schrieb gg3796:

> I am running 6.2u3. Since we upgraded our desktops and servers to RHEL 5.4, qlogin puts the exec host into the E state.

What is your startup method for `qlogin` (`qconf -sconf` and/or the local configuration of each exechost)? I would assume that "telnetd" or "telnet" wasn't installed and you are not using -builtin-. NB: "telnetd" can stay disabled in /etc/xinetd.d/telnetd, as SGE will start its own instance of `telnetd`.

-- Reuti



> The only message I see in the exec host spool messages file is:
> 10/08/2010 19:49:35|  main|cluster-1|E|shepherd of job 4456333.1 exited with exit status = 11
>
>
> The job status email has following lines in it:
>
>
> Job 4456333 caused action: Queue "pd.q at cluster-1.xyz.com" set to ERROR
>
> User = babar
>
> Queue = pd.q at c8-1.xyz.com
>
>
> Start Time = <unknown>
>
> End Time = <unknown>
>
> failed before job:10/08/2010 19:49:34 [511:4487]: startup of qrsh job failed:
>
>
>
>
>
> Thanks,
>
> Babar
>
>
>
>
>
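
The spool message line quoted above is pipe-separated (timestamp, thread, host, level, text), so the error-level shepherd entries can be filtered out of a larger messages file. This is only a sketch: the function name `shepherd_errors` is hypothetical, and the messages path in the usage comment assumes the usual `<execd_spool_dir>/<hostname>/messages` layout.

```shell
# Hypothetical helper: print only error-level ("E") shepherd lines from an
# execd messages file (or stdin), as "<timestamp> <host>: <text>".
shepherd_errors() {
    awk -F'|' '$4 == "E" && $5 ~ /^shepherd of job/ { print $1 " " $3 ": " $5 }' "$@"
}

# Usage on the exec host (path is an assumption, not run here):
#   shepherd_errors /var/sge/6.2u3/california/spool/cluster-1/messages
#
# Related commands on a host with the SGE tools in PATH:
#   qstat -f -explain E                  # show why a queue instance is in E state
#   qmod -cq 'pd.q@cluster-1.xyz.com'    # clear the error state once the cause is fixed
```

Clearing the error state does not fix the underlying startup failure; the next qlogin will set the queue to E again until the cause is found.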

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=286475
