[GE users] sge_qmaster 6.2u5 daemon: repeating segfaults

mhanby mhanby at uab.edu
Thu Apr 15 15:13:50 BST 2010


I've noticed that the segfaults always appear following a reboot. We rebooted the sgemaster node 18 hours ago and have had 7 segfaults since.

Prior to the reboot the segfaulting had been dormant for several days at least.

-----Original Message-----
From: mhanby [mailto:mhanby at uab.edu] 
Sent: Monday, April 12, 2010 11:02 AM
To: users at gridengine.sunsource.net
Subject: RE: [GE users] sge_qmaster 6.2u5 daemon: repeating segfaults

Our cluster is CentOS 5.4 (using the slightly older kernel 2.6.18-128.7.1.el5) and the segfaults behave as others have reported.

It'll start core dumping 'out of the blue' and go away just as mysteriously. I've been unable to identify any pattern of job types, users, etc. during the periods of instability.

Mike

-----Original Message-----
From: fx [mailto:d.love at liverpool.ac.uk] 
Sent: Wednesday, April 07, 2010 8:23 AM
To: users at gridengine.sunsource.net
Subject: Re: [GE users] sge_qmaster 6.2u5 daemon: repeating segfaults

dom <marco.donauer at sun.com> writes:

> Hi,
>
> it looks like always SLES os are affected by this issue.

The other reports were on RH 5(-ish).  From the core dumps, I'd be
surprised if it's basically an OS problem.

> Do you know if there is anything happening before this segfault appears?
> Did anything change when the symptom disappears or when it came back?

The behaviour here made me suspicious that it's connected with issue
#3255.  However, with the cluster currently rather empty, I haven't been
able to provoke a crash with test jobs with similar requirements to what
I think were in the system at the time of previous crashes.

-- 
(Dr) Dave Love
"E-Science", Computing Services Department, University of Liverpool
AKA fx at gnu.org

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=252564

To unsubscribe from this discussion, e-mail: [users-unsubscribe at gridengine.sunsource.net].

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=253139


------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=253528



