[GE users] fix for qmaster crashes

fx d.love at liverpool.ac.uk
Fri Aug 20 17:26:23 BST 2010


This patch fixes the persistent qmaster crashes that I and others have
seen in recent versions.  It has been tested here and has held up for a
few weeks under our job load (mainly tightly-integrated parallel).  I
found it by looking for relevant changes in the repo.

The problem had actually been fixed by one of the post-u5 commits,
despite my being asked to do more-or-less impossible things to debug
it.  That request came after I had spent some time trying to debug it
myself, without guidance on how to instrument the code, which is the
normal approach to tracking down such things.  This is an example of
why we need a free software system, with people outside the Oracle
developers who know their way around the (rather inhospitable)
code-base; count me in to the extent I have time.

I suspect this patch won't make it intact through the mailing list -- I
can't remember what happens -- in which case get it from
<http://www.nw-grid.ac.uk/LivPatches?action=AttachFile&do=get&target=sge-iz3215.diff>
for now.

------------------------------------------------------
http://gridengine.sunsource.net/ds/viewMessage.do?dsForumId=38&dsMessageId=275696

    [ Attachment: "sge-iz3216.diff", Text/X-DIFF, ~1.7 KB; not shown in the archive. ]

