No subject


Wed Jan 12 20:38:46 GMT 2011


message as well):


MCE 5
CPU 0 4 northbridge TSC 31dabe2461ea
ADDR bfc10000
  Northbridge GART error
       bit61 = error uncorrected
  TLB error 'generic transaction, level generic'
STATUS a40000000005001b MCGSTATUS 0

MCE 0
CPU 2 4 northbridge TSC 285e6a6fa94d2
ADDR a5755780
Northbridge Chipkill ECC error
Chipkill ECC syndrome = 20e8
       bit32 = err cpu0
       bit46 = corrected ecc error
       bus error 'local node origin, request didn't time out
       generic read mem transaction
       memory access, level generic'
 STATUS 9474400120080813 MCGSTATUS 0

These seem to be the non-lethal errors. With a kernel panic I doubt that
anything would be left behind except a core file (and there is none).
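
As a rough cross-check of the "non-lethal" reading, the upper flag bits of
the STATUS words can be decoded by hand. The sketch below only looks at the
architectural x86 machine-check flag bits (63 down to 57) and applies them
to the two STATUS values above and to the value from the panic message
quoted further down. The helper names are made up for illustration, and
treating "UC and PCC both set" as the fatal case is a simplifying
assumption, not the kernel's exact policy.

```python
# Architectural flag bits in the upper part of an x86 MCi_STATUS register.
FLAGS = {
    63: "VAL",    # entry is valid
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # MISC register holds extra information
    58: "ADDRV",  # ADDR register holds a valid address
    57: "PCC",    # processor context may be corrupt
}


def decode_status(status):
    """Return the names of the flag bits set in a 64-bit STATUS value."""
    return [name for bit, name in sorted(FLAGS.items(), reverse=True)
            if (status >> bit) & 1]


def looks_fatal(status):
    """Assumed rule of thumb: uncorrected AND processor context corrupt."""
    uc = (status >> 61) & 1
    pcc = (status >> 57) & 1
    return bool(uc and pcc)


if __name__ == "__main__":
    samples = [
        ("mcelog, GART error", 0xa40000000005001b),
        ("mcelog, Chipkill ECC", 0x9474400120080813),
        ("panic message", 0xb200000000070f0f),
    ]
    for label, value in samples:
        flags = ",".join(decode_status(value))
        print(f"{label}: {flags} fatal={looks_fatal(value)}")
```

Under that assumption only the value from the panic message comes out as
fatal; the GART entry has UC set but neither EN nor PCC, and the Chipkill
entry has no UC bit at all, which fits the "corrected ecc error" line.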

 Best, Martin

> -----Original Message-----
> From: Reuti [mailto:reuti at staff.uni-marburg.de] 
> Sent: 31 August 2007 12:24
> To: users at gridengine.sunsource.net
> Subject: Re: [GE users] Nodes dying with "Kernel panic" after 
> upgrade to 6.1u2
> 
> Hi,
> 
> Am 31.08.2007 um 12:24 schrieb Schenker, Martin:
> 
> > Yesterday we upgraded to 6.1u2 from 6.1. Shortly thereafter nodes 
> > started to die with "Kernel panic" messages on the screen:
> >
> >
> > CPU 0: Machine Check Exception:	4 Bank 4: b200000000070f0f
> > TSC 4ab5ba5354b
> > Kernel panic - not syncing: Machine check
> 
> This looks more like a) a kernel problem or b) a hardware problem.
> Were there any upgrades to the kernel besides the SGE upgrade? Which
> Linux and kernel version are you using?
> 
> Is there anything in /var/log/mcelog?
> 
> -- Reuti
> 
> 
> >
> > We've now rolled back to 6.1 and are still testing. No hang-ups so 
> > far... Has anyone seen similar behaviour?
> > We're running AMD64 Opteron (HP DL145) nodes with the
> > sge-6.1-bin-lx24-amd64 code.
> >
> > Best, Martin
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe at gridengine.sunsource.net
> > For additional commands, e-mail: users-help at gridengine.sunsource.net
> >
