Channel: VMware Communities: Message List

PowerEdge 2950: ESXi 6.0 boot: VMB: 85 Halting. E1420 CPU BUS PERR.


A PowerEdge 2950 II running VMware ESXi, 6.0.0, 5050593 Image Profile (Updated) ESXi-6.0.0-20170202001-standard has been running without issue for quite some time, and the underlying hardware has had no issues for several years.  Recently, an Intel 350T2V2 NIC was installed and configured for use, then a Dell SAS 6 GB HBA External Controller Card 7RJDT was installed.  Neither installation had a negative impact on system stability.

 

Next, four (4) Crucial 4GB 240-pin 512Mx72 DDR2 PC2-5300 CL5 ECC DIMMs were replaced with eight (8) A-TECH 8G DDR2 PC2-5300 ECC fully buffered DIMMs.  The BIOS memory check passed but seemed to proceed very slowly.  ESXi started to boot but took an extraordinarily long time at the /sb.v00 and /s.v00 steps of the "Loading VMware Hypervisor" stage.  Much later, a message appeared stating "Relocating modules and starting up the kernel...".  Again, a significant amount of time passed.  Then the screen went black and displayed this:

 

VMB: 398: Unexpected exception 2 @0x41800e06957e

VMB: 405: cr0 0x8001003d cr2 0x0 cr3 0x100803000 cr4 0x30

VMB: 407: error code 0x2 rip 0x41800001eee0 cs 0x8

VMB: 409: rflags 0x86 rsp 0x42800001eee0 ss 0x0

VMB: 411: rax 0x12345678 rcx 0x101ffff rdx 0xffff4c000

VMB: 413: rbx 0x0 rbp 0x0 rsi 0x1000

VMB: 415: rdi 0xffff81100004c000 r8 0x2 r9 0x23

VMB: 417: r10 0x8000000000000003 r11 0x0 r12 0xffff4c

VMB: 419: r13 0x420000045221 r14 0xd r15 0x0

VMB: 420: gs 0x10 fs 0x10

VMB: 422: FSbase:0x0 GSbase:0x417rce236200 kernelGSbase:0x0

VMB: 139: [0x42800001eee0] 0x41800e06957e

VMB: 139: [0x42800001ef00] 0x41800e06a0ad

VMB: 139: [0x42800001ef900] 0x41800e814c24

VMB: 139: [0x42800001efc0] 0x41800e000fb8

VMB: 85: Halting.
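For anyone reading the dump above, here is a minimal sketch that decodes the exception number and the cr0/cr4 values into their architectural names. The bit names and vector numbers are the standard x86 ones; the helper name decode_flags is made up for illustration, and the tables only cover a subset of bits. It does not identify the root cause; it only makes the raw values easier to read.

```python
# Sketch only: decode values copied from the VMB dump in this post.
# Vector numbers and bit positions are the standard x86 ones; the tables
# are deliberately partial.

X86_EXCEPTIONS = {
    0: "#DE divide error",
    2: "NMI (non-maskable interrupt)",
    6: "#UD invalid opcode",
    8: "#DF double fault",
    13: "#GP general protection",
    14: "#PF page fault",
    18: "#MC machine check",
}

CR0_BITS = {0: "PE", 1: "MP", 2: "EM", 3: "TS", 4: "ET", 5: "NE",
            16: "WP", 18: "AM", 29: "NW", 30: "CD", 31: "PG"}

CR4_BITS = {0: "VME", 1: "PVI", 2: "TSD", 3: "DE", 4: "PSE", 5: "PAE",
            6: "MCE", 7: "PGE", 8: "PCE", 9: "OSFXSR", 10: "OSXMMEXCPT",
            13: "VMXE"}


def decode_flags(value, names):
    """Return the names of all bits set in value, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if (value >> bit) & 1]


# Values taken from "VMB: 398" and "VMB: 405" above.
print("exception 2:", X86_EXCEPTIONS[2])
print("cr0:", decode_flags(0x8001003D, CR0_BITS))
print("cr4:", decode_flags(0x30, CR4_BITS))
```

Exception 2 is the NMI vector, which fits a hardware-signalled error arriving at the same moment as the front panel CPU BUS PERR message rather than an ordinary software fault.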

 

At the same time, the PowerEdge 2950 front panel LCD switched from blue to amber and reported:

 

  E1420 CPU BUS PERR

 

At this point the system is dead and must be powered off.

 

The RAC System Event Log shows entries like:

 

  Entry 007 of 007

  Severity: Non-Recoverable

  Date and Time: Wed May 10 13:48:12 2017

  Description:

  CPU Bus PERR: Processor sensor, transition to

  non-recoverable was asserted.

 

Dell forums show a flurry of PowerEdge 1950/2950 CPU Bus PERR reports from April-May 2008, but no conclusive resolution was found, although Dell apparently acknowledged an issue at some point and a related RHEL OS patch was issued.  Those reports mention Xeon E5xxx processors, and this system has Xeon E5345 CPUs.  Several posts suggest the issue might be related to virtualization.

 

Various BIOS setting changes suggested in a number of Dell / VMware forum posts have been tested, to no avail.

 

The system successfully boots both the CentOS 7 1503 Live KDE 64-bit and CentOS 6.5 Live KDE 32-bit DVDs, though one gets the impression that the system may be running slowly.

 

One is led to suspect the new DIMMs triggered this situation, but it seems overly hasty to remove a 64GB upgrade and return to a 16GB configuration, since 16GB of RAM is not going to support the VMs planned for this system.  To this end, research continues.
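Since the system does boot the CentOS live DVDs, one way to gather evidence about the new DIMMs from there is to read the kernel's EDAC error counters. The sketch below assumes an EDAC driver for this chipset is loaded in the live environment (not verified on this system) and that the standard sysfs location /sys/devices/system/edac/mc exists. The function name read_edac_counts is made up for illustration.

```python
import glob
import os


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    ce_count = corrected ECC errors, ue_count = uncorrected ECC errors.
    A value of None means the counter file was missing or unreadable.
    """
    counts = {}
    for mc_dir in sorted(glob.glob(os.path.join(base, "mc*"))):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                with open(os.path.join(mc_dir, name)) as fh:
                    entry[name] = int(fh.read().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[os.path.basename(mc_dir)] = entry
    return counts


if __name__ == "__main__":
    result = read_edac_counts()
    if not result:
        # Assumption: an empty result means no EDAC driver is loaded.
        print("No EDAC memory controllers found")
    for mc, entry in result.items():
        print(mc, entry)
```

Non-zero counts after some uptime or a memory-heavy workload would point at the DIMMs; an empty result only means the counters are not available, not that the memory is good.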

