Channel: VMware Communities : All Content - All Communities

ESXi 5 I/O Errors (HP P2000 G3 SAS)

I have a support case open for this, but I wanted to throw this out there and see if anyone else has come across anything like this.

 

This is my build

 

  • VMware Version: ESXi 5.0.0, 474610
  • Host Hardware: Cisco UCS C210 M2 (R210-2121605W)
    • Firmware: 1.4 (1c)
    • Proc: Intel Xeon X5650 @ 2.67GHz (x2)
    • Memory: 98GB
    • PCIe SAS Controller: LSI MegaRAID SAS 9280-4i4e
  • SAN: HP P2000 G3 SAS
    • Firmware: TS230P03
    • Disk Slots: 24x 300GB
    • Vdisk1: 1498.4GB (6x disk RAID5)
    • Vdisk2: 1498.4GB (6x disk RAID5)
    • Vdisk3: 1498.4GB (6x disk RAID5)
    • Vdisk4: 599.4GB (6x disk RAID10)

 

The issue we are currently seeing is with I/O. Guest performance degrades to the point that the guests stop responding. The hosts then degrade in turn until they too stop responding, and we are forced to physically power down the server.

 

Once everything comes back up, things start working normally again. But inevitably the cycle will start all over again hours or days later.

 

While monitoring the system after contacting support, we saw continuous resets to the local MegaRAID SAS storage device.

 

MACS02ESX01_MegaRaidAborts.JPG

 

So far we have tried the following:

 

  • Upgraded the MegaRAID SAS controller driver from 4.32 (the default driver installed by ESXi) to 5.34
    • After the host rebooted, it was completely unable to connect to the SAN, so we rolled it back to the previous install
  • Checked Caching
    • Since we don't have Battery Backup Units installed on the SAS Controllers and we have Caching disabled in the Windows Guest OS, we are leaving all caching up to the SAN.
  • Disabled HardwareAcceleratedLocking
    • Advanced Settings > VMFS3 > HardwareAcceleratedLocking set to 0
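
For anyone who wants to apply or check the same locking setting from the command line instead of the vSphere Client, here is a minimal sketch. It assumes the `esxcli system settings advanced` namespace available on ESXi 5.x and the option path `/VMFS3/HardwareAcceleratedLocking`. The helper names `set_cmd` and `list_cmd` are made up for illustration. The script only prints the commands and does not run anything on a host.

```python
import shlex

# Option path matching Advanced Settings > VMFS3 > HardwareAcceleratedLocking.
# Assumption: this is the path esxcli expects on ESXi 5.x.
OPTION = "/VMFS3/HardwareAcceleratedLocking"


def set_cmd(option, value):
    """Build the esxcli argument list that sets an integer advanced option."""
    return ["esxcli", "system", "settings", "advanced", "set",
            "--option", option, "--int-value", str(value)]


def list_cmd(option):
    """Build the esxcli argument list that shows an option's current value."""
    return ["esxcli", "system", "settings", "advanced", "list",
            "--option", option]


if __name__ == "__main__":
    # Print the commands so they can be reviewed and pasted into the ESXi shell.
    for argv in (set_cmd(OPTION, 0), list_cmd(OPTION)):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing rather than executing keeps the change deliberate on a production host; the second command is there to confirm the value after the first has been run.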

 

After reviewing the HP notes on P2000 compatibility with ESXi 4.1, I'm wondering if there are settings on the SAN that we need to change, or if there is something else within ESXi 5 that we need to look at.

 

At this point we are close to having to take one host down and reload it with ESX 4.1 to rule out this being an ESXi 5 issue. Before we go through that, though, I wanted to see if there was any feedback from the community or any recommendations. We are trying to avoid having to bring down a production environment and downgrade the hosts, VM versions, etc.

 

I appreciate any comments.

