DIF errors in ESXi log.

Have you ever seen a DIF ERROR in your ESXi logs? If so, it is something you should probably start to worry about.

If you have never heard of DIF (Data Integrity Field), it is an optional feature for disk systems and communications that extends the SCSI standard to provide end-to-end protection of user data. It protects against both media and transmission errors.

DIF extends the disk sector from 512 bytes to 520 bytes. It needs support from every element in the infrastructure (especially storage systems and OS drivers(!)).
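To make the 8 extra bytes concrete, here is a minimal Python sketch of how a protection tuple could be built for one 512-byte sector. It assumes the commonly documented T10 layout (a 2-byte guard tag holding a CRC-16 with polynomial 0x8BB7, a 2-byte application tag and a 4-byte reference tag holding the low 32 bits of the LBA, as in Type 1 protection). The function names are illustrative only and are not taken from any driver.

```python
import struct

T10_DIF_POLY = 0x8BB7  # polynomial of the CRC-16/T10-DIF guard tag


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16/T10-DIF: initial value 0, no reflection, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protect_sector(data: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Return a 520-byte sector: 512 data bytes plus the 8-byte DIF tuple."""
    if len(data) != 512:
        raise ValueError("expected a 512-byte sector")
    guard = crc16_t10dif(data)
    ref_tag = lba & 0xFFFFFFFF  # assumption: Type 1, low 32 bits of the LBA
    # Big-endian: guard (2 bytes), application tag (2 bytes), reference tag (4 bytes)
    return data + struct.pack(">HHI", guard, app_tag, ref_tag)
```

Every component on the path that understands DIF can recompute the guard and compare the tags, which is why a single element that handles them incorrectly is enough to produce errors.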

Normally a new standard is not an issue, until it is introduced through the back door.

I have seen a situation where attaching new storage to the environment was the trigger for bad things to start happening. Another one occurred after an ESXi HBA firmware/driver upgrade. Both were caused by incorrect DIF handling by the HBA card (QLogic).

The first case was similar to the one described here: https://vnote42.net/2020/08/27/esxi-storage-connection-problems-after-installing-a-new-array/

The customer bought a new storage array. Once the array had been set up in the environment and was ready to take over the load, the whole vSphere environment started to behave unpredictably.

Performance was slow, virtual machines started crashing at random, and eventually the ESXi hosts began to freeze at random too.

vmkernel.log contained lots of entries like this:

DIF ERROR in cmd: 0x28 Type=0x0 lba=0xb100 actRefTag=0x1000000, expRefTag=0xb100, actAppTag=0x0, expAppTag=0x0, actGuard=0x400, expGuard=0xa671
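The entry above carries the actual and expected value of each protection tag, so it is easy to see which ones disagree. A minimal Python sketch of such a check follows. The field names come from the log line itself, while the helper names are made up for illustration.

```python
import re

# key=0x... pairs as they appear in the "DIF ERROR" entry
FIELD_RE = re.compile(r"(\w+)=(0x[0-9a-fA-F]+)")
# the SCSI opcode is written as "cmd: 0x..", e.g. 0x28 is READ(10)
CMD_RE = re.compile(r"cmd:\s*(0x[0-9a-fA-F]+)")


def parse_dif_error(line):
    """Return the numeric fields of a DIF ERROR entry, or None for other lines."""
    if "DIF ERROR" not in line:
        return None
    fields = {key: int(value, 16) for key, value in FIELD_RE.findall(line)}
    cmd = CMD_RE.search(line)
    if cmd:
        fields["cmd"] = int(cmd.group(1), 16)
    return fields


def mismatched_tags(fields):
    """List the tags whose actual value differs from the expected one."""
    return [
        tag
        for tag in ("RefTag", "AppTag", "Guard")
        if fields.get("act" + tag) != fields.get("exp" + tag)
    ]


sample = (
    "DIF ERROR in cmd: 0x28 Type=0x0 lba=0xb100 actRefTag=0x1000000, "
    "expRefTag=0xb100, actAppTag=0x0, expAppTag=0x0, actGuard=0x400, expGuard=0xa671"
)
print(mismatched_tags(parse_dif_error(sample)))
```

For the sample entry both the reference tag and the guard differ from what the driver expected, while the application tag matches.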

Please check the KB article: https://kb.vmware.com/s/article/80237

As described in that article, newer QLogic drivers fix these errors. Another solution (if for some reason you can’t upgrade) is to disable T10 DIF at the driver level:

esxcfg-module -s "ql2xt10difvendor=0" qlnativefc

Useful links:

https://kb.vmware.com/s/article/2113956

https://kb.vmware.com/s/article/80237

https://h20195.www2.hpe.com/v2/getpdf.aspx/4aa3-3516enw.pdf

https://en.wikipedia.org/wiki/Data_Integrity_Field

https://www.t10.org/ftp/t10/document.03/03-111r0.pdf
