Introduction
Cumulus Linux and the included Mellanox Hardware Management package (hw-mgmt) rely heavily on information from the BIOS DMI / SMBIOS to determine the system type and hardware configuration.
If this information is wrong (or rather: incompatible), you can have all kinds of funny issues, like the fans constantly running at full speed because the ASIC temperature sensor is not initialized correctly, or the thermal control loop using wrong thresholds or PWM addresses…
I’ve had some cases with OEM switches, like the HPE SN3700cM or some older SN2700s, where the DMI information was not compatible with Cumulus Linux, leading to issues like those described above. I have also had this happen on SN2410 switches after some kind of failed automatic BIOS update through Onyx.
Fixing the DMI information is not exactly trivial, because, as always, it requires some $magic proprietary tools.
Fixing the DMI Information
At least the SN2700 and SN3700 switches have a bog-standard AMI BIOS. Even though this information is exposed via /sys/devices/virtual/dmi/, it does not seem to be writable there.
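For a quick read-only look at what the kernel exposes, a minimal Python sketch follows. It assumes the usual Linux layout with an id/ subdirectory below the path above and attribute files such as product_sku; the post itself only names the parent directory, so treat the file names as assumptions and adjust them if your kernel differs.

```python
from pathlib import Path

# Assumption: the id/ subdirectory and the attribute names below follow the
# common Linux sysfs layout; the post only mentions /sys/devices/virtual/dmi/.
DMI_ID_DIR = Path("/sys/devices/virtual/dmi/id")

# sysfs attribute -> roughly matching dmidecode field
DMI_FIELDS = {
    "sys_vendor": "System Manufacturer",
    "product_name": "System Product Name",
    "product_version": "System Version",
    "product_sku": "System SKU Number",
    "board_vendor": "Base Board Manufacturer",
    "board_name": "Base Board Product Name",
    "board_version": "Base Board Version",
    "chassis_vendor": "Chassis Manufacturer",
    "chassis_version": "Chassis Version",
}


def read_dmi_ids(base=DMI_ID_DIR):
    """Return {attribute: value} for every attribute in DMI_FIELDS that can be read."""
    values = {}
    for name in DMI_FIELDS:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # Attribute missing on this kernel or not readable by this user.
            continue
    return values


if __name__ == "__main__":
    for name, value in read_dmi_ids().items():
        print(f"{DMI_FIELDS[name]}: {value}")
```

This only reads the values; changing them still needs the tool described next.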
To actually change the DMI information, you need the amideefix64.efi tool. It’s not exactly easy to find, but another blog post guided me to the right place. It is available in Lenovo Update Packages, such as m1ujt77usa.zip.
Preparing the USB Stick
Download m1ujt77usa.zip and extract it. Prepare a FAT32-formatted USB stick and copy the amideefix64.efi file and the efi directory to the root of the USB stick.
Booting the USB Stick
Insert the USB stick into the switch and reboot. On Mellanox switches, you can enter the BIOS by pressing Ctrl+B during the boot process. The BIOS password is usually admin.
Changing the DMI Information
Once you’ve booted from the USB stick, you should be in an EFI shell. Usually you have to type FS0: to change to the USB stick. Then you can run the amideefix64.efi tool.
Full Usage
FS0:\> amideefix64.efi
+---------------------------------------------------------------------------+
| AMI Desktop Management Interface Edit Utility v5.27.00.0003 |
| Copyright (c) 1985-2020, American Megatrends International LLC. |
| All rights reserved. Subject to AMI licensing agreement. |
+---------------------------------------------------------------------------+
| Commands: |
| /ALL [FileName] Output SMBIOS string to screen/file. |
| /DMS [FileName] Create configuration file. |
| /DUMPALL [FileName] Output all SMBIOS data to screen/file. |
| /DUMP # [#] ... Read Type # data. |
| Options: |
| /IVN ["String"] Read/Write BIOS vendor name in Type 0. |
| /IV ["String"] Read/Write BIOS version in Type 0. |
| /ID ["String"] Read/Write BIOS release date in Type 0. |
| /SM ["String"] Read/Write System manufacture in Type 1. |
| /SP ["String"] Read/Write System product in Type 1. |
| /SV ["String"] Read/Write System version in Type 1. |
| /SS ["String"] Read/Write System Serial number in Type 1. |
| /SU [16 Bytes] Read/Write System UUID in Type 1. |
| /SU AUTO Generates system UUID automatically and update Type 1. |
| /SK ["String"] Read/Write System SKU number in Type 1. |
| /SF ["String"] Read/Write System family in Type 1. |
| /BM ["String"] Read/Write Baseboard manufacture in Type 2. |
| /BP ["String"] Read/Write Baseboard product in Type 2. |
| /BV ["String"] Read/Write Baseboard version in Type 2. |
| /BS ["String"] Read/Write Baseboard Serial number in Type 2. |
| /BT ["String"] Read/Write Baseboard Asset Tag in Type 2. |
| /BLC ["String"] Read/Write Baseboard Loc. in Chassis in Type 2. |
| /BMH <handle #> ["String"] |
| Read/Write Baseboard manufacture in Type 2. |
| /BPH <handle #> ["String"] |
| Read/Write Baseboard product in Type 2. |
| /BVH <handle #> ["String"] |
| Read/Write Baseboard version in Type 2. |
| /BSH <handle #> ["String"] |
| Read/Write Baseboard Serial number in Type 2. |
| /BTH <handle #> ["String"] |
| Read/Write Baseboard Asset Tag in Type 2. |
| /BLCH <handle #> ["String"] |
| Read/Write Baseboard Loc. in Chassis in Type 2. |
| /CM ["String"] Read/Write Chassis manufacture in Type 3. |
| /CT [8 Bits] Read/Write Chassis type in Type 3. |
| /CV ["String"] Read/Write Chassis version in Type 3. |
| /CS ["String"] Read/Write Chassis Serial number in Type 3. |
| /CA ["String"] Read/Write Chassis Tag number in Type 3. |
| /CO [32 Bits] Read/Write Chassis OEM-defined value in Type 3. |
| /CH [8 Bits] Read/Write Chassis Height in Type 3. |
| /CPC [8 Bits] Read/Write Chassis Power Cords number in Type 3. |
| /CSK ["String"] Read/Write Chassis SKU number in Type 3. |
| /CMH <handle #> ["String"] |
| Read/Write Chassis manufacture in Type 3. |
| /CTH <handle #> [8 bits] |
| Read/Write Chassis type in Type 3. |
| /CVH <handle #> ["String"] |
| Read/Write Chassis version in Type 3. |
| /CSH <handle #> ["String"] |
| Read/Write Chassis Serial number in Type 3. |
| /CAH <handle #> ["String"] |
| Read/Write Chassis Tag number in Type 3. |
| /COH <handle #> [32 bits] |
| Read/Write Chassis OEM-defined value in Type 3. |
| /CHH <handle #> [8 bits] |
| Read/Write Chassis Height in Type 3. |
| /CPCH <handle #> [8 bits] |
| Read/Write Chassis Power Cords number in Type 3. |
| /CSKH <handle #> ["String"] |
| Read/Write Chassis SKU number in Type 3. |
| /PSN ["String"] Read/Write Processor serial number in Type 4. |
| /PAT ["String"] Read/Write Processor asset tag in Type 4. |
| /PPN ["String"] Read/Write Processor part number in Type 4. |
| /PSNH <handle #> ["String"] |
| Read/Write Processor serial number in Type 4. |
| /PATH <handle #> ["String"] |
| Read/Write Processor asset tag in Type 4. |
| /PPNH <handle #> ["String"] |
| Read/Write Processor part number in Type 4. |
| /OS [<Number> <"String">] |
| Read/Write OEM string in Type 11. |
| /SCO [<Number> <"String">] |
| Read/Write Sys. Configuration Op. in Type 12. |
| /PBL <handle #> ["String"] |
| Read/Write Port. Battery Location in Type 22. |
| /PBM <handle #> ["String"] |
| Read/Write Port. Battery Manufacturer in Type 22. |
| /PBD <handle #> ["String"] |
| Read/Write Port. Battery ManuDate in Type 22. |
| /PBS <handle #> ["String"] |
| Read/Write Port. Battery Serial Number in Type 22. |
| /PBN <handle #> ["String"] |
| Read/Write Port. Battery Device Name in Type 22. |
| /PBCH <handle #> [8 Bits] |
| Read/Write Port. Battery Device Chemistry in Type 22. |
| /PBCA <handle #> [16 Bits] |
| Read/Write Port. Battery Design Capacity in Type 22. |
| /PBV <handle #> [16 Bits] |
| Read/Write Port. Battery Design Voltage in Type 22. |
| /PBSV <handle #> ["String"] |
| Read/Write Port. Battery SBDS Ver. Num. in Type 22. |
| /PBE <handle #> [8 Bits] |
| Read/Write Port. Battery Maxmum Error in Type 22. |
| /PBSN <handle #> [16 Bits] |
| Read/Write Port. Battery in SBDS Ser. Num. in Type 22. |
| /PBSD <handle #> [16 Bits] |
| Read/Write Port. Battery in SBDS Manu. Date. in Type 22. |
| /PBSC <handle #> ["String"] |
| Read/Write Port. Battery in SBDS Dev. Chem. in Type 22. |
| /PBCM <handle #> [8 Bits] |
| Read/Write Port. Battery in Design Cap Multi in Type 22. |
| /PBO <handle #> [32 Bits] |
| Read/Write Por. Bat. in OEM-Specific Type 22. |
| /PU <handle #> [8 Bits] |
| Read/Write Power supply unit group in Type 39. |
| /PL <handle #> ["String"] |
| Read/Write Power supply location in Type 39. |
| /PD <handle #> ["String"] |
| Read/Write Power supply device name in Type 39. |
| /PM <handle #> ["String"] |
| Read/Write Power supply manufacturer in Type 39. |
| /PS <handle #> ["String"] |
| Read/Write Power supply serial number in Type 39. |
| /PT <handle #> ["String"] |
| Read/Write Power supply asset tag number in Type 39. |
| /PN <handle #> ["String"] |
| Read/Write Power supply model part number in Type 39. |
| /PR <handle #> ["String"] |
| Read/Write Power supply revision level in Type 39. |
| /PP <handle #> [16 Bits] |
| Read/Write Power supply max power capacity in Type 39. |
| /PC <handle #> [16 Bits] |
| Read/Write Power supply characteristics in Type 39. |
| /PVH <handle #> [16 Bits] |
| Read/Write Power supply voltage probe handle in Type 39. |
| /PDH <handle #> [16 Bits] |
| Read/Write Power supply cooling dev. handle in Type 39. |
| /PCH <handle #> [16 Bits] |
| Read/Write Power supply current probe handle in Type 39. |
+---------------------------------------------------------------------------+
| 1. The expression enclosed by <> means it is a mandatory field. |
| 2. The expression enclosed by [] means it is an optional field. |
| 3. A command without parameter means it is a read command. |
| 4. A command with necessary parameter means it is a write command. |
| 5. The format of BIOS release date is "mm/dd/yyyy". |
+---------------------------------------------------------------------------+
The important bits are:
/SM System manufacturer
/SP System product
/SV System version
/SS System Serial
/SK System SKU number
/SF System family
/BM Baseboard manufacturer
/BP Baseboard product
/BV Baseboard version
/BS Baseboard Serial
/CM Chassis manufacturer
/CV Chassis version
/CS Chassis Serial
/CSK Chassis SKU
So if you want to fix the System SKU, you would run:
FS0:\> amideefix64.efi /SK "MSN2700"
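If several fields need fixing, it can help to prepare the exact command lines on another machine before walking over to the switch. The following Python sketch only formats strings; it does not run anything. The flag list is taken from the "important bits" above, while the helper name build_commands and the decision to reject values containing double quotes are my own assumptions.

```python
# Flags from the "important bits" list; all of them take a quoted string.
STRING_FLAGS = {
    "/SM", "/SP", "/SV", "/SS", "/SK", "/SF",
    "/BM", "/BP", "/BV", "/BS",
    "/CM", "/CV", "/CS", "/CSK",
}


def build_commands(values, tool="amideefix64.efi"):
    """Turn {flag: value} into command lines to type in the EFI shell.

    build_commands is a hypothetical helper, not part of the AMI tool.
    """
    commands = []
    for flag, value in values.items():
        if flag not in STRING_FLAGS:
            raise ValueError(f"unknown or non-string flag: {flag}")
        if '"' in value:
            # Assumption: avoid guessing how the EFI shell escapes quotes.
            raise ValueError(f"value for {flag} must not contain double quotes")
        commands.append(f'{tool} {flag} "{value}"')
    return commands


if __name__ == "__main__":
    wanted = {"/SP": "MSN2700", "/SK": "MSN2700"}
    for line in build_commands(wanted):
        print(line)
```

Printing the lines first makes it easier to double-check each value against a known good unit before writing anything.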
Verifying the Changes
The easiest way to verify is dmidecode.
You want to check:
dmidecode -t1
dmidecode -t2
dmidecode -t3
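To compare these outputs against known good values without eyeballing them, the "Key: Value" lines can be collected into a dictionary. This is a rough Python sketch, not a full dmidecode parser: it assumes a single structure per type (as in the outputs in this post) and skips lines without a value, such as the Features: header. The function names are made up for illustration.

```python
import subprocess


def parse_dmidecode(text):
    """Collect the 'Key: Value' lines of one dmidecode structure into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#") or ":" not in line:
            continue  # comments, handle lines, section titles
        key, _, value = line.partition(":")
        value = value.strip()
        if value:  # skip headers such as "Features:"
            fields[key.strip()] = value
    return fields


def dmidecode_type(dmi_type):
    """Run dmidecode for one DMI type; needs root, like the commands above."""
    result = subprocess.run(
        ["dmidecode", "-t", str(dmi_type)],
        capture_output=True, text=True, check=True,
    )
    return parse_dmidecode(result.stdout)


if __name__ == "__main__":
    for dmi_type in (1, 2, 3):
        print(dmi_type, dmidecode_type(dmi_type))
```

The resulting dictionaries can then be compared field by field with the reference values listed below.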
Known good values
SN2700
cumulus@cumulus:mgmt:~$ sudo dmidecode -t1
# dmidecode 3.4
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.
Handle 0x0001, DMI type 1, 27 bytes
System Information
Manufacturer: Mellanox Technologies Ltd.
Product Name: MSN2700
Version: B7
Serial Number: MT2025X12345
UUID: 814ffecc-b42c-11ea-8000-1c34daab4740
Wake-up Type: Power Switch
SKU Number: MSN2700
Family: Not Specified
cumulus@cumulus:mgmt:~$ sudo dmidecode -t2
# dmidecode 3.4
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.
Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
Manufacturer: Mellanox Technologies Ltd.
Product Name: VMOD0001
Version: A2
Serial Number: MT2019X12345
Asset Tag: Not Specified
Features:
Board is a hosting board
Board is replaceable
Location In Chassis: Not Specified
Chassis Handle: 0x0003
Type: Motherboard
Contained Object Handles: 0
SN3700C
cumulus@cumulus:mgmt:~$ sudo dmidecode -t1
[sudo] password for cumulus:
# dmidecode 3.4
Getting SMBIOS data from sysfs.
SMBIOS 3.1.1 present.
Handle 0x0001, DMI type 1, 27 bytes
System Information
Manufacturer: Mellanox Technologies Ltd.
Product Name: MSN3700C
Version: A4
Serial Number: M1NJ23H456G
UUID: 7812d1b8-36bb-11ed-8000-900a84a89c00
Wake-up Type: Power Switch
SKU Number: HI116
Family: Not Specified
cumulus@cumulus:mgmt:~$ sudo dmidecode -t2
# dmidecode 3.4
Getting SMBIOS data from sysfs.
SMBIOS 3.1.1 present.
Handle 0x0002, DMI type 2, 15 bytes
Base Board Information
Manufacturer: Mellanox Technologies Ltd.
Product Name: VMOD0005
Version: A4
Serial Number: MT2230J12345
Asset Tag: Not Specified
Features:
Board is a hosting board
Board is replaceable
Location In Chassis: Not Specified
Chassis Handle: 0x0003
Type: Motherboard
Contained Object Handles: 0
cumulus@cumulus:mgmt:~$ sudo dmidecode -t3
# dmidecode 3.4
Getting SMBIOS data from sysfs.
SMBIOS 3.1.1 present.
Handle 0x0003, DMI type 3, 22 bytes
Chassis Information
Manufacturer: Mellanox Technologies Ltd.
Type: Rack Mount Chassis
Lock: Not Present
Version: A4
Serial Number: M1NJ23H456G
Asset Tag: Not Specified
Boot-up State: Safe
Power Supply State: Safe
Thermal State: Safe
Security Status: None
OEM Information: 0x00000000
Height: Unspecified
Number Of Power Cords: 1
Contained Elements: 0
SKU Number: MSN3700C
HPE SN3700cM
HPE SN3700cM switches come with SKU Number: MSN3700C in DMI type 1. You have to correct this with:
amideefix64.efi /SK HI116
Otherwise, the ASIC temperature sensor will not be initialized correctly (the path /var/run/hw-management/thermal/asic1 will not be present), leading to issues with the thermal control loop.
cumulus@cumulus:mgmt:~$ sudo dmidecode -t1 | grep SKU
SKU Number: MSN3700C
cumulus@cumulus:mgmt:~$ sudo smonctl | grep "Asic Temp Sensor"
Temp4 (Asic Temp Sensor ): BAD
This is probably due to Mellanox hw-mgmt /usr/bin/hw_management_sync.py L180ff only containing cases for SKU HI112|HI116|HI136 but not MSN3700C.
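A small Python sketch of a check for this situation on an SN3700C-class switch is shown below. The SKU set is simply the three values named above and may not match every hw-mgmt version; the function names are hypothetical, and the code is not taken from hw_management_sync.py.

```python
from pathlib import Path

# SKUs the post reports as handled; assumption: may differ between hw-mgmt versions.
HANDLED_SKUS = {"HI112", "HI116", "HI136"}

# Path from the post that is missing when the ASIC sensor was not initialized.
ASIC_THERMAL_PATH = Path("/var/run/hw-management/thermal/asic1")


def check_sku(dmidecode_t1_output):
    """Return (sku, handled) based on the 'SKU Number' line of `dmidecode -t1`."""
    for line in dmidecode_t1_output.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.strip() == "SKU Number":
            sku = value.strip()
            return sku, sku in HANDLED_SKUS
    return None, False


def asic_sensor_present(path=ASIC_THERMAL_PATH):
    """True if the hw-management ASIC thermal entry exists."""
    return Path(path).exists()


if __name__ == "__main__":
    import subprocess

    out = subprocess.run(
        ["dmidecode", "-t", "1"], capture_output=True, text=True, check=True
    ).stdout
    sku, handled = check_sku(out)
    print(f"SKU: {sku}, handled: {handled}, asic1 present: {asic_sensor_present()}")
```

On the unfixed HPE unit shown above this would report the SKU MSN3700C as not handled; after writing HI116 with amideefix64.efi and rebooting, both checks should pass.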