Install Mellanox Network Adapter Drivers on Linux

On older Linux releases (such as CentOS 7.4), some network cards may not be recognized by the drivers shipped with the system, so the drivers have to be installed manually.

Checking the Network Cards

The BMC interface shows 7 network cards installed, with a total of 13 network ports.

You can also see 13 network ports in the system using the lspci command.

[root@localhost ~]# lspci | grep Mellanox
04:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
04:00.1 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
08:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
08:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
2d:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
5b:00.0 Ethernet controller: Mellanox Technologies MT28841
5b:00.1 Ethernet controller: Mellanox Technologies MT28841
5c:00.0 Ethernet controller: Mellanox Technologies MT28841
5c:00.1 Ethernet controller: Mellanox Technologies MT28841
96:00.0 Ethernet controller: Mellanox Technologies MT28841
96:00.1 Ethernet controller: Mellanox Technologies MT28841
97:00.0 Ethernet controller: Mellanox Technologies MT28841
97:00.1 Ethernet controller: Mellanox Technologies MT28841

However, ip add lists only 5 ens network ports; the other 8 are missing, so the driver has to be installed manually.

[root@localhost ~]# ip add
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
2: ens19f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
3: ens19f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
4: ens21f0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
5: ens21f1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
6: ens12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
7: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN qlen 1000
8: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast master virbr0 state DOWN qlen 
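
One way to see which ports lack a driver is to look in sysfs for Mellanox PCI functions that have no kernel driver bound. The following is a minimal sketch, not part of the original procedure: the function name list_unbound_mellanox is made up for illustration, 0x15b3 is the Mellanox PCI vendor ID, and the sketch assumes the missing ports are functions with no driver attached.

```shell
#!/bin/sh
# Print the PCI address of every Mellanox function (vendor ID 0x15b3)
# that has no kernel driver bound.
# $1 (optional): directory to scan, defaults to /sys/bus/pci/devices.
list_unbound_mellanox() {
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        # Skip entries without a readable vendor file.
        [ -r "$dev/vendor" ] || continue
        # Skip devices from other vendors.
        [ "$(cat "$dev/vendor")" = "0x15b3" ] || continue
        # A bound device has a "driver" entry; print only unbound ones.
        [ -e "$dev/driver" ] || basename "$dev"
    done
}
```

Calling list_unbound_mellanox with no argument scans the live system; the optional argument exists only so the function can be pointed at a test directory.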

Installing the Driver

Downloading the Driver

Driver download link:

https://developer.nvidia.com/networking/ethernet-software

The EN version only contains the network card driver, while the OFED version includes both the driver and some supporting software. It is recommended to download the OFED version.

Then, select the corresponding operating system and architecture to download the installation package.

Check the operating system version.

[root@localhost ~]# cat /etc/redhat-release 
CentOS Linux release 7.4.1708 (Core) 
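
The package file names used later in this article embed a distribution tag and an architecture (rhel7.4 and x86_64). As a rough sketch, the tag can be derived from the release string. The function name distro_tag is hypothetical, and the rhelX.Y mapping is only assumed to hold for RHEL-compatible releases such as CentOS.

```shell
#!/bin/sh
# Turn a redhat-release style string into the tag used in the package
# file names, e.g. "CentOS Linux release 7.4.1708 (Core)" -> "rhel7.4".
# Prints nothing if the string has no "release X.Y" part.
distro_tag() {
    printf '%s\n' "$1" |
        sed -n 's/.*release \([0-9][0-9]*\)\.\([0-9][0-9]*\).*/rhel\1.\2/p'
}
```

On a live system this could be called as distro_tag "$(cat /etc/redhat-release)", with uname -m supplying the architecture.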

Installing the EN Driver

Decompressing the Installation Package

[root@localhost Downloads]# ls
mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64.tgz
[root@localhost Downloads]# tar -vxf mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64.tgz 

Running the Installation Program

Enter the directory and execute the installation program.

[root@localhost Downloads]# cd mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64/
[root@localhost mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64]# ls
common_installers.pl  common.pl  create_mlnx_ofed_installers.pl  distro  install  is_kmp_compat.sh  LICENSE  mlnx_add_kernel_support.sh  RPM-GPG-KEY-Mellanox  RPMS  RPMS_ETH  src  uninstall.sh
[root@localhost mlnx-en-23.10-3.2.2.0-rhel7.4-x86_64]# ./install 
Logs dir: /tmp/mlnx-en.55110.logs
General log file: /tmp/mlnx-en.55110.logs/general.log
Verifying KMP rpms compatibility with target kernel...
This program will install the mlnx-en package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with mlnx-en, do not reinstall them.

# Enter 'y' here
Do you want to continue?[y/N]:y

Reloading the New Driver

After the installation is complete, run the command printed by the installer to reload the new driver.

/etc/init.d/mlnx-en.d restart
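
To confirm which driver and version a port is using after the reload, the driver and version lines of ethtool -i can be inspected. This is a small sketch that assumes ethtool is installed; the helper name driver_info is made up for illustration and only filters the text it is given.

```shell
#!/bin/sh
# Read `ethtool -i <interface>` output on stdin and print only the
# driver name and driver version as key=value lines.
driver_info() {
    awk -F': ' '$1 == "driver" || $1 == "version" { printf "%s=%s\n", $1, $2 }'
}
```

A possible invocation is ethtool -i <interface> | driver_info, where <interface> is one of the port names reported by ip add.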

Note that after reloading the driver, the network card names will change, and the corresponding network card configuration files must also be modified.

# For example, ens19f0 becomes ens19f0np0
9: ens19f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
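
The configuration change can be scripted. The sketch below assumes the CentOS 7 convention of one /etc/sysconfig/network-scripts/ifcfg-<name> file per interface, with the name stored in the NAME and DEVICE keys; the function name rename_ifcfg and the bak. backup prefix are choices made for this example, not something the installer provides.

```shell
#!/bin/sh
# Move ifcfg-<old> to ifcfg-<new> and update its NAME/DEVICE keys.
# $1: old interface name, $2: new interface name,
# $3 (optional): config directory, defaults to the CentOS 7 location.
rename_ifcfg() {
    old="$1"
    new="$2"
    dir="${3:-/etc/sysconfig/network-scripts}"
    [ -f "$dir/ifcfg-$old" ] || { echo "no ifcfg-$old in $dir" >&2; return 1; }
    # Keep a backup whose name does not match the ifcfg-* pattern.
    cp -p "$dir/ifcfg-$old" "$dir/bak.ifcfg-$old"
    # Rewrite only lines that are exactly NAME=<old> or DEVICE=<old>,
    # with or without surrounding double quotes.
    sed -e "s/^NAME=\"\{0,1\}$old\"\{0,1\}\$/NAME=$new/" \
        -e "s/^DEVICE=\"\{0,1\}$old\"\{0,1\}\$/DEVICE=$new/" \
        "$dir/ifcfg-$old" > "$dir/ifcfg-$new"
    rm -f "$dir/ifcfg-$old"
}
```

For the example above the call would be rename_ifcfg ens19f0 ens19f0np0. Other keys in the file, such as IP settings, are copied unchanged.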

Checking Network Card Information

Recheck the network ports: all 13 should now be visible, as expected.

9: ens19f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
10: ens19f1np1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
11: ens21f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
12: ens21f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
13: ens12np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
14: ens13f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
15: ens13f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
16: ens14f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
17: ens14f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
18: ens15f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
19: ens15f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
20: ens16f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
21: ens16f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN qlen 1000
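
Counting the ens entries is less error-prone than reading the list by eye. Here is a minimal sketch with a hypothetical helper, count_ens_ports, that counts interface names starting with ens in ip output read from standard input.

```shell
#!/bin/sh
# Count interfaces whose name starts with "ens" in `ip link` or
# `ip addr` output read from stdin. Interface lines look like
# "9: ens19f0np0: <...>", so the name is the second ": "-separated field.
count_ens_ports() {
    awk -F': ' '$2 ~ /^ens/ { n++ } END { print n + 0 }'
}
```

It can be used as ip -o link | count_ens_ports. On the server described in this article the expected result is 13.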

Installing the OFED Driver

After decompressing, execute the installation program.

[root@localhost ~]# tar -vxf MLNX_OFED_LINUX-23.10-3.2.2.0-rhel7.4-x86_64.tgz
[root@localhost ~]# cd MLNX_OFED_LINUX-23.10-3.2.2.0-rhel7.4-x86_64/ 
[root@localhost MLNX_OFED_LINUX-23.10-3.2.2.0-rhel7.4-x86_64]# ./mlnxofedinstall

If dependencies are missing, install them as prompted.

General log file: /tmp/MLNX_OFED_LINUX.74618.logs/general.log
Error: One or more required packages for installing MLNX_OFED_LINUX are missing.
Please install the missing packages using your Linux distribution Package Management tool.
Run:
yum install tcl tk

Then rerun the installer and enter 'y' at the prompt.

[root@localhost MLNX_OFED_LINUX-23.10-3.2.2.0-rhel7.4-x86_64]# ./mlnxofedinstall 
Logs dir: /tmp/MLNX_OFED_LINUX.75823.logs
General log file: /tmp/MLNX_OFED_LINUX.75823.logs/general.log
Verifying KMP rpms compatibility with target kernel...
This program will install the MLNX_OFED_LINUX package on your machine.
Note that all other Mellanox, OEM, OFED, RDMA or Distribution IB packages will be removed.
Those packages are removed due to conflicts with MLNX_OFED_LINUX, do not reinstall them.

Do you want to continue?[y/N]:y

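After the installation is complete, reload the driver with the command printed by the installer. In the run below the firmware update step failed, but the installer still printed the command for loading the new driver.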

Log File: /tmp/Fpnr9q8X6m
Real log file: /tmp/MLNX_OFED_LINUX.75823.logs/fw_update.log
Failed to update Firmware.
See /tmp/MLNX_OFED_LINUX.75823.logs/fw_update.log
To load the new driver, run:
/etc/init.d/openibd restart

Postscript

On Mellanox network cards, the link light stays on after running ip link set <interface> down, whereas on Intel network cards it turns off. The findings from testing and troubleshooting are summarized below:

The KEEP_ETH_LINK_UP configuration item of the network card needs to be turned off.

Mellanox's MFT tools must be downloaded and installed first. They are available from: https://www.mellanox.com/products/adapter-software/firmware-tools

Here is the specific process:

[root@localhost ~]# mst start
[root@localhost ~]# mst status
# In the commands below, **** is the device name reported by mst status
[root@localhost ~]# mlxconfig -d /dev/mst/**** s KEEP_ETH_LINK_UP_P1=0
[root@localhost ~]# mlxconfig -d /dev/mst/**** s KEEP_ETH_LINK_UP_P2=0
[root@localhost ~]# reboot
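
With several cards installed, typing the device name for each one is tedious. The sketch below extracts the /dev/mst/ device paths from mst status output and prints the matching mlxconfig commands without running them. The helper names mst_devices and keep_link_up_cmds are invented for this example, set is the long form of the s command used above, and -y is the mlxconfig option that answers the confirmation prompt. It also assumes every card has two ports; a single-port card has no P2 setting, so that command would need to be dropped.

```shell
#!/bin/sh
# Print every /dev/mst/... device path found in `mst status` output (stdin).
mst_devices() {
    awk '$1 ~ /^\/dev\/mst\// { print $1 }'
}

# For each device path on stdin, print (but do not run) the commands
# that turn KEEP_ETH_LINK_UP off on ports 1 and 2.
keep_link_up_cmds() {
    while read -r dev; do
        for port in P1 P2; do
            printf 'mlxconfig -d %s -y set KEEP_ETH_LINK_UP_%s=0\n' "$dev" "$port"
        done
    done
}
```

Running mst status | mst_devices | keep_link_up_cmds prints the commands for review. After checking them, they can be executed by piping the output to sh, followed by the reboot shown above.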

The reason is that the KEEP_ETH_LINK_UP configuration item on Mellanox network cards is enabled by default.

With this item enabled, the card's PHY keeps the link up as long as the cable is physically connected, even after the interface has been set down in the operating system.

In later lab tests, it was confirmed that after turning off the KEEP_ETH_LINK_UP configuration, the link light goes out after executing the ip link set down command.
