Configuring NFS over RDMA
Environment
- OS: Ubuntu 18.04 LTS
- Kernel: 5.4.0-84-generic
- NIC: Mellanox ConnectX-3 Pro (MT27520)
General system configuration
- Use the Alibaba Cloud (Aliyun) APT mirror
# cat >/etc/apt/sources.list<<EOF
deb http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-security main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-updates main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-backports main restricted universe multiverse
deb http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
deb-src http://mirrors.aliyun.com/ubuntu/ bionic-proposed main restricted universe multiverse
EOF
Update the APT package index
# apt update -y
- Configure kernel parameters
# cat >/etc/sysctl.d/99-common.conf<<EOF
fs.inotify.max_queued_events = 1048576
fs.inotify.max_user_instances = 1048576
fs.inotify.max_user_watches = 1048576
vm.max_map_count = 262144
EOF
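These settings raise the inotify limits and vm.max_map_count so that services do not fail when inotify instances (each of which holds a file descriptor), inotify watches, or memory map areas run out. The file only takes effect after running sysctl --system or rebooting.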
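To confirm that the values in /etc/sysctl.d/99-common.conf are actually live, the file can be compared against the running kernel. The following is a minimal sketch; check_sysctl_conf is a hypothetical helper name, and it assumes every entry is a single-value "key = value" line, as in the file above.

```shell
# check_sysctl_conf FILE
# Hypothetical helper: for each "key = value" line in FILE, compare the
# configured value with the live value reported by `sysctl -n key`.
# Assumes single-value settings (no whitespace inside the value).
check_sysctl_conf() {
    rc=0
    while IFS='=' read -r key want; do
        # Strip whitespace around the key and the value.
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        want=$(printf '%s' "$want" | tr -d '[:space:]')
        # Skip blank lines and comment lines.
        case "$key" in ''|'#'*|';'*) continue ;; esac
        have=$(sysctl -n "$key" 2>/dev/null)
        if [ "$have" = "$want" ]; then
            echo "OK $key = $have"
        else
            echo "MISMATCH $key: want $want, have ${have:-<unset>}"
            rc=1
        fi
    done < "$1"
    return "$rc"
}
```

A typical call would be check_sysctl_conf /etc/sysctl.d/99-common.conf after sysctl --system; a non-zero exit status means at least one value differs.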
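NFS server
Service configuration
Install the packages
# apt -y install nfs-kernel-server nfs-common
Increase the number of NFS server threads: edit /etc/default/nfs-kernel-server, find the RPCNFSDCOUNT entry, and set it to, for example, 16
RPCNFSDCOUNT=16
...<several lines omitted>...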
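The same edit can be scripted instead of made by hand. This is a small sketch, assuming GNU sed (as shipped with Ubuntu); set_nfsd_threads is a hypothetical helper name, not part of nfs-utils.

```shell
# set_nfsd_threads FILE COUNT
# Hypothetical helper: set RPCNFSDCOUNT in FILE to COUNT, replacing an
# existing assignment or appending one if the key is missing.
set_nfsd_threads() {
    file=$1
    count=$2
    if grep -q '^RPCNFSDCOUNT=' "$file"; then
        # Rewrite the existing assignment in place (GNU sed -i).
        sed -i "s/^RPCNFSDCOUNT=.*/RPCNFSDCOUNT=$count/" "$file"
    else
        printf 'RPCNFSDCOUNT=%s\n' "$count" >> "$file"
    fi
}
```

For example, set_nfsd_threads /etc/default/nfs-kernel-server 16. Once the service has been restarted, cat /proc/fs/nfsd/threads reports the number of running nfsd threads.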
Configure the export, for example sharing the /data directory
# cat >/etc/exports<<EOF
/data *(rw,async,crossmnt,insecure,fsid=0,no_auth_nlm,no_subtree_check,no_root_squash,no_all_squash)
EOF
# exportfs -av
exporting *:/data
Restart the service and enable it at boot
# systemctl restart nfs-server
# systemctl enable nfs-server
Enable the RDMA transport
Load the RDMA kernel modules
# modprobe svcrdma # server side
# modprobe xprtrdma # client side
Tell the server to listen on the RDMA transport port.
# echo 'rdma 20049' | tee /proc/fs/nfsd/portlist
# cat /proc/fs/nfsd/portlist
rdma 20049
rdma 20049
udp 2049
tcp 2049
udp 2049
tcp 2049
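Before writing to the portlist, it can be useful to check whether an RDMA listener is already registered. A minimal sketch follows; has_rdma_listener is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a saved copy of the file.

```shell
# has_rdma_listener [PORT] [PORTLIST_FILE]
# Hypothetical helper: succeed if the nfsd portlist contains a line that is
# exactly "rdma PORT". Defaults match the values used in this guide.
has_rdma_listener() {
    port=${1:-20049}
    file=${2:-/proc/fs/nfsd/portlist}
    grep -qx "rdma $port" "$file"
}
```

One possible use: has_rdma_listener || echo 'rdma 20049' > /proc/fs/nfsd/portlist, which only adds the listener when it is missing.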
Integrate RDMA into the NFS systemd unit by editing /lib/systemd/system/nfs-server.service
[Unit]
Description=NFS server and services
DefaultDependencies=no
Requires=network.target proc-fs-nfsd.mount
Requires=nfs-mountd.service
Wants=rpcbind.socket
Wants=nfs-idmapd.service
After=local-fs.target
After=network.target proc-fs-nfsd.mount rpcbind.socket nfs-mountd.service
After=nfs-idmapd.service rpc-statd.service
Before=rpc-statd-notify.service
# GSS services dependencies and ordering
Wants=auth-rpcgss-module.service
After=rpc-gssd.service rpc-svcgssd.service
# start/stop server before/after client
Before=remote-fs-pre.target
Wants=nfs-config.service
After=nfs-config.service
[Service]
EnvironmentFile=-/run/sysconfig/nfs-utils
Type=oneshot
RemainAfterExit=yes
ExecStartPre=/sbin/modprobe xprtrdma
ExecStartPre=/sbin/modprobe svcrdma
ExecStartPre=/usr/sbin/exportfs -r
ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS
ExecStartPost=/bin/bash -c "sleep 3 && echo 'rdma 20049' | tee /proc/fs/nfsd/portlist"
ExecStop=/usr/sbin/rpc.nfsd 0
ExecStopPost=/usr/sbin/exportfs -au
ExecStopPost=/usr/sbin/exportfs -f
ExecReload=/usr/sbin/exportfs -r
[Install]
WantedBy=multi-user.target
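Tip: the two ExecStartPre=/sbin/modprobe lines and the ExecStartPost line are the additions. With them in place, you no longer need to load the kernel modules or configure the RDMA transport port by hand. Run systemctl daemon-reload after editing the unit file so that systemd picks up the change.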
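Files under /lib/systemd/system belong to the package and may be overwritten on upgrade, so the same three directives can instead be kept in a systemd drop-in. The sketch below is an alternative to editing the unit directly, not what the steps above do; write_rdma_dropin and the file name rdma.conf are hypothetical choices.

```shell
# write_rdma_dropin [DIR]
# Hypothetical helper: write a drop-in that appends the RDMA-related
# directives to nfs-server.service without touching the packaged unit.
write_rdma_dropin() {
    dir=${1:-/etc/systemd/system/nfs-server.service.d}
    mkdir -p "$dir" || return 1
    # Quoted delimiter: the body is written literally, with no expansion.
    cat > "$dir/rdma.conf" <<'EOF'
[Service]
ExecStartPre=/sbin/modprobe xprtrdma
ExecStartPre=/sbin/modprobe svcrdma
ExecStartPost=/bin/bash -c "sleep 3 && echo 'rdma 20049' | tee /proc/fs/nfsd/portlist"
EOF
}
```

After calling write_rdma_dropin as root, run systemctl daemon-reload and systemctl restart nfs-server. Directives such as ExecStartPre in a drop-in are appended to those already defined in the unit.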
NFS client
Service configuration
Install the package
# apt -y install nfs-common
Configure kernel parameters
# echo 'sunrpc.tcp_slot_table_entries = 128' >/etc/sysctl.d/99-sunrpc.conf
# sysctl --system
Load the RDMA kernel module
# modprobe xprtrdma # client side
Mount manually
# mount -t nfs 192.168.200.35:/ /mnt/ -o vers=4.1,_netdev,rdma,port=20049,hard,intr,noatime,nodiratime,async,nolock,noacl,sec=sys,noresvport
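For a mount that persists across reboots, an /etc/fstab entry is the usual counterpart to the manual command. The sketch below only generates such a line; nfs_rdma_fstab_line is a hypothetical helper, and the option set is a trimmed-down assumption (proto=rdma is the form that the rdma option shows up as in the mount output) rather than a copy of every option used above.

```shell
# nfs_rdma_fstab_line SERVER EXPORT MOUNTPOINT
# Hypothetical helper: print an fstab line for an NFSv4.1 mount over RDMA
# on port 20049. The option list is an illustrative subset.
nfs_rdma_fstab_line() {
    server=$1
    export_path=$2
    mountpoint=$3
    printf '%s:%s %s nfs vers=4.1,proto=rdma,port=20049,hard,noatime,_netdev 0 0\n' \
        "$server" "$export_path" "$mountpoint"
}
```

For example, nfs_rdma_fstab_line 192.168.200.35 / /mnt prints a line that can be reviewed and then appended to /etc/fstab. The xprtrdma module still has to be loaded before the mount is attempted.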
Check the mount
# mount |grep nfs
nfsd on /proc/fs/nfsd type nfsd (rw,relatime)
192.168.200.35:/ on /mnt type nfs4 (rw,noatime,nodiratime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,noresvport,proto=rdma,port=20049,timeo=600,retrans=2,sec=sys,clientaddr=192.168.200.35,local_lock=none,addr=192.168.200.35,_netdev)
The proto=rdma option in the output shows that the client and server are communicating over RDMA.
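The same check can be automated by looking for proto=rdma in the mount table. A minimal sketch, assuming the standard /proc/mounts layout (device, mount point, type, options); mount_uses_rdma is a hypothetical helper name.

```shell
# mount_uses_rdma MOUNTPOINT [MOUNTS_FILE]
# Hypothetical helper: succeed if MOUNTPOINT is an NFS mount whose option
# list contains proto=rdma. Reads /proc/mounts unless another file is given.
mount_uses_rdma() {
    mnt=$1
    file=${2:-/proc/mounts}
    awk -v mnt="$mnt" '
        # Field 2 is the mount point, 3 the filesystem type, 4 the options.
        $2 == mnt && $3 ~ /^nfs/ && $4 ~ /(^|,)proto=rdma(,|$)/ { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$file"
}
```

For example, mount_uses_rdma /mnt && echo "RDMA in use". Note that /proc/mounts lists the mount point without a trailing slash, so pass /mnt rather than /mnt/.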
FIO benchmark
Run a simple random-read test with fio
# fio --rw=randread --bs=64k --numjobs=4 --iodepth=8 --runtime=30 --time_based --loops=1 --ioengine=libaio --direct=1 --invalidate=1 --fsync_on_close=1 --randrepeat=1 --norandommap --exitall --name task1 --filename=/mnt/1.txt --size=10000000
task1: (g=0): rw=randread, bs=(R) 64.0KiB-64.0KiB, (W) 64.0KiB-64.0KiB, (T) 64.0KiB-64.0KiB, ioengine=libaio, iodepth=8
...
fio-3.1
Starting 4 processes
task1: Laying out IO file (1 file / 9MiB)
fio: native_fallocate call failed: Operation not supported
Jobs: 4 (f=4): [r(4)][100.0%][r=4082MiB/s,w=0KiB/s][r=65.3k,w=0 IOPS][eta 00m:00s]
task1: (groupid=0, jobs=1): err= 0: pid=19205: Sun Nov 28 17:55:08 2021
read: IOPS=15.9k, BW=995MiB/s (1043MB/s)(29.2GiB/30001msec)
slat (usec): min=4, max=17475, avg=22.44, stdev=104.78
clat (usec): min=3, max=24386, avg=477.53, stdev=656.88
lat (usec): min=90, max=24421, avg=500.32, stdev=670.66
clat percentiles (usec):
| 1.00th=[ 143], 5.00th=[ 192], 10.00th=[ 217], 20.00th=[ 260],
| 30.00th=[ 297], 40.00th=[ 334], 50.00th=[ 367], 60.00th=[ 404],
| 70.00th=[ 453], 80.00th=[ 506], 90.00th=[ 627], 95.00th=[ 898],
| 99.00th=[ 3392], 99.50th=[ 4883], 99.90th=[ 8455], 99.95th=[10552],
| 99.99th=[16909]
bw ( KiB/s): min=616064, max=1526272, per=23.83%, avg=1019563.92, stdev=227595.64, samples=60
iops : min= 9626, max=23848, avg=15930.52, stdev=3556.21, samples=60
lat (usec) : 4=0.01%, 50=0.01%, 100=0.06%, 250=17.95%, 500=60.88%
lat (usec) : 750=14.40%, 1000=2.41%
lat (msec) : 2=2.37%, 4=1.19%, 10=0.68%, 20=0.06%, 50=0.01%
cpu : usr=3.84%, sys=25.77%, ctx=384889, majf=0, minf=995
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=477658,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=8
task1: (groupid=0, jobs=1): err= 0: pid=19206: Sun Nov 28 17:55:08 2021
read: IOPS=15.3k, BW=957MiB/s (1004MB/s)(28.0GiB/30001msec)
slat (usec): min=4, max=12737, avg=22.72, stdev=108.28
clat (usec): min=39, max=27395, avg=496.88, stdev=694.83
lat (usec): min=102, max=27421, avg=519.95, stdev=709.23
clat percentiles (usec):
| 1.00th=[ 149], 5.00th=[ 196], 10.00th=[ 221], 20.00th=[ 265],
| 30.00th=[ 306], 40.00th=[ 343], 50.00th=[ 375], 60.00th=[ 416],
| 70.00th=[ 465], 80.00th=[ 519], 90.00th=[ 652], 95.00th=[ 971],
| 99.00th=[ 3523], 99.50th=[ 4883], 99.90th=[ 9241], 99.95th=[11338],
| 99.99th=[16450]
bw ( KiB/s): min=600832, max=1629184, per=22.82%, avg=976299.95, stdev=228428.99, samples=59
iops : min= 9388, max=25456, avg=15254.64, stdev=3569.24, samples=59
lat (usec) : 50=0.01%, 100=0.07%, 250=16.68%, 500=60.13%, 750=15.62%
lat (usec) : 1000=2.66%
lat (msec) : 2=2.63%, 4=1.45%, 10=0.67%, 20=0.08%, 50=0.01%
cpu : usr=3.76%, sys=24.45%, ctx=386910, majf=0, minf=573
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=459455,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=8
task1: (groupid=0, jobs=1): err= 0: pid=19207: Sun Nov 28 17:55:08 2021
read: IOPS=20.4k, BW=1274MiB/s (1335MB/s)(37.3GiB/30001msec)
slat (usec): min=4, max=11251, avg=19.39, stdev=59.26
clat (nsec): min=1970, max=26154k, avg=371278.12, stdev=417542.18
lat (usec): min=91, max=26165, avg=390.93, stdev=425.22
clat percentiles (usec):
| 1.00th=[ 126], 5.00th=[ 165], 10.00th=[ 192], 20.00th=[ 225],
| 30.00th=[ 253], 40.00th=[ 289], 50.00th=[ 318], 60.00th=[ 355],
| 70.00th=[ 388], 80.00th=[ 441], 90.00th=[ 519], 95.00th=[ 611],
| 99.00th=[ 1434], 99.50th=[ 2606], 99.90th=[ 6390], 99.95th=[ 8225],
| 99.99th=[12780]
bw ( MiB/s): min= 765, max= 1805, per=30.48%, avg=1273.51, stdev=272.90, samples=60
iops : min=12244, max=28892, avg=20376.03, stdev=4366.52, samples=60
lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 50=0.01%, 100=0.04%
lat (usec) : 250=29.19%, 500=58.89%, 750=9.12%, 1000=1.13%
lat (msec) : 2=0.93%, 4=0.43%, 10=0.23%, 20=0.03%, 50=0.01%
cpu : usr=4.73%, sys=35.67%, ctx=425780, majf=0, minf=1561
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=611362,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=8
task1: (groupid=0, jobs=1): err= 0: pid=19208: Sun Nov 28 17:55:08 2021
read: IOPS=15.2k, BW=952MiB/s (999MB/s)(27.9GiB/30000msec)
slat (usec): min=3, max=16154, avg=22.48, stdev=100.97
clat (nsec): min=1120, max=39989k, avg=500028.41, stdev=697938.47
lat (usec): min=97, max=39996, avg=522.87, stdev=710.57
clat percentiles (usec):
| 1.00th=[ 155], 5.00th=[ 202], 10.00th=[ 229], 20.00th=[ 273],
| 30.00th=[ 310], 40.00th=[ 343], 50.00th=[ 375], 60.00th=[ 416],
| 70.00th=[ 465], 80.00th=[ 519], 90.00th=[ 652], 95.00th=[ 963],
| 99.00th=[ 3687], 99.50th=[ 5276], 99.90th=[ 8455], 99.95th=[11076],
| 99.99th=[16188]
bw ( KiB/s): min=621787, max=1551647, per=22.82%, avg=976172.42, stdev=181730.02, samples=60
iops : min= 9715, max=24244, avg=15252.50, stdev=2839.44, samples=60
lat (usec) : 2=0.01%, 50=0.01%, 100=0.06%, 250=15.14%, 500=61.80%
lat (usec) : 750=15.53%, 1000=2.70%
lat (msec) : 2=2.54%, 4=1.37%, 10=0.79%, 20=0.07%, 50=0.01%
cpu : usr=3.66%, sys=24.38%, ctx=375132, majf=0, minf=995
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.1%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwt: total=457125,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=8
Run status group 0 (all jobs):
READ: bw=4178MiB/s (4381MB/s), 952MiB/s-1274MiB/s (999MB/s-1335MB/s), io=122GiB (131GB), run=30000-30001msec
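As a sanity check on the summary line, the per-job read bandwidths can be added up from the fio output. The sketch below assumes the fio 3.1 text format shown above and only counts jobs whose bandwidth is reported in MiB/s; sum_job_read_bw is a hypothetical helper name.

```shell
# sum_job_read_bw
# Hypothetical helper: read fio text output on stdin and print the sum of
# the per-job "read: IOPS=..., BW=<n>MiB/s" values. Lines in other units
# (KiB/s, GiB/s) are not matched and therefore not counted.
sum_job_read_bw() {
    sed -n 's/.*read: IOPS=[^,]*, BW=\([0-9.]*\)MiB\/s.*/\1/p' |
        awk '{ total += $1 } END { print total + 0 }'
}
```

Fed the four per-job lines above (995, 957, 1274, and 952 MiB/s), it prints 4178, matching the bw=4178MiB/s in the run status line. Because fio laid out only a 9 MiB file, these reads were probably served from the server's page cache, so the figure mostly reflects the network and NFS path rather than disk throughput.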