
Performance test tools

To test RSSD UDisk performance, the fio tool is recommended, together with the libaio I/O engine.

Installation method

Linux: yum install -y fio.x86_64
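On Debian or Ubuntu systems (not covered above), fio is available from the standard repositories and can be installed with apt instead:

```shell
# Debian/Ubuntu equivalent of the yum command above
apt-get install -y fio
```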

fio parameters

  • -direct=1: bypass the page cache and write directly to the disk
  • -iodepth=128: depth of the I/O request queue
  • -rw=write: read/write policy; optional values are randread, randwrite, read, write, randrw
  • -ioengine=libaio: I/O engine; libaio is recommended
  • -bs=4k: I/O block size; 4k, 8k, 16k, etc. can be used
  • -size=200G: size of the file to be tested
  • -numjobs=1: number of threads
  • -runtime=1000: duration of the test, in seconds
  • -group_reporting: display the test results as a summary
  • -name=test: name of the test task
  • -filename=/data/test: path and name of the file to test against

Common test cases

  • Latency performance test
read latency: 
fio -direct=1 -iodepth=1 -rw=read -ioengine=libaio -bs=4k -size=200G -numjobs=1 -runtime=1000 -group_reporting -name=test -filename=/data/test  

write latency: 
fio -direct=1 -iodepth=1 -rw=write -ioengine=libaio -bs=4k -size=200G -numjobs=1 -runtime=1000 -group_reporting -name=test -filename=/data/test
  • Throughput performance test
read throughput: 
fio -direct=1 -iodepth=32 -rw=read -ioengine=libaio -bs=256k -size=200G -numjobs=4 -runtime=1000 -group_reporting -name=test -filename=/data/test 

write throughput:  
fio -direct=1 -iodepth=32 -rw=write -ioengine=libaio -bs=256k -size=200G -numjobs=4 -runtime=1000 -group_reporting -name=test -filename=/data/test
  • IOPS performance test: 4k block size, 4 jobs × iodepth 32, random read/write
read IOPS: 
fio -direct=1 -iodepth=32 -rw=randread  -ioengine=libaio -bs=4k -size=200G -numjobs=4 -runtime=1000 -group_reporting -name=test -filename=/data/test 

write IOPS:   
fio -direct=1 -iodepth=32 -rw=randwrite -ioengine=libaio -bs=4k -size=200G -numjobs=4 -runtime=1000 -group_reporting -name=test -filename=/data/test
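When reading the results of the throughput and IOPS tests above, note that sequential throughput is roughly IOPS multiplied by the block size, which is why the throughput tests use a large 256k block while the IOPS tests use 4k. A quick arithmetic sketch (the numbers are illustrative, not measured):

```shell
iops=4000        # hypothetical IOPS achieved with 256 KiB blocks
bs_kib=256       # block size in KiB
# throughput (MiB/s) = IOPS * block size (KiB) / 1024
throughput=$((iops * bs_kib / 1024))
echo "$throughput"   # 1000
```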

RSSD UDisk performance test

When testing disk performance, both the disk itself and the stress-test conditions play an important role. To make full use of a multi-core, highly concurrent system and stress-test the 1.2 million IOPS an RSSD UDisk can deliver, you can refer to the following script:

numjobs=16          # thread count, should not exceed CPU cores, default 16
iodepth=32          # per-thread I/O queue depth (assumed default; elided in the original)
bs=4k               # I/O block size (assumed default)
rw=randread         # read/write policy (assumed default)
dev_name=vdb        # device to test (assumed default)

if [[ $# == 0 ]]; then
  echo "Default test: `basename $0` $numjobs $iodepth $bs $rw $dev_name"
  echo "Or you can specify parameters:"
  echo "`basename $0` numjobs iodepth bs rw dev_name"
elif [[ $# == 5 ]]; then
  numjobs=$1
  iodepth=$2
  bs=$3
  rw=$4
  dev_name=$5
else
  echo "Parameter number error!"
  echo "`basename $0` numjobs iodepth bs rw dev_name"
  exit 1
fi

nr_cpus=`cat /proc/cpuinfo | grep "processor" | wc -l`
if [ $nr_cpus -lt $numjobs ]; then
  echo "Numjobs is more than cpu cores, exit!"
  exit -1
fi

let nu=$numjobs+1
cpulist=""
for ((i=1;i<10;i++))
do
  # take the i-th CPU of every queue's cpu_list
  list=`cat /sys/block/${dev_name}/mq/*/cpu_list | awk '{if(i<=NF) print $i;}' i="$i" | tr -d ',' | tr '\n' ','`
  if [ -z $list ]; then
    break
  fi
  cpulist=${cpulist}${list}
done

spincpu=`echo $cpulist | cut -d ',' -f 2-${nu}` # cannot use core 0
echo $spincpu
echo $numjobs
echo 2 > /sys/block/${dev_name}/queue/rq_affinity
sleep 5
# execute fio command
fio --ioengine=libaio --runtime=30s --numjobs=${numjobs} --iodepth=${iodepth} --bs=${bs} --rw=${rw} --filename=/dev/${dev_name} --time_based=1 --direct=1 --name=test --group_reporting --cpus_allowed=$spincpu --cpus_allowed_policy=split

Test Instructions

  1. Depending on your test environment, you can pass input parameters to the script; if none are specified, the default test is executed.

  2. Testing the raw device directly will destroy the file system structure on it. If there is already data on the disk, set filename=[a specific file path, for example /mnt/test.image]; if there is no data, you can set filename=[device name, as in this example /dev/vdb].

Script interpretation

Block device parameters

When testing the instance, the command echo 2 > /sys/block/vdb/queue/rq_affinity in the script sets the block device parameter rq_affinity inside the virtual machine instance to 2.

If rq_affinity is 1, then when the block device receives an I/O completion event, the completion is sent back to the CPU group of the vCPU that issued the I/O. Under multi-threaded concurrency, completions may all end up running on a single vCPU, creating a bottleneck that prevents further performance improvement.

If rq_affinity is 2, then when the block device receives an I/O completion event, the completion is executed on the vCPU that originally issued the I/O. Under multi-threaded concurrency, the performance of every vCPU can be fully utilized.
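The current value can be inspected and changed through sysfs. A minimal sketch, assuming a virtio disk named vdb and root privileges:

```shell
cat /sys/block/vdb/queue/rq_affinity        # show the current value (1 or 2)
echo 2 > /sys/block/vdb/queue/rq_affinity   # complete each I/O on its issuing vCPU
```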

Bind the corresponding vCPU

In normal mode, a device has only one request queue (Request-Queue). Under multi-threaded concurrent I/O, this single Request-Queue becomes a performance bottleneck.

In multi-queue (blk-mq) mode, a device can have multiple request queues for I/O, giving full play to the performance of the back-end storage. If you have four I/O threads, you need to bind the four threads to the CPU cores corresponding to different request queues so that you can take advantage of multi-queue to improve performance.

fio provides the parameters cpus_allowed and cpus_allowed_policy to bind vCPUs. For example, run ls /sys/block/vdb/mq/ to list the QueueIds of the vdb disk, and run cat /sys/block/vdb/mq/$QueueId/cpu_list to check which cpu_core_ids a given queue is bound to.
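The queue-to-CPU mapping logic that the script builds its cpus_allowed list from can be exercised on its own. The sketch below runs the same awk/tr/cut pipeline against a mocked sysfs tree under /tmp (the directory names and CPU numbers are illustrative; on a real machine the files live in /sys/block/<dev>/mq/):

```shell
# mock two hardware queues, each served by two cores
mkdir -p /tmp/mq_demo/0 /tmp/mq_demo/1
echo "0, 2" > /tmp/mq_demo/0/cpu_list   # queue 0: cores 0 and 2
echo "1, 3" > /tmp/mq_demo/1/cpu_list   # queue 1: cores 1 and 3

cpulist=""
for ((i=1;i<10;i++)); do
  # take the i-th core of every queue, exactly as the script does
  list=$(cat /tmp/mq_demo/*/cpu_list | awk '{if(i<=NF) print $i;}' i="$i" | tr -d ',' | tr '\n' ',')
  [ -z "$list" ] && break
  cpulist=${cpulist}${list}
done
echo "$cpulist"                                  # 0,1,2,3,
spincpu=$(echo "$cpulist" | cut -d ',' -f 2-3)   # skip core 0, keep 2 cores (numjobs=2 here)
echo "$spincpu"                                  # 1,2
```

The resulting list interleaves the queues (first core of each queue, then second core of each), so consecutive fio jobs land on cores belonging to different request queues.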

Copyright © 2024 SurferCloud All Rights Reserved