
Performance test tools

To test RSSD UDisk performance, the fio tool is recommended, together with the libaio I/O engine.

Installation method

Linux: yum install -y fio.x86_64
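On Debian or Ubuntu systems (not covered above), fio is available from the standard repositories and can be installed with apt instead:

```shell
# Debian/Ubuntu equivalent of the yum command above
apt-get install -y fio
```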

fio parameters

  • -direct=1: bypass the page cache and write directly to the disk
  • -iodepth=128: depth of the I/O request queue
  • -rw=write: read/write policy; optional values are randread, randwrite, read, write, randrw
  • -ioengine=libaio: I/O engine; libaio is recommended
  • -bs=4k: I/O block size; 4k, 8k, 16k, etc. can be used
  • -size=200G: size of the file to be tested
  • -numjobs=1: number of threads
  • -runtime=1000: duration of the test, in seconds
  • -group_reporting: display the test results as a summary
  • -name=test: name of the test task
  • -filename=/data/test: path and name of the file to test against

Common test cases

  • Latency performance test
read latency: 
fio -direct=1 -iodepth=1 -rw=read -ioengine=libaio -bs=4k -size=200G -numjobs=1 -runtime=1000 -group_reporting -name=test -filename=/data/test  

write latency: 
fio -direct=1 -iodepth=1 -rw=write -ioengine=libaio -bs=4k -size=200G -numjobs=1 -runtime=1000 -group_reporting -name=test -filename=/data/test
  • Throughput performance test
read throughput: 
fio -direct=1 -iodepth=32 -rw=read -ioengine=libaio -bs=256k -size=200G -numjobs=4 -runtime=1000 -group_reporting -name=test -filename=/data/test 

write throughput:  
fio -direct=1 -iodepth=32 -rw=write -ioengine=libaio -bs=256k -size=200G -numjobs=4 -runtime=1000 -group_reporting -name=test -filename=/data/test
  • IOPS performance test: 4k block size, 4 jobs × iodepth 32, random read/write
read IOPS: 
fio -direct=1 -iodepth=32 -rw=randread  -ioengine=libaio -bs=4k -size=200G -numjobs=4 -runtime=1000 -group_reporting -name=test -filename=/data/test 

write IOPS:   
fio -direct=1 -iodepth=32 -rw=randwrite -ioengine=libaio -bs=4k -size=200G -numjobs=4 -runtime=1000 -group_reporting -name=test -filename=/data/test
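When reading the results of the throughput and IOPS tests above, note that sequential throughput is roughly IOPS multiplied by the block size, which is why the throughput tests use a large 256k block while the IOPS tests use 4k. A quick arithmetic sketch (the numbers are illustrative, not measured):

```shell
iops=4000        # hypothetical IOPS achieved with 256 KiB blocks
bs_kib=256       # block size in KiB
# throughput (MiB/s) = IOPS * block size (KiB) / 1024
throughput=$((iops * bs_kib / 1024))
echo "$throughput"   # 1000
```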

RSSD UDisk performance test

When testing disk performance, both the disk itself and the stress-test conditions play an important role. To make full use of a multi-core, highly concurrent system and stress-test the 1.2 million IOPS an RSSD UDisk can deliver, you can refer to the following script:

numjobs=16          # thread count, should not exceed CPU cores, default 16
iodepth=32          # per-thread I/O queue depth (assumed default; elided in the original)
bs=4k               # I/O block size (assumed default)
rw=randread         # read/write policy (assumed default)
dev_name=vdb        # device to test (assumed default)

if [[ $# == 0 ]]; then
  echo "Default test: `basename $0` $numjobs $iodepth $bs $rw $dev_name"
  echo "Or you can specify parameters:"
  echo "`basename $0` numjobs iodepth bs rw dev_name"
elif [[ $# == 5 ]]; then
  numjobs=$1
  iodepth=$2
  bs=$3
  rw=$4
  dev_name=$5
else
  echo "Parameter number error!"
  echo "`basename $0` numjobs iodepth bs rw dev_name"
  exit 1
fi

nr_cpus=`cat /proc/cpuinfo | grep "processor" | wc -l`
if [ $nr_cpus -lt $numjobs ]; then
  echo "Numjobs is more than cpu cores, exit!"
  exit -1
fi

let nu=$numjobs+1
cpulist=""
for ((i=1;i<10;i++))
do
  # take the i-th CPU of every queue's cpu_list
  list=`cat /sys/block/${dev_name}/mq/*/cpu_list | awk '{if(i<=NF) print $i;}' i="$i" | tr -d ',' | tr '\n' ','`
  if [ -z $list ]; then
    break
  fi
  cpulist=${cpulist}${list}
done

spincpu=`echo $cpulist | cut -d ',' -f 2-${nu}` # cannot use core 0
echo $spincpu
echo $numjobs
echo 2 > /sys/block/${dev_name}/queue/rq_affinity
sleep 5
# execute fio command
fio --ioengine=libaio --runtime=30s --numjobs=${numjobs} --iodepth=${iodepth} --bs=${bs} --rw=${rw} --filename=/dev/${dev_name} --time_based=1 --direct=1 --name=test --group_reporting --cpus_allowed=$spincpu --cpus_allowed_policy=split

Test Instructions

  1. Depending on your test environment, you can pass input parameters to the script; if none are specified, the default test is executed.

  2. Testing the raw device directly will destroy the file system structure on it. If there is already data on the disk, set filename=[a specific file path, for example /mnt/test.image]; if there is no data, you can set filename=[device name, as in this example /dev/vdb].

Script interpretation

Block device parameters

When testing the instance, the command echo 2 > /sys/block/vdb/queue/rq_affinity in the script sets the block device parameter rq_affinity inside the virtual machine instance to 2.

If rq_affinity is 1, then when the block device receives an I/O completion event, the completion is sent back to the CPU group of the vCPU that issued the I/O. Under multi-threaded concurrency, completions may all end up running on a single vCPU, creating a bottleneck that prevents further performance improvement.

If rq_affinity is 2, then when the block device receives an I/O completion event, the completion is executed on the vCPU that originally issued the I/O. Under multi-threaded concurrency, the performance of every vCPU can be fully utilized.
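The current value can be inspected and changed through sysfs. A minimal sketch, assuming a virtio disk named vdb and root privileges:

```shell
cat /sys/block/vdb/queue/rq_affinity        # show the current value (1 or 2)
echo 2 > /sys/block/vdb/queue/rq_affinity   # complete each I/O on its issuing vCPU
```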

Bind the corresponding vCPU

In normal mode, a device has only one request queue (Request-Queue). Under multi-threaded concurrent I/O, this single Request-Queue becomes a performance bottleneck.

In multi-queue (blk-mq) mode, a device can have multiple request queues for I/O, giving full play to the performance of the back-end storage. If you have four I/O threads, you need to bind the four threads to the CPU cores corresponding to different request queues so that you can take advantage of multi-queue to improve performance.

fio provides the parameters cpus_allowed and cpus_allowed_policy to bind vCPUs. For example, run ls /sys/block/vdb/mq/ to list the QueueIds of the vdb disk, and run cat /sys/block/vdb/mq/$QueueId/cpu_list to check which cpu_core_ids a given queue is bound to.
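The queue-to-CPU mapping logic that the script builds its cpus_allowed list from can be exercised on its own. The sketch below runs the same awk/tr/cut pipeline against a mocked sysfs tree under /tmp (the directory names and CPU numbers are illustrative; on a real machine the files live in /sys/block/<dev>/mq/):

```shell
# mock two hardware queues, each served by two cores
mkdir -p /tmp/mq_demo/0 /tmp/mq_demo/1
echo "0, 2" > /tmp/mq_demo/0/cpu_list   # queue 0: cores 0 and 2
echo "1, 3" > /tmp/mq_demo/1/cpu_list   # queue 1: cores 1 and 3

cpulist=""
for ((i=1;i<10;i++)); do
  # take the i-th core of every queue, exactly as the script does
  list=$(cat /tmp/mq_demo/*/cpu_list | awk '{if(i<=NF) print $i;}' i="$i" | tr -d ',' | tr '\n' ',')
  [ -z "$list" ] && break
  cpulist=${cpulist}${list}
done
echo "$cpulist"                                  # 0,1,2,3,
spincpu=$(echo "$cpulist" | cut -d ',' -f 2-3)   # skip core 0, keep 2 cores (numjobs=2 here)
echo "$spincpu"                                  # 1,2
```

The resulting list interleaves the queues (first core of each queue, then second core of each), so consecutive fio jobs land on cores belonging to different request queues.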

Copyright © 2024 SurferCloud All Rights Reserved