
fio gives an unexpected IOPS number

While running tests with fio, we encountered unexpected behavior. The fio job file:

[global]
iodepth=64
direct=1
ioengine=libaio
group_reporting
time_based
runtime=6000
numjobs=1
rw=randrw
write_lat_log=test1
log_avg_msec=1000
write_iops_log=test1
iopsavgtime=1000
disable_slat=1
disable_clat=1
log_unix_epoch=1

[job1]
filename=/dev/sdc
bs=8k
rate_iops=10k,90k
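
(In rate_iops=10k,90k the comma-separated values apply to reads and writes respectively, so reads are capped at 10k IOPS and writes at 90k IOPS.)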

Test environment:

  • SAN storage, accessed over iSCSI
  • x64 server running VMware ESXi, with a Linux virtual machine deployed on it
  • The server has 2 x 25 Gb Ethernet links (4 active paths); ESXi multipathing is configured as round-robin with iops=1
  • A LUN mapped to the virtual machine as the /dev/sdc block device

We performed stress tests that caused storage performance degradation. While the storage worked normally, fio correctly generated a 90k IOPS write load. During the degradation we observed fio lowering the IOPS level; apparently this happened because operations accumulated in the device queue, and fio reduced the number of in-flight IOs to avoid exceeding the iodepth value.

The strange part started after the storage recovered: fio increased IOPS up to the maximum possible level and kept that rate until the average IOPS, counted from the moment the degradation began, leveled off at a value equal to rate_iops. This is a completely unexpected scenario for us, because the description of rate_iops clearly says that this parameter limits the load to the specified IOPS number.

We set direct=1, so buffering on the file system/operating system side should not affect IOPS. Moreover, we checked iostat and esxtop and did not see any queueing before IOPS started to rise, which indicates that it was fio itself increasing the IOPS.

The graph below demonstrates the issue: it shows 90k write IOPS before the event, then a drop, then a stabilization period during which the storage performance was degraded. After the storage recovered, IOPS rose to ~110k for some time and only then settled back to the "normal" 90k level.
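
To make the effect concrete (the numbers below are invented for illustration; we did not record the exact figures): if the degradation held fio to roughly 40k write IOPS for 100 seconds, the job would fall about (90k - 40k) × 100 = 5M IOs behind its 90k target. Catching up at ~110k IOPS repays that deficit at ~20k IOPS, i.e. in roughly 5M / 20k = 250 seconds, after which the cumulative average is back at 90k and the rate returns to the configured level. This is consistent with the shape of the curve we observe.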

We checked several times and confirmed that the elevated IOPS level lasts exactly as long as it takes for the average IOPS (measured from the start of the degradation) to become equal to the rate_iops value.

I assume we are misunderstanding something in the fio configuration, or maybe it is a defect in the code, because the documentation does not say that fio may generate more IOPS than configured in rate_iops. Any advice or idea would be very much appreciated.
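
In case it helps to reproduce, here is a stripped-down version of the job above that should exercise only the rate limiting. This simplified job file is our own sketch, with the logging and latency options removed; we have not re-run the test with it:

[global]
; same IO pattern as the original job, without the log/latency options
iodepth=64
direct=1
ioengine=libaio
time_based
runtime=6000
rw=randrw

[job1]
filename=/dev/sdc
bs=8k
; reads capped at 10k IOPS, writes at 90k IOPS
rate_iops=10k,90k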

