How To Measure IO Speed Using Fio

09 Jan 2023
[Header image: many dials for various measurements. Photo by Kai Dahms on Unsplash]

Written with the help of Stephen Wanyee.

Fio (Flexible I/O) is a command-line tool for measuring disk I/O performance. Its flexibility lets you customise your I/O tests to measure very specific aspects of I/O performance. You can choose, for example, to perform sequential or random read/write operations, to use buffered or direct I/O, which I/O algorithm to use (e.g. on Linux there's read(2)/write(2), io_uring, etc.) and even which CPUs should be used. Many other parameters can be set.

The most common categories of these parameters are:

  • I/O type params - set the kind of I/O to perform, e.g. read, write or both; sequential or random; buffered or direct
  • I/O engine params - set the I/O algorithm to be used, e.g. read(2)/write(2), io_uring
  • file size param - set the size of the file to read or write
  • block params - set the properties of the chunk, or block, that a single I/O request will read/write
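These parameters can also be grouped in a job file instead of being passed as command-line flags. A minimal sketch covering each category (the job name seq-read and the file name are my own; the parameter values mirror the command used in this article):

```ini
; seq-read.fio - a hypothetical job file touching each parameter category;
; run it with: fio seq-read.fio
[seq-read]
; I/O type: sequential reads
rw=read
; I/O engine: synchronous read(2)/write(2)
ioengine=sync
; file size: 1 GiB
size=1g
; block size: 4 KiB per I/O request
bs=4k
```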

Use Case: Test Sequential Read Speed of a Disk

A simple Fio use case is approximating the sequential read speed of a disk. We can use the command below for this. It will create a 1GiB file (if it doesn’t already exist) and read it sequentially in 4KiB blocks (therefore, issuing 1GiB/4KiB = 262,144 read requests). Fio will report the performance in IOPS (the rate of processing the issued read requests - 262,144/T) and in bandwidth (the rate at which the file data is read - 1GiB/T), where T is the time it takes to read the file.

fio --readwrite=read --size=1g --blocksize=4k --ioengine=sync --name=test-read-io
  • --readwrite=read says we want to read a file sequentially
  • --size=1g says we want to read a 1GiB file. (Note that Fio will create the 1GiB file, so ensure you have enough free space. If you don't, you can specify a smaller file size like --size=500m for a 500MiB file.)
  • --blocksize=4k says we want to read the file in 4KiB blocks
  • --ioengine=sync says we want to issue read requests synchronously
  • --name=test-read-io names the job. A job is the whole operation, in this case, of reading the 1GiB file sequentially, synchronously and in 4KiB blocks.
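The request-count arithmetic above can be checked directly in the shell:

```shell
# sanity-check the request count: a 1 GiB file read in 4 KiB blocks
file_size=$((1024 * 1024 * 1024))   # 1 GiB in bytes
block_size=$((4 * 1024))            # 4 KiB in bytes
echo "read requests: $((file_size / block_size))"   # prints 262144
```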

The output of the run takes this format.

>>>> fio --readwrite=read --size=1g --blocksize=4k --ioengine=sync --name=test-read-io

1.  test-read-io: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=1
2.  fio-3.33
3.  Starting 1 process
4.  Jobs: 1 (f=1): [R(1)][100.0%][r=58.9MiB/s][r=15.1k IOPS][eta 00m:00s]
5.  test-read-io: (groupid=0, jobs=1): err= 0: pid=6721: Thu Dec 22 05:39:38 2022
6.    read: IOPS=16.8k, BW=65.6MiB/s (68.7MB/s)(1024MiB/15620msec)
7.      clat (usec): min=2, max=396773, avg=58.05, stdev=922.46
8.       lat (usec): min=2, max=396774, avg=58.24, stdev=922.46
9.      clat percentiles (usec):
10.      |  1.00th=[    3],  5.00th=[    4], 10.00th=[    5], 20.00th=[   10],
11.      | 30.00th=[   10], 40.00th=[   10], 50.00th=[   10], 60.00th=[   10],
12.      | 70.00th=[   10], 80.00th=[   11], 90.00th=[   12], 95.00th=[   15],
13.      | 99.00th=[ 2442], 99.50th=[ 3097], 99.90th=[ 3949], 99.95th=[ 4293],
14.      | 99.99th=[14484]
15.    bw (  KiB/s): min= 9960, max=79201, per=99.98%, avg=67117.23, stdev=12758.32, samples=31
16.    iops        : min= 2490, max=19798, avg=16779.06, stdev=3189.45, samples=31
17.   lat (usec)   : 4=8.77%, 10=69.96%, 20=18.75%, 50=0.74%, 100=0.17%
18.   lat (usec)   : 250=0.02%, 500=0.03%, 750=0.01%, 1000=0.01%
19.   lat (msec)   : 2=0.07%, 4=1.39%, 10=0.07%, 20=0.01%, 50=0.01%
20.   lat (msec)   : 100=0.01%, 500=0.01%
21.   cpu          : usr=9.78%, sys=18.17%, ctx=4211, majf=0, minf=14
22.   IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
23.      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
24.      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
25.      issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
26.      latency   : target=0, window=0, percentile=100.00%, depth=1
27.
28. Run status group 0 (all jobs):
29.    READ: bw=65.6MiB/s (68.7MB/s), 65.6MiB/s-65.6MiB/s (68.7MB/s-68.7MB/s), io=1024MiB (1074MB), run=15620-15620msec
30.
31. Disk stats (read/write):
32.   sda: ios=4156/199, merge=27/136, ticks=28658/2823, in_queue=23080, util=99.50%

The data I wanted - an approximation of the sequential read speed of my disk - is in line 6: 16,800 IOPS and 65.6MiB/s (BW). However, because of how the job is run internally by the operating system, we might not be getting what we asked for. Let me explain.

Line 25 of the output, labelled issued rwts, gives data about the issued read, write, trim and synchronise requests (we only care about the read requests). The first tuple on that line, labelled total, tells us that the total number of issued read requests was 262,144, which is what we expected, as mentioned earlier.

But now look at the section on lines 31 and 32 labelled Disk stats (read/write). This section gives disk data as pairs of values in the format r/w - r concerns reads and w, writes. The first stat, labelled ios, gives the number of I/O operations performed by the disk. It tells us that only 4,156 read operations were performed by the disk - even though the job issued 262,144 read requests. (Subsequent runs of the same Fio command yielded 4,117, 4,130 and 4,124 read operations performed by the disk, while the number of issued read requests remained constant at 262,144.) There is clearly some mysterious activity happening between the issuing of read requests and their being processed by the disk.

The culprit is the operating system kernel. The kernel sits between user-space programs (like Fio) and I/O devices (like the disk); I/O requests from user-space go through the kernel before getting to I/O devices. But, by default, the kernel doesn’t simply forward the I/O requests. It first optimises them to minimise the use of I/O devices (because of their relative slowness). In this case, the kernel reduces the 262,144 read requests issued to 4,156 disk operations.

The optimisation at play here is "readahead". When a request asks for a suboptimally small amount of data, readahead reads more data than requested and caches it in memory, so that subsequent read requests are served from the cache and never reach the disk. You can tell that, in our case, the kernel probably reads the file in roughly 256 KiB chunks (1GiB / 4,156 disk read ops ≈ 256KiB).
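Assuming readahead does explain the gap, the estimate can be checked in the shell (integer division rounds down, which is why it lands a little under 256 KiB):

```shell
# rough per-request read size the disk actually saw:
# 1 GiB of file data spread over ~4,156 disk read operations
echo "$((1024 * 1024 * 1024 / 4156 / 1024)) KiB per disk read"   # prints 252
```

The readahead window itself can be inspected with `blockdev --getra <device>`, which reports it in 512-byte sectors.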

The point is, your Fio command may not be run as you expect. The kernel’s I/O optimisations might interfere with your Fio tests.

To minimise the kernel's interference, we can use direct I/O by setting Fio's direct parameter to true (--direct=1). With direct I/O there is no readahead or caching, and all I/O requests reach the disk.

Running the same Fio command but with --direct=1 yields the following output.

>>>> fio --direct=1 --readwrite=read --ioengine=sync --size=1g --blocksize=4k --name=test-read-io

...
6.   read: IOPS=14.9k, BW=58.3MiB/s (61.2MB/s)(1024MiB/17552msec)
...
...
25.  issued rwts: total=262144,0,0,0 short=0,0,0,0 dropped=0,0,0,0
...
...
31.  Disk stats (read/write):
32.    sda: ios=258171/161, ... 

Notice that now the number of read operations performed by the disk (258,171) is about the same as the number of read requests Fio issued (262,144). (Subsequent runs of the same Fio command yielded 261,136, 261,273 and 262,084 read operations performed by the disk.)
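As a quick sanity check on that claim, the fraction of issued requests that reached the disk can be computed in the shell:

```shell
# with --direct=1, what fraction of the 262,144 issued reads hit the disk?
echo "$((258171 * 100 / 262144))% of issued reads reached the disk"   # prints 98%
```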