
Sunday, 27 February 2022

What is I/O wait and how does it affect Linux performance?

I/O wait (also shown as iowait, wait, wa, %iowait, or wait%) is often displayed by command-line Linux system monitoring tools such as top, sar, atop, and others.

On its own, it’s one of many performance stats that provide us with an insight into Linux system performance.

I/O wait came up in a recent discussion. During a support call, the customer reported load spikes of 60 to 80 on their 32 CPU core system.

This resulted in slow page loading, timeouts, and intermittent outages. The cause? Storage I/O bottleneck as hinted at first by a consistently high iowait, and then later confirmed with additional investigation.

What is I/O wait? How does I/O wait affect Linux server performance? How can we monitor and reduce I/O wait related issues? 


Continue reading for the answers to these questions.




Do you have an I/O bottleneck?

Your I/O wait measurement is the canary for an I/O bottleneck. I/O wait is the percentage of time your processors spend waiting on the disk.

For example, let’s say it takes 1 second to grab 10,000 rows from MySQL and perform some operations on those rows.


The disk is being accessed while the rows are retrieved. 

During this time the processor is idle. It’s waiting on the disk. 

In the example above, disk access took 700 ms, so I/O wait is 70%.

You can check your I/O wait percentage via top, a command available on every flavor of Linux.

If your I/O wait percentage is greater than (1/# of CPU cores) then your CPUs are waiting a significant amount of time for the disk subsystem to catch up.

In the top output from one example server, the I/O wait (the wa value) is 12.1%.

This server has 8 cores (via cat /proc/cpuinfo). 

That 12.1% is very close to the threshold of 1/8 cores = 0.125, or 12.5%.

Disk access may be slowing the application down if the I/O wait is consistently around this threshold.
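The rule of thumb above can be checked with a short script. What follows is a minimal sketch, not a replacement for top: it assumes the standard Linux /proc/stat layout (user, nice, system, idle, iowait, irq, softirq, steal), and the function names are made up for illustration.

```python
import os
import time


def parse_cpu_line(stat_text):
    """Return the first 8 counters of the aggregate 'cpu' line in /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Keep user..steal only; guest time is already counted inside user/nice.
            return [int(v) for v in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    # Index 4 is the iowait counter in the assumed field order.
    return 100.0 * deltas[4] / total if total else 0.0


def exceeds_rule_of_thumb(iowait_pct, cores):
    """True if iowait is above the article's 1 / (number of cores) threshold."""
    return iowait_pct / 100.0 > 1.0 / cores


if __name__ == "__main__":
    with open("/proc/stat") as f:
        first = parse_cpu_line(f.read())
    time.sleep(2)  # sampling interval; an arbitrary choice
    with open("/proc/stat") as f:
        second = parse_cpu_line(f.read())
    pct = iowait_percent(first, second)
    cores = os.cpu_count() or 1
    print(f"iowait: {pct:.1f}% on {cores} cores, "
          f"above threshold: {exceeds_rule_of_thumb(pct, cores)}")
```

A single two-second sample is noisy. The threshold only means something if iowait stays near it over many samples.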

What impacts I/O performance?

For random disk access (a database, mail server, file server, etc), you should focus on how many input/output operations can be performed per second (IOPS).

Four primary factors impact IOPS:

  • Multidisk Arrays – More disks in the array mean greater IOPS. If one disk can perform 150 IOPS, two disks can perform 300 IOPS.
  • Average IOPS per drive – The greater the number of IOPS each drive can handle, the greater the total IOPS capacity. This is largely determined by the rotational speed of the drive.
  • RAID Factor – Your application is likely using a RAID configuration for storage, which means you’re using multiple disks for reliability and redundancy. Some RAID configurations have a significant penalty for write operations. For RAID 6, every write request requires 6 disk operations. For RAID 1 and RAID 10, a write request requires only 2 disk operations. The lower the number of disk operations, the higher the IOPS capacity. This article has a great breakdown of RAID and IOPS performance.
  • Read and Write Workload – If you have a high percentage of write operations and a RAID setup that performs many operations for each write request (like RAID 5 or RAID 6), your IOPS will be significantly lower.

Calculating your maximum IOPS

A more exact way to determine just how close you are to your maximum I/O throughput is to calculate your theoretical IOPS and compare it to your actual IOPS. If the numbers are close, there may be an I/O issue.

You can determine theoretical IOPS via the following equation:

Theoretical IOPS = (number of disks * average IOPS per disk) / (% read workload + (RAID factor * % write workload))

The read and write percentages are expressed as fractions (for example, 0.7 and 0.3) and add up to 1.

All but one of the pieces in this equation can be determined from your hardware specs. You’ll need to figure out the read/write workload though – it’s application-dependent. For this, use a tool like sar.

Once you’ve calculated your theoretical IOPS, compare it to the tps column displayed via sar. The tps column indicates the number of transfers per second that were issued to the device. This is your actual IOPS. If the tps is near the theoretical IOPS, you may be at capacity.
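To make the arithmetic concrete, here is a small sketch of the equation above. The function names and the sample array (4 disks at 150 IOPS each, 70% reads) are illustrative assumptions, not measurements from a real system.

```python
def theoretical_iops(disks, iops_per_disk, raid_factor, read_fraction):
    """Theoretical maximum IOPS for an array.

    raid_factor is the number of disk operations per write request
    (2 for RAID 1/10, 6 for RAID 6, as described in the text).
    read_fraction is the share of reads in the workload, from 0.0 to 1.0.
    """
    write_fraction = 1.0 - read_fraction
    return (disks * iops_per_disk) / (read_fraction + raid_factor * write_fraction)


def capacity_used(actual_tps, max_iops):
    """Share of the theoretical IOPS consumed by the observed tps value."""
    return actual_tps / max_iops


if __name__ == "__main__":
    raid10 = theoretical_iops(disks=4, iops_per_disk=150, raid_factor=2, read_fraction=0.7)
    raid6 = theoretical_iops(disks=4, iops_per_disk=150, raid_factor=6, read_fraction=0.7)
    print(f"RAID 10: {raid10:.0f} IOPS, RAID 6: {raid6:.0f} IOPS")
    # 400 is a hypothetical tps reading taken from sar.
    print(f"RAID 10 capacity used at 400 tps: {capacity_used(400, raid10):.0%}")
```

With these assumed inputs, the same four disks deliver about 462 IOPS under RAID 10 but only about 240 under RAID 6, which shows how strongly the write penalty shapes capacity.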

What is I/O wait?

I/O wait applies to Unix and all Unix-based systems, including macOS, FreeBSD, Solaris, and, yes, Linux.

I/O wait (iowait) is the percentage of time that the CPU (or CPUs) were idle during which the system had pending disk I/O requests. (Source: man sar) The top man page gives this simple explanation: “I/O wait = time waiting for I/O completion.” In other words, the presence of I/O wait tells us that the system is idle at a time when it could be processing outstanding requests.


“iowait shows the percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.” –  iostat man page

When using Linux top and other tools, you’ll notice that a CPU (and its cores) operates in the following states: us (user), sy (system), id (idle), ni (nice), si (software interrupts), hi (hardware interrupts), st (steal) and wa (I/O wait). Together, these values add up to 100%; on most systems user, system, idle, and wait account for nearly all of it. Note that “idle” and “wait” are not the same. “Idle” means there is no workload present, whereas “wait” (iowait) means the CPU is sitting idle while I/O requests are still outstanding.

If the CPU is idle, the kernel checks whether there are any pending I/O requests (e.g., to an SSD or NFS) originating from that CPU. If there are, then the ‘iowait’ counter is incremented. If there’s nothing pending, then the ‘idle’ counter is incremented instead.
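The accounting rule just described can be modelled in a few lines. This is a deliberately simplified illustration of the idle-versus-iowait decision, not the kernel's actual code. The counter names and the per-tick inputs are hypothetical.

```python
def account_tick(counters, cpu_busy, pending_io):
    """Charge one timer tick to 'busy', 'iowait', or 'idle'.

    cpu_busy: whether the CPU was running work during the tick.
    pending_io: number of outstanding I/O requests from this CPU.
    """
    if cpu_busy:
        counters["busy"] += 1
    elif pending_io > 0:
        # Idle, but something is still waiting on storage.
        counters["iowait"] += 1
    else:
        # Idle with nothing outstanding.
        counters["idle"] += 1
    return counters


def account_ticks(ticks):
    """Run a sequence of (cpu_busy, pending_io) ticks through the model."""
    counters = {"busy": 0, "iowait": 0, "idle": 0}
    for cpu_busy, pending_io in ticks:
        account_tick(counters, cpu_busy, pending_io)
    return counters


if __name__ == "__main__":
    sample = [(True, 0), (False, 2), (False, 0), (False, 1)]
    print(account_ticks(sample))
```

Note that a busy CPU with pending I/O is charged to 'busy', not 'iowait'. This is one reason a system can be bottlenecked on storage while reporting little iowait.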


I/O wait and Linux server performance

It’s important to note that iowait can, at times, indicate a bottleneck in throughput, while at other times, iowait may be completely meaningless. It’s possible to have a healthy system with high iowait, but also possible to have a bottlenecked system without iowait.

I/O wait is simply one of the indicated states of your CPU / CPU cores. As such, a high iowait means your CPU is waiting on requests, but you’ll need to investigate further to confirm the source and effect.

For example, server storage (SSD, NVMe, NFS, etc.) is almost always slower than the CPU. Because of this, I/O wait may be misleading, especially when it comes to random read/write workloads. This is because iowait only measures how the CPU spends its idle time, not how the storage itself is performing.

Although iowait indicates that the CPU can handle more workload, it isn’t always possible to eliminate I/O wait, or feasible to achieve a near-zero value. Much depends on your server’s workload and the way that workload performs computations or makes use of storage I/O.

You will have to decide, based on end-user experience, database query health, transaction throughput, and overall application health, whether or not the reported iowait indicates poor Linux system performance.

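For example, if you see a low iowait of 1 to 4 percent and then upgrade the CPU to 2x the performance, the iowait percentage will also increase. The CPU finishes its share of the work sooner but waits just as long for storage, so waiting makes up a larger share of the total time.

A 2x faster CPU with the same storage performance means roughly 2x the iowait percentage. You’ll want to consider your workload to determine which hardware you should pay attention to first.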
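A quick calculation shows why the percentage roughly doubles. The sketch below assumes a simple model in which a task alternates between pure CPU work and pure waiting on storage. The millisecond figures are invented for illustration.

```python
def iowait_share(cpu_ms, io_wait_ms):
    """Percentage of elapsed time spent waiting on I/O in the simple model."""
    return 100.0 * io_wait_ms / (cpu_ms + io_wait_ms)


def after_cpu_upgrade(cpu_ms, io_wait_ms, speedup):
    """iowait share once CPU work shrinks by 'speedup' and storage is unchanged."""
    return iowait_share(cpu_ms / speedup, io_wait_ms)


if __name__ == "__main__":
    # Hypothetical task: 960 ms of CPU work and 40 ms waiting on storage.
    before = iowait_share(cpu_ms=960, io_wait_ms=40)
    after = after_cpu_upgrade(cpu_ms=960, io_wait_ms=40, speedup=2)
    print(f"before: {before:.1f}%  after 2x CPU: {after:.1f}%")
```

In this model the share rises from 4% to about 7.7%, even though the storage is no slower than before. The absolute wait time is unchanged and only its proportion grows.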


Monitoring and reducing I/O wait related issues




Let’s look at some useful tools we can use to monitor I/O wait on Linux.

  • atop – run it with -d option or press d to toggle the disk stats view.
  • iostat – try it with the -xm 2 options for extended statistics, in megabytes, and in two-second intervals.
  • iotop – top-like I/O monitor. Try it with the -oPa options to show the accumulated I/O of active processes only.
  • ps – use auxf; a “D” under the “STAT” column means the process is in uninterruptible sleep, which usually indicates it is waiting on disk I/O.
  • strace – view the actual operations issued by a process. Read the strace man page.
  • lsof – after you’ve identified the process responsible, use -p [PID] to find the specific files.
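As one way to automate the ps step from the list above, the following sketch filters ps aux output for processes in the “D” state. It assumes the usual eleven-column ps aux layout (USER, PID, %CPU, %MEM, VSZ, RSS, TTY, STAT, START, TIME, COMMAND). The function name is hypothetical.

```python
import subprocess


def uninterruptible_processes(ps_output):
    """Return (pid, command) pairs for processes whose STAT starts with 'D'."""
    results = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        stat, pid, command = fields[7], fields[1], fields[10]
        if stat.startswith("D"):
            results.append((int(pid), command))
    return results


if __name__ == "__main__":
    output = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    ).stdout
    for pid, command in uninterruptible_processes(output):
        print(pid, command)
```

A process seen in “D” once is not necessarily a problem. Processes that stay there across repeated runs are the ones worth following up with iotop, strace, or lsof -p.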

Reducing I/O wait related issues

Take the following steps to reduce I/O wait related issues.

  • Optimize your application’s code and database queries. This can go a long way in reducing the frequency of disk reads/writes. This should be your first approach because the more efficient your application is, the less you’ll have to spend on hardware long term. See also: 100 Application Performance Monitoring (APM) & Observability Solutions.
  • Keep your Linux system and software versions up-to-date. Not only is this better for security, but more often than not, the latest supported versions offer notable performance improvements, whether it’s Nginx, Node.js, PHP, Python, or MySQL.
  • Make sure that you have free memory available. Aim for enough free memory that around half of the server’s memory is used for in-memory buffers and cache, rather than swapping and paging to disk. Of course, this ratio will differ case by case. The key is to be sure that you are not swapping and that kernel cache pressure isn’t high due to a lack of free memory.
  • Tweak your system, storage device(s), and the Linux kernel for increased storage performance and lifespan.
  • Finally, if all else fails: upgrade storage devices to faster SSD, NVMe, or other high throughput storage devices.
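The memory guideline in the list above can be spot-checked from /proc/meminfo. This is a rough sketch that assumes the standard MemTotal, Buffers, Cached, SwapTotal, and SwapFree fields (reported in kB). The function names are illustrative, and the "around half" figure is the article's guideline, not a hard rule.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info


def memory_summary(info):
    """Return the share of RAM used for buffers/cache and the swap in use (kB)."""
    total = info["MemTotal"]
    cache = info.get("Buffers", 0) + info.get("Cached", 0)
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {"cache_pct": 100.0 * cache / total, "swap_used_kb": swap_used}


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        summary = memory_summary(parse_meminfo(f.read()))
    print(f"buffers/cache: {summary['cache_pct']:.0f}% of RAM, "
          f"swap in use: {summary['swap_used_kb']} kB")
```

Swap in use does not prove that the system is actively swapping. For that, watch the swap-in and swap-out rates over time with a tool such as sar.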

 

Conclusion

The iowait statistic is a useful performance stat for monitoring CPU utilization health. It tells the sysadmin when the CPU is idle and could be performing more computations. From there, we can use observability, benchmarking, and tracing tools such as those listed above to put together a full picture of the system’s overall I/O performance. Your main goal should be to eliminate any iowait that’s a direct result of waiting on disk, NFS, or other storage-related I/O.





