Tuesday, July 9, 2019

linux - How to check disk I/O utilisation per process



I'm having a problem with a stalling Linux system, and sysstat/sar reports huge peaks in disk I/O utilization, average service time, and average wait time at the time of the stall.



How can I determine which process is causing these peaks the next time it happens?
Is it possible to do this with sar (i.e., can I find this information in the already recorded sar files)?



Output of "sar -d"; the system stall happened between about 12:58 and 13:01.




12:40:01          DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
12:40:01       dev8-0     11.57      0.11    710.08     61.36      0.01      0.97      0.37      0.43
12:45:01       dev8-0     13.36      0.00    972.93     72.82      0.01      1.00      0.32      0.43
12:50:01       dev8-0     13.55      0.03    616.56     45.49      0.01      0.70      0.35      0.47
12:55:01       dev8-0     13.99      0.08    917.00     65.55      0.01      0.86      0.37      0.52
13:01:02       dev8-0      6.28      0.00    400.53     63.81      0.89    141.87    141.12     88.59
13:05:01       dev8-0     22.75      0.03    932.13     40.97      0.01      0.65      0.27      0.62
13:10:01       dev8-0     13.11      0.00    634.55     48.42      0.01      0.71      0.38      0.50



This is a follow-up question to a thread I started yesterday: Sudden peaks in load and disk block wait. I hope it's OK that I created a new question on the matter, since I have not been able to resolve the problem yet.


Answer



If you are lucky enough to catch the next peak utilization period, you can study per-process I/O stats interactively, using iotop.
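
If nobody is at the terminal when the stall hits, a small sampler can be left running instead. The following is only a rough sketch, not a replacement for iotop: it assumes a kernel with per-task I/O accounting enabled, so that /proc/<pid>/io exists, and it usually has to run as root to read the counters of other users' processes. The function names (parse_proc_io, snapshot, top_io, report) are made up for this illustration.

```python
"""Sample /proc/<pid>/io twice and rank processes by bytes moved in between."""
import os
import time


def parse_proc_io(text):
    """Turn the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def snapshot():
    """Return {pid: (name, read_bytes, write_bytes)} for every readable process."""
    result = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(f"/proc/{entry}/io") as f:
                stats = parse_proc_io(f.read())
            with open(f"/proc/{entry}/comm") as f:
                name = f.read().strip()
        except OSError:
            # The process exited, or we lack permission to read its counters.
            continue
        result[int(entry)] = (
            name,
            stats.get("read_bytes", 0),
            stats.get("write_bytes", 0),
        )
    return result


def top_io(before, after, limit=10):
    """Rank processes present in both snapshots by bytes read plus written."""
    rows = []
    for pid, (name, read1, write1) in after.items():
        if pid in before:
            _, read0, write0 = before[pid]
            rows.append((pid, name, read1 - read0, write1 - write0))
    rows.sort(key=lambda row: row[2] + row[3], reverse=True)
    return rows[:limit]


def report(interval=5.0):
    """Print the busiest processes over one sampling interval (in seconds)."""
    first = snapshot()
    time.sleep(interval)
    for pid, name, read_delta, write_delta in top_io(first, snapshot()):
        print(f"{pid:>7} {name:<16} read={read_delta:>12} B  write={write_delta:>12} B")
```

Calling report() in a loop, for example from a script started before the expected stall, and appending its output to a log gives a per-process record to compare against the sar timestamps. Processes that start and exit within a single interval are missed, so a shorter interval catches more of them at the cost of more overhead.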

