Tuesday, November 20, 2018

central processing unit - Tracking down high Windows Server CPU utilization



I've got a 64-bit Windows 2003 server running as a VM at a remote hosting facility. (I'm just leasing this one virtual instance, so I don't know what sort of underlying hardware it's running on, other than that it presents itself to the VM as having 8 CPUs available.)




The problem is that starting about 1-2 weeks ago, taskmgr.exe began showing something like a 60% total CPU load, spread out evenly across 7 of its 8 procs, but with one proc spiked at 100%. And the server is responding like you'd expect when it's that busy: it's a dog. I'd obviously like to track down what's causing this.



The trouble is that the per-process CPU percentages, as shown in either taskmgr.exe or procexp.exe, don't add up to anywhere near 100%. The System Idle Process is somewhere around 40%, and a few other processes add up to maybe another 10%, but where's the other 50% coming from? Something is chewing up 50% of my CPU, and it's not listed anywhere in Task Manager. ("Show processes from all users" is checked.)
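To make the gap concrete, here is a minimal Python sketch of the arithmetic using the rough figures above. The variable names are mine, and the percentages are approximate readings from Task Manager rather than measurements.

```python
# Rough figures read off Task Manager; all values are percentages of total CPU.
idle_pct = 40.0       # System Idle Process
visible_pct = 10.0    # sum of every other listed process

# Whatever is neither idle nor attributed to a listed process.
unaccounted_pct = 100.0 - idle_pct - visible_pct
print(f"unaccounted CPU: {unaccounted_pct:.0f}%")

# Cross-check against the per-CPU view: about 60% total load over 8 CPUs,
# with one CPU pegged at 100% and the other 7 sharing the rest evenly.
total_load_pct = 60.0
cpus = 8
other_cpu_pct = (total_load_pct * cpus - 100.0) / (cpus - 1)
print(f"load on each of the other 7 CPUs: {other_cpu_pct:.1f}%")
```

The two views are at least consistent with each other: roughly 60% busy overall matches roughly 40% idle, so the missing share is real load that simply isn't attributed to any listed process.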



I've tried stopping all the services I could, but none of them had an impact on the CPU. Restarting the server doesn't make any difference: by the time I log back in, the CPU is pegged again. Procexp.exe doesn't show anything out of the ordinary.



I can think of two possible explanations: (1) There's some sort of rootkit that's made its way onto my server and is hiding itself from the process list; or (2) taskmgr.exe is suddenly (and for the first time) showing utilization on the rest of the box, not just this particular instance (though that doesn't seem right).



Any other suggestions for tracking this down?


Answer




I see two possible things you should look into.



First, whenever I hear someone describe high CPU load without being able to identify any offending processes, I/O contention is my first guess. When there is high I/O contention, processes stack up in an uninterruptible sleep state, filling the OS's process scheduler with tasks that are just waiting for data to be read from or written to disk. The individual processes would not show as having high CPU load. You'll need to look at performance statistics for the disk subsystem that's servicing this VM to see whether one or more of the many possible I/O bottlenecks is being hit.
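As one possible starting point from inside the guest, you could log a disk counter with `typeperf` (for example `typeperf "\PhysicalDisk(_Total)\Avg. Disk Queue Length" -si 5 -sc 12 -o disk.csv`) and summarise the result. The Python sketch below is only an illustration: the sample data and the host name `MYSERVER` are made up, and the guest's counters show only what the VM sees, so contention on the host's storage may still need to be checked with the hosting provider.

```python
import csv
import io

# Made-up sample in the CSV layout typeperf writes: a header row, then one
# row per sample holding a timestamp and the counter value.
SAMPLE = r'''"(PDH-CSV 4.0)","\\MYSERVER\PhysicalDisk(_Total)\Avg. Disk Queue Length"
"11/20/2018 10:00:00.000"," "
"11/20/2018 10:00:05.000","0.5"
"11/20/2018 10:00:10.000","6.0"
"11/20/2018 10:00:15.000","7.5"
"11/20/2018 10:00:20.000","2.0"
'''


def queue_lengths(csv_text):
    """Return the numeric samples from the second column, skipping blanks."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows)  # skip the header row
    values = []
    for row in rows:
        if len(row) < 2:
            continue  # ignore empty or malformed lines
        try:
            values.append(float(row[1]))
        except ValueError:
            continue  # blank or non-numeric sample
    return values


samples = queue_lengths(SAMPLE)
average = sum(samples) / len(samples)
peak = max(samples)
print(f"samples={len(samples)} average={average:.2f} peak={peak:.2f}")
```

A queue length that stays high while per-process CPU stays low would be consistent with the I/O explanation, though what counts as "high" depends on the storage behind the VM.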



Second, you mentioned having an 8-CPU VM. Are you absolutely sure you need that many cores? You're sure? Okay, ask yourself again if you really need them. The reason is that under virtualization, multiple cores don't work the same way they do on bare metal. The only time your VM is going to get CPU time on the host is when 8 cores are available at once. If 8 cores aren't available, you don't get CPU cycles. Needless to say, on even a moderately loaded host it's much more difficult for the hypervisor to schedule CPU time for an 8-core VM than for a single-core VM. For this reason, I recommend sticking with a single core unless more are absolutely necessary for the application. In that case I may allocate 2 or at the very most 4 cores, and I'll make darn sure that there's not much else going on on the host, so that this VM's performance doesn't suffer.
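To get a feel for why wide VMs are harder to schedule, here is a toy Python model. It assumes the strict all-cores-at-once scheduling described above and treats each host core as busy independently with a fixed probability; the host size and load figures are hypothetical, and real hypervisors may schedule more flexibly than this.

```python
from math import comb


def prob_at_least_free(host_cores, needed, busy_prob):
    """Probability that at least `needed` of `host_cores` are idle at once,
    assuming each core is busy independently with probability `busy_prob`."""
    free_prob = 1.0 - busy_prob
    return sum(
        comb(host_cores, k) * free_prob**k * busy_prob**(host_cores - k)
        for k in range(needed, host_cores + 1)
    )


HOST_CORES = 16   # hypothetical host size
BUSY_PROB = 0.5   # hypothetical: each core is busy half the time

for vcpus in (1, 2, 4, 8):
    p = prob_at_least_free(HOST_CORES, vcpus, BUSY_PROB)
    print(f"{vcpus} vCPU(s): schedulable at a given instant {p:.1%} of the time")
```

Even in this simplified picture, the chance of finding enough idle cores at the same instant drops as the VM gets wider, which is the intuition behind keeping the core count as low as the application allows.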



So, might you have a rootkit? Sure, possibly, and you had better do some due diligence to determine whether that's the case. If not, though, you certainly have some other things to look at.

