Friday, July 10, 2015

windows vista - How do I get to the root cause of high Deferred Procedure Calls?



I've got a Dual core processor, and one of the two is constantly at 100%. Looking in ProcessExplorer shows me that it's Deferred Procedure Calls. Reading around the net seems to give me tons of different answers.



Is it possible to set out a couple of steps to trying to narrow down what the problem might be in my case?




Update 1: FWIW, the problem persists even in Safe Mode.



Update 2: I unplugged everything I could from the back of the PC, and that bought me 40% more free processor. I also downloaded the RATTV3 tool, but for some reason on my machine it's not giving me a driver-by-driver breakdown. There's a good description of both DPCLatencyChecker and RATTV3 here.



Update 3:, LatencyMon (see my answer below) tells me it's nvstor32.sys - which is NVidia's SATA driver - with times of about 5300 µs.



Update 4: The plot thickens, while pondering whether to try booting of a recovery disk (to see if it's really drivers, and not a hardware problem), I noticed that the DVD/CD player wasn't working (i.e. Not even opening the door when I press the button). Given that the machine just got back from having the motherboard replaced, I figured maybe they'd forgotten to plug it in. I opened the box, everything seemed ok, but I unplugged and plugged it back in again. On reboot, everything was fine - no more DPC (highest now 300µs)!



Update 5: On following day, problem back, CD player not working again, even the cursor in the password textbox is blinking in slow motion... Tried unplugging everything I could think of, and on second reboot, worked again (as at Update2). Next time I'm going to try unplugging the CD player completely...




Update 6: Just noticed System event log has nvstor32.sys giving an error saying Parity error detected in \Device\RaidPort0, then a warning about sending a reinitialisation. Now just have to work out which one RaidPort0 is... (note, I've got no RAID set-up, it's just a bog standard Acer). Oh, and my Avast setup apparently got killed when I did a System Rollback (or whatever it's called), because it won't start (RPC error), won't uninstall (setiface error has occurred).



Update 7: Finally got time to reboot with DVD unplugged. No more DPC problems! (lots of page faults though, but that's for later). Next step: work out if it's the cable, or the DVD player.



Update 8: Borrowed a SATA cable, booted with it, no problems. CD/DVD player works, no DPC problems with nvstor32.sys, no processors blocked. Happy end... almost: I've still got problems with Avast, apparent DPC problems with storport.sys at start up (maybe normal for USB?), and lots of hard page faults. But those will be the subject of other questions.



Postscript: I recently started having the same problem, and using the same method, managed to track it down to a USB stick (the one I was using for ReadyBoost) being shot.


Answer



Here is the story of how I found the cause of my high DPC latency.







My system was experiencing clicks and pops during sound playback. I knew this meant that something in kernel mode was hogging the CPU. My first thought was to poke around Process Explorer, and see if anything looked out of place. The only thing that caught my attention was an excessive amount of time spent performing Deferred Procedure Calls (DPCs):



Screenshot of Process Explorer showing high DPC time



I knew that DPCs are code being run inside a driver; the challenge was to figure out which driver. I turned to DPC Latency Checker, which showed me just how bad the latency was:



screenshot of DPC Latency Checker




DPC Latency Checker suggests going through devices in Device Manager, and disabling non-essential hardware one-by-one (e.g. network card, sound card), hoping to isolate the buggy driver. (If you disable a device, and the DPC latency suddenly drops: you've found your culprit!)



screenshot of disabling devices



Unfortunately, after disabling everything I possibly could (while still being able to use the computer — don't disable your hard drive, video card, mouse, or USB hub the mouse is plugged into!), the latency was still high. Next I turned to the the Windows Performance Toolkit (part of the Windows SDK), and an excellent blog post by Peter Weiland, "Measuring DPC Time". After installing the Windows Performance Toolkit:



Screenshot of Windows SDK installer, with Windows Performance Toolkit being selected



I opened an elevated command prompt and ran:




>xperf -on Latency



Note: The Latency group is a
predefined set of events that can be
traced from the Kernel Group
provider:



>xperf -providers kg

Base : PROC_THREAD+LOADER+DISK_IO+HARD_FAULTS+PROFILE+MEMINFO
Diag : PROC_THREAD+LOADER+DISK_IO+HARD_FAULTS+DPC+INTERRUPT+CSWITCH+PERF_COUNTER+COMPACT_CSWITCH
DiagEasy : PROC_THREAD+LOADER+DISK_IO+HARD_FAULTS+DPC+INTERRUPT+CSWITCH+PERF_COUNTER
Latency : PROC_THREAD+LOADER+DISK_IO+HARD_FAULTS+DPC+INTERRUPT+CSWITCH+PROFILE
...


In this case Latency corresponds to
the Kernel Flags:





  • PROC_THREAD Process and Thread create/delete

  • LOADER Kernel and user mode Image Load/Unload events

  • PROFILE CPU Sample profile

  • CSWITCH Context Switch

  • DPC DPC Events

  • INTERRUPT Interrupt events

  • DISK_IO Disk I/O

  • HARD_FAULTS Hard Page Faults





After letting that run for a minute, I stopped the trace, and had it save to a file:



C:\Users\Ian\Desktop\xperf -d thingy1.etl


And then I viewed the results of the trace with the command:



C:\Users\Ian\Desktop\xperf thingy1.etl



This loads the graphical Windows Performance Analyzer. Right clicking on the DPC CPU Usage graph, I selected Summary Table. This shows a breakdown of time spent in DPCs by driver:



screenshot of XPerf output



Right away I can see one driver (tsvp.sys) taking an average of 2.8ms per DPC execution, which is an order of magnitude slower than any other driver:



screenshot




Googling tsvp.sys gave me the answer: CommView, which I had recently installed.



The question now is how to disable this driver. Using AutoRuns, I can see that it's installed as a driver service:



screenshot of autoruns



Using Device Manager, I can disable the service that hosts this driver. First you have to Show hidden devices, then expand the Non-Plug and Play Drivers node:



screenshot of device manager




Finally I could stop the driver service, and I changed it's startup mode from System (meaning the driver is an essential part of Windows, and Windows cannot boot without it), to Demand (meaning I can start the driver when I want to):



screenshot of device manager



Stopping the driver service immediately fixed my DPC latency:



screenshot



I may or may not completely uninstall CommView, but for now I've solved the Case of the High DPC Latency.







Update: Starting with Windows 8 you can no longer see Non-Plug and Play Drivers in Device Manager:




Note Starting with Windows 8 and Windows Server 2012, the Plug-and-Play Manager no longer creates device represetnations for non-PnP (legacy) devices. Thus there are no such devices to view in the Device Manager.
To include hidden devices in Device Manager display, click View and select Show Hidden Devices.




Microsoft took away the feature and replace it with nothing. Good job.




In typical nerd rage, some unhelpful answers:




  • Device manager never showed non pnp drivers

  • Why do you need this?



Fortunately, NirSoft has created a replacement. ServiWin lets you see, stop, and start all services (even the ones Microsoft decided administrators should be allowed to see):




screenshot of ServiWin


No comments:

Post a Comment

linux - How to SSH to ec2 instance in VPC private subnet via NAT server

I have created a VPC in aws with a public subnet and a private subnet. The private subnet does not have direct access to external network. S...