I have very hard puzzle with blue screens on windows 7.
Few weeks ago a computer started to have some blu screens during smartcard computation. This is a POS machine with a smartcard, for every recepit issued it has to compute an hash in the smartcard.
So I started to looking for drivers/hardware errors, I updated some drivers (also if the old drivers worked for many years without errors)... without success.
Therefore I changed the smartcard and the smartcardreader but it still didn't work. I also tried to switch all the POS system to another computer but every 10-20 hash computation the blue screen appeared again.
I tried to analyze the dump file, and seems that the failing component was ntoskrnl.exe, and this seems not to be a driver connected error.
This is the dump file:
https://drive.google.com/file/d/0B68Lon7XGG2tdzcxLXUwSXJibms/view?usp=sharing
This is the dump details:
KMODE_EXCEPTION_NOT_HANDLED 0x0000001e ffffffff`c0000005 00000000`00000000 00000000`00000008 00000000`00000000 ntoskrnl.exe ntoskrnl.exe+70380
and the dump analysis data:
Crash Dump Analysis provided by OSR Open Systems Resources, Inc. (http://www.osr.com)
Online Crash Dump Analysis Service
See http://www.osronline.com for more information
Windows 7 Kernel Version 7601 (Service Pack 1) MP (4 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 7601.23392.amd64fre.win7sp1_ldr.160317-0600
Machine Name:
Kernel base = 0xfffff800`02e4f000 PsLoadedModuleList = 0xfffff800`03091730
Debug session time: Sat May 7 14:48:33.499 2016 (UTC - 4:00)
System Uptime: 0 days 0:24:58.841
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
KMODE_EXCEPTION_NOT_HANDLED (1e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: ffffffffc0000005, The exception code that was not handled
Arg2: 0000000000000000, The address that the exception occurred at
Arg3: 0000000000000008, Parameter 0 of the exception
Arg4: 0000000000000000, Parameter 1 of the exception
Debugging Details:
------------------
TRIAGER: Could not open triage file : e:\dump_analysis\program\triage\modclass.ini, error 2
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
FAULTING_IP:
+0
00000000`00000000 ?? ???
EXCEPTION_PARAMETER1: 0000000000000008
EXCEPTION_PARAMETER2: 0000000000000000
WRITE_ADDRESS: GetPointerFromAddress: unable to read from fffff800030fb100
GetUlongFromAddress: unable to read from fffff800030fb1c8
0000000000000000 Nonpaged pool
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".
BUGCHECK_STR: 0x1e_c0000005
CUSTOMER_CRASH_COUNT: 1
DEFAULT_BUCKET_ID: WIN7_DRIVER_FAULT
PROCESS_NAME: System
CURRENT_IRQL: 1
TRAP_FRAME: fffff8800701b7f0 -- (.trap 0xfffff8800701b7f0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=fffffa800737ee70 rbx=0000000000000000 rcx=fffffa8008f13a20
rdx=fffffa800737ee70 rsi=0000000000000000 rdi=0000000000000000
rip=0000000000000000 rsp=fffff8800701b980 rbp=0000000000000000
r8=fffffa800398b010 r9=fffff8000303de80 r10=fffffa80036fb570
r11=fffffa8004a55c10 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei ng nz na pe nc
00000000`00000000 ?? ???
Resetting default scope
LAST_CONTROL_TRANSFER: from fffff80002f3f512 to fffff80002ebf380
STACK_TEXT:
fffff880`0701af68 fffff800`02f3f512 : 00000000`0000001e ffffffff`c0000005 00000000`00000000 00000000`00000008 : nt!KeBugCheckEx
fffff880`0701af70 fffff800`02ebea02 : fffff880`0701b748 fffffa80`c0000120 fffff880`0701b7f0 00000000`00000000 : nt! ?? ::FNODOBFM::`string'+0x40e2d
fffff880`0701b610 fffff800`02ebd57a : 00000000`00000008 00000000`00000000 00000000`00000200 fffffa80`c0000120 : nt!KiExceptionDispatch+0xc2
fffff880`0701b7f0 00000000`00000000 : 00000000`00000000 00000000`00000000 fffff880`0507cf00 fffffa80`08239bb8 : nt!KiPageFault+0x23a
STACK_COMMAND: kb
FOLLOWUP_IP:
nt! ?? ::FNODOBFM::`string'+40e2d
fffff800`02f3f512 cc int 3
SYMBOL_STACK_INDEX: 1
SYMBOL_NAME: nt! ?? ::FNODOBFM::`string'+40e2d
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: nt
IMAGE_NAME: ntkrnlmp.exe
DEBUG_FLR_IMAGE_TIMESTAMP: 56eb24e6
FAILURE_BUCKET_ID: X64_0x1e_c0000005_nt!_??_::FNODOBFM::_string_+40e2d
BUCKET_ID: X64_0x1e_c0000005_nt!_??_::FNODOBFM::_string_+40e2d
Followup: MachineOwner
---------
The BSOD happends always at the same operation: the smartcard hash computation, but it is not systematic bacause it can work for many computation before crashing and sometimes it works for some days without errors.
I tried to read about last Windows update of that machine done few days before the first crash:
KB2952664, KB3137061, KB3138901, KB3142042, KB3145739, KB3146706, KB3146963, KB3147071, KB3148198, KB3148851, KB3149090 but nothing seems to be connected to smartcard, I also tried to uninstall them without luck.
After few days another POS computer started to have the same problem with the same error. That was a system with another computer hardware but with the same smartcard, smartcard reader and thermal recepit printer. I underline that those systems well worked for many times and the only recent change I could think of are windows updates.
At the end I solved upgrading O.S. from Windows 7 to Windows 10, but this is not really a solution!
Can you tell me how to read that dump details? Is there any information I'm not considering?
Answer
It looks very much as if your problem is in iusb3xhc.sys, a driver for a USB 3 host controller.
The debugger's built-in analysis tool, the !analyze -v
command, arrived at this conclusion. To use this you must install the "Debugging Tools for Windows" package and configure the symbol file path. Then open the dump file in WinDbg and type !analyze -v at the command prompt.
To do it manually, open the dump file, then use the kv command:
0: kd> kv
Child-SP RetAddr : Args to Child : Call Site
fffff880`0701af68 fffff800`02f3f512 : 00000000`0000001e ffffffff`c0000005 00000000`00000000 00000000`00000008 : nt!KeBugCheckEx
fffff880`0701af70 fffff800`02ebea02 : fffff880`0701b748 fffffa80`c0000120 fffff880`0701b7f0 00000000`00000000 : nt! ?? ::FNODOBFM::`string'+0x40e2d
fffff880`0701b610 fffff800`02ebd57a : 00000000`00000008 00000000`00000000 00000000`00000200 fffffa80`c0000120 : nt!KiExceptionDispatch+0xc2
fffff880`0701b7f0 00000000`00000000 : 00000000`00000000 00000000`00000000 fffff880`0507cf00 fffffa80`08239bb8 : nt!KiPageFault+0x23a (TrapFrame @ fffff880`0701b7f0)
This shows a very short stack. But at the right edge of the last line we see a call from KiPageFault (indicating that processing of a memory access fault was in progress) with a "Trap frame" indication. The trap frame records the state of the processor at the point of the page fault exception. The debugger's .trap
command lets us set the debugger's state (some of it, anyway) to what is recorded in the trap frame:
0: kd> .trap fffff880`0701b7f0
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed.
rax=fffffa800737ee70 rbx=0000000000000000 rcx=fffffa8008f13a20
rdx=fffffa800737ee70 rsi=0000000000000001 rdi=0000000000000000
rip=0000000000000000 rsp=fffff8800701b980 rbp=0000000000000000
r8=fffffa800398b010 r9=fffff8000303de80 r10=fffffa80036fb570
r11=fffffa8004a55c10 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei ng nz na pe nc
00000000`00000000 ?? ???
Now we try the kv
command again:
0: kd> kv
*** Stack trace for last set context - .thread/.cxr resets it
Child-SP RetAddr : Args to Child : Call Site
fffff880`0701b980 00000000`00000000 : 00000000`00000000 fffff880`0507cf00 fffffa80`08239bb8 fffff880`0701ba18 : 0x0
fffff880`0701b988 00000000`00000000 : fffff880`0507cf00 fffffa80`08239bb8 fffff880`0701ba18 fffff880`0507cf60 : 0x0
fffff880`0701b990 fffff880`0507cf00 : fffffa80`08239bb8 fffff880`0701ba18 fffff880`0507cf60 fffffa80`08f13a50 : 0x0
fffff880`0701b998 fffffa80`08239bb8 : fffff880`0701ba18 fffff880`0507cf60 fffffa80`08f13a50 00000000`00000000 : iusb3xhc+0x7cf00
fffff880`0701b9a0 fffff880`0701ba18 : fffff880`0507cf60 fffffa80`08f13a50 00000000`00000000 00000000`0000004f : 0xfffffa80`08239bb8
fffff880`0701b9a8 fffff880`0507cf60 : fffffa80`08f13a50 00000000`00000000 00000000`0000004f fffff880`009f1380 : 0xfffff880`0701ba18
fffff880`0701b9b0 fffffa80`08f13a50 : 00000000`00000000 00000000`0000004f fffff880`009f1380 00000000`00000000 : iusb3xhc+0x7cf60
fffff880`0701b9b8 00000000`00000000 : 00000000`0000004f fffff880`009f1380 00000000`00000000 00000000`00000100 : 0xfffffa80`08f13a50
fffff880`0701b9c0 00000000`0000004f : fffff880`009f1380 00000000`00000000 00000000`00000100 00000000`00000000 : 0x0
fffff880`0701b9c8 fffff880`009f1380 : 00000000`00000000 00000000`00000100 00000000`00000000 00000000`00000001 : 0x4f
fffff880`0701b9d0 00000000`00000000 : 00000000`00000100 00000000`00000000 00000000`00000001 fffff880`009f1f60 : 0xfffff880`009f1380
fffff880`0701b9d8 00000000`00000100 : 00000000`00000000 00000000`00000001 fffff880`009f1f60 fffff800`02eb4fbd : 0x0
fffff880`0701b9e0 00000000`00000000 : 00000000`00000001 fffff880`009f1f60 fffff800`02eb4fbd fffffa80`04807810 : 0x100
fffff880`0701b9e8 00000000`00000001 : fffff880`009f1f60 fffff800`02eb4fbd fffffa80`04807810 00000000`00000000 : 0x0
fffff880`0701b9f0 fffff880`009f1f60 : fffff800`02eb4fbd fffffa80`04807810 00000000`00000000 00000000`00000000 : 0x1
fffff880`0701b9f8 fffff800`02eb4fbd : fffffa80`04807810 00000000`00000000 00000000`00000000 00000000`00000000 : 0xfffff880`009f1f60
fffff880`0701ba00 fffff800`02ec48c2 : 00000000`00000001 00000000`00000000 00000000`0000004f fffffa80`036c66d0 : nt!KiCommitThreadWait+0x3dd
fffff880`0701ba90 fffff800`0319365f : fffffa80`04807b40 fffffa80`04807b40 00000000`00000000 fffffa80`0000004f : nt!KeDelayExecutionThread+0x186
fffff880`0701bb00 fffff800`03193fed : 00000000`00000000 ffffffff`fffe7960 00000000`00000000 00000000`00000000 : nt!IoCancelThreadIo+0x6f
fffff880`0701bb30 fffff800`03194651 : 00000000`00000000 fffff800`03158400 fffffa80`075e6100 00000000`00000000 : nt!PspExitThread+0x58d
This stack is corrupt (notice all the 0's where there should be call site addresses) but it is clear that calls out of the iusb3xhc.sys driver were in process.
Suggested solution: I'm pretty sure that driver was written by Intel. Go to the Intel web site and see if there's a more recent version than the one Microsoft supplies. If there isn't, or if it doesn't help, try earlier ones. Last resort: disable the USB 3 host controller and live with USB 2 speeds until a better driver comes along.
No comments:
Post a Comment