Saturday, October 24, 2009

Heap corruption in managed code

It was reported that the SSH server would crash during a download when the client changed the max packet size from 30000 to 50000. It looked like a classic buffer overflow.

My initial suspicion was heap corruption caused by a buffer overflow.

The SSH server is written in C#, with its data access and crypto library in native C++. When I ran it under Application Verifier with full page heap enabled, WinDbg showed the following error.

(1244.1004): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=21630953 ebx=00000000 ecx=02f2b024 edx=00000000 esi=02efae5c edi=09f6f0ec
eip=09b5b6eb esp=09f6efdc ebp=09f6f0fc iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
Missing image name, possible paged-out or corrupt data.
+0x9b5b6ea:
09b5b6eb ff505c call dword ptr [eax+5Ch] ds:0023:216309af=????????

0:019> kb
ChildEBP RetAddr Args to Child
WARNING: Frame IP not in any known module. Following frames may be wrong.
09f6f0fc 09b5a602 02f02ba0 02f68a50 02f2b024 +0x9b5b6ea
*** WARNING: Unable to verify checksum for C:\WINDOWS\assembly\NativeImages_v2.0.50727_32\System\3de5bd01124463d7862bd173af90bc83\System.ni.dll
09f6f410 7a57ee09 0148c220 02efad80 02f30b08 +0x9b5a601
09f6f448 7a581eba 02f689f8 792f5681 00000000 System_ni+0x13ee09
09f6f460 79e71b4c 09f6f480 0147fb6a 09f6f4f0 System_ni+0x141eba
09f6f470 79e821b9 09f6f540 00000000 09f6f510 mscorwks!CallDescrWorker+0x33
09f6f4f0 79e96531 09f6f540 00000000 09f6f510 mscorwks!CallDescrWorkerWithHandler+0xa3
09f6f630 79e96564 79241ff0 09f6f764 09f6f684 mscorwks!MethodDesc::CallDescr+0x19c
09f6f64c 79e96582 79241ff0 09f6f764 09f6f684 mscorwks!MethodDesc::CallTargetWorker+0x1f
09f6f664 79f6a259 09f6f684 80acc23f 02a3a6b8 mscorwks!MethodDescCallSite::CallWithValueTypes_RetArgSlot+0x1a
09f6f830 79f6a3ae 09f6f8c0 80acc2ef 02f68a40 mscorwks!ExecuteCodeWithGuaranteedCleanupHelper+0x9f
*** WARNING: Unable to verify checksum for C:\WINDOWS\assembly\NativeImages_v2.0.50727_32\mscorlib\7124a40b9998f7b63c86bd1a2125ce26\mscorlib.ni.dll
09f6f8e0 792f5577 09f6f884 02f30b6c 02f689d8 mscorwks!ReflectionInvocation::ExecuteCodeWithGuaranteedCleanup+0x10f
09f6f8fc 792e01c5 00000000 02f30b6c 02f30b08 mscorlib_ni+0x235577
09f6f914 7a5825b1 00000000 00000000 00000000 mscorlib_ni+0x2201c5
09f6f930 7a57ed70 02f30b08 00000000 00000000 System_ni+0x1425b1
09f6f95c 7a5824b4 00000000 02f30b08 00000000 System_ni+0x13ed70
09f6f994 7928cdc4 02f2c39c 00000060 00000000 System_ni+0x1424b4
09f6f9b4 79e71b4c 02f2c39c 09f6f9d8 0147fb6a mscorlib_ni+0x1ccdc4
09f6f9c8 79e821b9 09f6fb64 00000001 09f6fb58 mscorwks!CallDescrWorker+0x33
09f6fa48 79e8281f 09f6fb64 00000001 09f6fb58 mscorwks!CallDescrWorkerWithHandler+0xa3
09f6fa68 79e82860 09f6fb60 00000001 09f6fb58 mscorwks!DispatchCallBody+0x1e

0:019> !clrstack
OS Thread Id: 0x1004 (19)
ESP EIP
09f6efdc 09b5b6eb SSHServerAPI.Transport.Core._ProcessReadPacket(SSHCommonAPI.SSH2BufferStream, Byte[], Byte[])
09f6f10c 09b5a602 SSHServerAPI.Transport.Core._OnPacketRecv(System.IAsyncResult)
09f6f418 7a57ee09 System.Net.LazyAsyncResult.Complete(IntPtr)
09f6f450 7a581eba System.Net.ContextAwareResult.CompleteCallback(System.Object)
09f6f458 792f5681 System.Threading.ExecutionContext.runTryCode(System.Object)
09f6f884 79e71b4c [HelperMethodFrame_PROTECTOBJ: 09f6f884] System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode, CleanupCode, System.Object)
09f6f8ec 792f5577 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
09f6f908 792e01c5 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
09f6f920 7a5825b1 System.Net.ContextAwareResult.Complete(IntPtr)
09f6f938 7a57ed70 System.Net.LazyAsyncResult.ProtectedInvokeCallback(System.Object, IntPtr)
09f6f968 7a5824b4 System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
09f6f9a0 7928cdc4 System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
09f6fb40 79e71b4c [GCFrame: 09f6fb40]


The output showed an access violation at 09b5b6eb, where the call through [eax+5Ch] touched an unreadable address. It seemed that the value in EAX came from memory that had been overwritten earlier. The stack trace gave no useful clue about where the buffer overflow occurred, but it did show that the access violation happened in the SSH server's managed code layer.

Then I used the SOS command !VerifyHeap to check the integrity of the managed heap.

0:019> !VerifyHeap
-verify will only produce output if there are errors in the heap
object 02efae5c: bad member 02f2b024 at 02efae84
curr_object : 02efae5c
Last good object: 02efae2c
----------------


0:019> !do 02efae5c
Name: SSHServerAPI.Transport.Core
MethodTable: 09205efc
EEClass: 09b43250
Size: 180(0xb4) bytes
(C:\ftp.server.7.5\debug\SSHServerAPI.dll)
Fields:
MT Field Offset Type VT Attr Value Name
...
79333470 40001df 20 System.Byte[] 0 instance 02f22760 m_buffer
79333470 40001e0 24 System.Byte[] 0 instance 02f19e9c m_ReadBuf
09206af8 40001e1 28 ....SSH2BufferStream 0 instance 02f2b024 m_ReadStream
...

0:019> !do 02f2b024
Invalid object

It indicated that m_ReadStream had been corrupted.

0:019> dd 02f2b024 L1
02f2b024 21630953

0:019> !dumpheap -type SSH2BufferStream
Address MT Size
02efde10 09206af8 60
02f05340 09206af8 60
02f14eec 09206af8 60
02f15074 09206af8 60
02f15304 09206af8 60
02f154a0 09206af8 60
02f15840 09206af8 60
02f15ae0 09206af8 60
object 02f2b024: does not have valid MT
curr_object : 02f2b024
Last good object: 02f22760
----------------
total 8 objects
Statistics:
MT Count TotalSize Class Name
09206af8 8 480 SSHCommonAPI.SSH2BufferStream
Total 8 objects

0:019> !do 02f15ae0
Name: SSHCommonAPI.SSH2BufferStream
MethodTable: 09206af8
EEClass: 09b45adc
Size: 60(0x3c) bytes
(C:\ftp.server.7.5\debug\SSHCommonAPI.dll)
Fields:
MT Field Offset Type VT Attr Value Name
7933061c 400018a 4 System.Object 0 instance 00000000 __identity
...

0:019> dd 02f15ae0 L1
02f15ae0 09206af8


Internally, a managed object starts with 8 bytes of metadata: the first 4 bytes hold the sync block index, and the next 4 bytes hold the method table address. The object address shown in WinDbg points at the method table address, so the sync block index sits in the 4 bytes just before it. Comparing the first DWORD of the invalid object with that of a valid one confirmed that the method table address of m_ReadStream at 02f2b024 had been overwritten.
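To make the comparison concrete, here is a small Python sketch that encodes the 32-bit layout described above and checks the first DWORD of the two objects against the SSH2BufferStream method table. The values are copied from the dd and register output above; the helper names are made up for illustration and are not part of SOS.

```python
# Values copied from the `dd <addr> L1` output above (32-bit process).
EXPECTED_MT = 0x09206AF8  # method table of SSHCommonAPI.SSH2BufferStream

first_dword = {
    0x02F15AE0: 0x09206AF8,  # a healthy SSH2BufferStream
    0x02F2B024: 0x21630953,  # m_ReadStream of the crashed Core object
}

def header_addresses(obj_addr):
    """Return (sync block index address, method table pointer address)."""
    return obj_addr - 4, obj_addr

def is_method_table_intact(obj_addr):
    """True if the object's first DWORD still holds the expected method table."""
    return first_dword[obj_addr] == EXPECTED_MT

for addr in first_dword:
    sync_addr, mt_addr = header_addresses(addr)
    print(f"{addr:08x}: sync block at {sync_addr:08x}, MT at {mt_addr:08x}, "
          f"intact={is_method_table_intact(addr)}")

# The bogus method table value is also the EAX seen at the crash, and the
# faulting instruction `call dword ptr [eax+5Ch]` read from this address:
fault_addr = first_dword[0x02F2B024] + 0x5C
print(f"{fault_addr:08x}")
```

The last line ties the corrupted DWORD back to the access violation: the address it prints is the one WinDbg showed as unreadable in the faulting call instruction.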

Right above m_ReadStream is m_ReadBuf, which happens to be a byte array whose size was hard-coded to 35000. When I changed its size to 40000, the SSH server no longer crashed during download. It looked as if an overflow of m_ReadBuf was the culprit. Then I tried to understand where and how in the code the overflow happened. I found that m_ReadBuf was used solely to read client responses. During a download, each client response would be much smaller than 35000 bytes, so m_ReadBuf should not have overflowed.

So the question was why the server no longer crashed when the m_ReadBuf size was changed from 35000 to 40000.

One possible explanation: when the m_ReadBuf size was 35000, the overflow happened to overwrite the next object; when the m_ReadBuf size was 40000, the next object was moved back by 5000 bytes, so the overflow only overwrote some unused memory, and therefore there was no crash!

A managed object in the managed heap starts with its method table address, which does not change for as long as the object stays at that address. So a data breakpoint can be used to break into the debugger when the method table address is overwritten. That way, the stack trace shows the exact location where the overflow occurs.
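The masking effect can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions: the "heap" is a flat byte array, the neighbouring object is reduced to a 4-byte marker, and the 8-byte overrun is an arbitrary illustrative amount. It is not how the CLR heap is actually managed.

```python
MARK = b"\xAA\xAA\xAA\xAA"  # stand-in for the neighbour's method table pointer

def neighbour_survives(buffer_size, slack, overrun):
    """Toy heap: [buffer][slack bytes][neighbour header].

    Writes buffer_size + overrun bytes from the start of the buffer with no
    bounds check and reports whether the neighbour's header is untouched.
    """
    header_at = buffer_size + slack
    heap = bytearray(header_at + len(MARK))
    heap[header_at:] = MARK
    written = min(buffer_size + overrun, len(heap))  # stay inside the toy heap
    heap[:written] = bytes(written)                  # unchecked native-style write
    return bytes(heap[header_at:]) == MARK

print(neighbour_survives(35000, slack=0, overrun=8))     # False: header clobbered
print(neighbour_survives(35000, slack=5000, overrun=8))  # True: overrun lands in slack
```

In both runs the same out-of-bounds write happens. Only the layout decides whether it is visible, which is why a crash that disappears after resizing a buffer does not mean the bug is gone.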

The following steps were used to set a data breakpoint.

0:022> !dumpheap -type SSHServerAPI.SFTP.Subsystem
Address MT Size
02f1af4c 0920af7c 84
total 1 objects
Statistics:
MT Count TotalSize Class Name
0920af7c 1 84 SSHServerAPI.SFTP.Subsystem
Total 1 objects

0:022> !dumpmt -md 0920af7c
EEClass: 0a1d64b4
Module: 02534164
Name: SSHServerAPI.SFTP.Subsystem
mdToken: 02000033 (C:\ftp.server.7.5\debug\SSHServerAPI.dll)
BaseSize: 0x54
ComponentSize: 0x0
Number of IFaces in IFaceMap: 1
Slots in VTable: 72
--------------------------------------
MethodDesc Table
Entry MethodDesc JIT Name
...
06ad4691 0920ad10 NONE SSHServerAPI.SFTP.Subsystem.FXP_Read()
...

0:022> !bpmd -md 0920ad10
MethodDesc = 0920ad10
Adding pending breakpoints...
sxe -c "!bpmd -notification;g" clrn


First, I set a managed breakpoint on SSHServerAPI.SFTP.Subsystem.FXP_Read, which is the core function for downloads.

0:022> g
Closing _RecordsetPtr (m_rset)
Closing _RecordsetPtr (m_rset)
ModLoad: 605d0000 605d9000 C:\WINDOWS\system32\mslbui.dll
(111c.f2c): CLR notification exception - code e0444143 (first chance)
JITTED SSHServerAPI!SSHServerAPI.SFTP.Subsystem.FXP_Read()
Setting breakpoint: bp 0A446030 [SSHServerAPI.SFTP.Subsystem.FXP_Read()]
bp 0A446030
Breakpoint: JIT notification received for method SSHServerAPI.SFTP.Subsystem.FXP_Read().
Breakpoint 0 hit
eax=0920ad10 ebx=00000000 ecx=02f05e9c edx=00000002 esi=09e6ef64 edi=02f001f4
eip=0a446030 esp=09e6ecec ebp=09e6edd8 iopl=0 nv up ei pl nz na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206
+0xa44602f:
0a446030 55 push ebp

0:018> !dumpheap -type SSHServerAPI.Transport.Core
Address MT Size
02efae04 09205efc 180 ThinLock owner 6 (0132efa0) Recursive 0
02f224d8 09205efc 180
total 2 objects
Statistics:
MT Count TotalSize Class Name
09205efc 2 360 SSHServerAPI.Transport.Core
Total 2 objects

0:018> !do 02efae04
Name: SSHServerAPI.Transport.Core
MethodTable: 09205efc
EEClass: 09b42cac
Size: 180(0xb4) bytes
(C:\ftp.server.7.5\debug\SSHServerAPI.dll)
Fields:
MT Field Offset Type VT Attr Value Name
...
79333470 40001df 20 System.Byte[] 0 instance 02f1064c m_buffer
79333470 40001e0 24 System.Byte[] 0 instance 02f07d88 m_ReadBuf
09206af8 40001e1 28 ....SSH2BufferStream 0 instance 02f18f10 m_ReadStream
...

0:018> !do 02f18f10
Name: SSHCommonAPI.SSH2BufferStream
MethodTable: 09206af8
EEClass: 09b45538
Size: 60(0x3c) bytes
(C:\ftp.server.7.5\debug\SSHCommonAPI.dll)
Fields:
MT Field Offset Type VT Attr Value Name
7933061c 400018a 4 System.Object 0 instance 00000000 __identity
...

0:018> ba w4 02f18f10

When the breakpoint hit, I retrieved the address of m_ReadStream, checked that the object was still valid, and then set a data breakpoint on its first four bytes, which contain the address of its method table.

There were two instances of SSHServerAPI.Transport.Core; the same steps could be used to set a data breakpoint for the other instance.

0:018> g
Breakpoint 1 hit
eax=00000058 ebx=02f18f04 ecx=7ae6d4c1 edx=0000000c esi=02f18ef4 edi=041429f8
eip=0fb76a99 esp=09e6e604 ebp=00000010 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
*** WARNING: Unable to verify checksum for C:\ftp.server.7.5\debug\IPSLIBEAY32.dll
IPSLIBEAY32!AES_cbc_encrypt+0xc9:
0fb76a99 42 inc edx

0:018> !do 02efae04
Name: SSHServerAPI.Transport.Core
MethodTable: 09205efc
EEClass: 09b42cac
Size: 180(0xb4) bytes
(C:\ftp.server.7.5\debug\SSHServerAPI.dll)
Fields:
MT Field Offset Type VT Attr Value Name
...
79333470 40001df 20 System.Byte[] 0 instance 02f1064c m_buffer
79333470 40001e0 24 System.Byte[] 0 instance 02f07d88 m_ReadBuf
09206af8 40001e1 28 ....SSH2BufferStream 0 instance 02f18f10 m_ReadStream
...

0:018> !do 02f18f10
Invalid object

When the data breakpoint hit, I confirmed that m_ReadStream was indeed no longer valid.

0:018> kb
ChildEBP RetAddr Args to Child
09e6e600 79e71bb0 0000000e 0148c220 0fbc98b0 IPSLIBEAY32!AES_cbc_encrypt+0xc9
09e6e62c 0fb95fb7 0413a158 02f10664 000088b0 mscorwks!GetThreadGeneric+0xe
09e6e64c 0fb96bfa 05c97af0 02f10654 05c97b20 IPSLIBEAY32!EVP_aes_192_ecb+0x37
09e6e65c 0fb96b79 05c97af0 02f10664 0413a158 IPSLIBEAY32!EVP_EncryptUpdate+0x12a
09e6e670 79e71d71 09e6e6a8 00000008 0132efa0 IPSLIBEAY32!EVP_EncryptUpdate+0xa9
09e6e68c 79e71d8b 05c97af0 02f10664 09e6e730 mscorwks!PInvokeCalliWorker+0x35
09e6e7b8 7c929fef 0148c220 02fd788c 02fd78c8 mscorwks!PInvokeCalliReturnFromCall
09e6e818 02fd76f8 02fd76f8 02fd7358 02fd7358 ntdll!RtlAcquireResourceShared+0x120
WARNING: Frame IP not in any known module. Following frames may be wrong.
09e6e824 02fd7358 02fd7358 00000000 00000000 +0x2fd76f7
09e6e828 02fd7358 00000000 00000000 02fd76f8 +0x2fd7357
09e6e82c 00000000 00000000 02fd76f8 02fd71a4 +0x2fd7357

0:018> !clrstack
OS Thread Id: 0xf2c (18)
ESP EIP
09e6e6fc 0fb76a99 [PInvokeCalliFrame: 09e6e6fc]
09e6e71c 09b59311 UtilAPI.SymmetricTransform.TransformBlock(Byte[], Int32, Int32, Byte[], Int32)
09e6e75c 09b587d6 SSHServerAPI.Transport.Core.SendPacket(SSHCommonAPI.Transport.PacketOut)
09e6e9c8 0a1ebc7e SSHServerAPI.Transport.Channel.SendSSHPackets(SSHCommonAPI.SSH2BufferStream ByRef)
09e6ea7c 0a1eb6c1 SSHServerAPI.Transport.Channel.SendPacket(SSHCommonAPI.SFTP.SFTPPacket)
09e6eb70 0a4465ed SSHServerAPI.SFTP.Subsystem.FXP_Read()
09e6ecf0 0a1eabb0 SSHServerAPI.SFTP.Subsystem.DispathPacket()
09e6ede0 0a1ea5c7 SSHServerAPI.SFTP.Subsystem.ProcessData(SSHCommonAPI.SSH2BufferStream)
09e6ef0c 0a1ea0fa SSHServerAPI.Transport.Channel.ChannelData(SSHCommonAPI.SSH2BufferStream)
09e6ef94 0a1e9e79 SSHServerAPI.Transport.PacketDispatch.HandleChannelData()
09e6efcc 0a1e7a89 SSHServerAPI.Transport.PacketDispatch._ChannelDispatch()
09e6f01c 09b5bb59 SSHServerAPI.Transport.PacketDispatch.Process(SSHCommonAPI.SSH2BufferStream, Byte[], Byte[])
09e6f064 09b5b7cb SSHServerAPI.Transport.Core._ProcessReadPacket(SSHCommonAPI.SSH2BufferStream, Byte[], Byte[])
09e6f18c 09b5a72a SSHServerAPI.Transport.Core._OnPacketRecv(System.IAsyncResult)
09e6f498 7a57ee09 System.Net.LazyAsyncResult.Complete(IntPtr)
09e6f4d0 7a581eba System.Net.ContextAwareResult.CompleteCallback(System.Object)
09e6f4d8 792f5681 System.Threading.ExecutionContext.runTryCode(System.Object)
09e6f904 79e71b4c [HelperMethodFrame_PROTECTOBJ: 09e6f904] System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode, CleanupCode, System.Object)
09e6f96c 792f5577 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
09e6f988 792e01c5 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
09e6f9a0 7a5825b1 System.Net.ContextAwareResult.Complete(IntPtr)
09e6f9b8 7a57ed70 System.Net.LazyAsyncResult.ProtectedInvokeCallback(System.Object, IntPtr)
09e6f9e8 7a5824b4 System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
09e6fa20 7928cdc4 System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
09e6fbc0 79e71b4c [GCFrame: 09e6fbc0]

From the stack trace, the overwrite happened during AES encryption, reached from SendPacket through a P/Invoke call into the native crypto library. Now I had the exact location where the overwrite occurred!

0:018> uf IPSLIBEAY32!AES_cbc_encrypt
IPSLIBEAY32!AES_cbc_encrypt:
0fb769d0 55 push ebp
0fb769d1 57 push edi
0fb769d2 56 push esi
0fb769d3 53 push ebx
0fb769d4 83ec1c sub esp,1Ch
0fb769d7 837c244401 cmp dword ptr [esp+44h],1 ; a flag
0fb769dc 8b7c2430 mov edi,dword ptr [esp+30h] ; input buffer
0fb769e0 8b5c2434 mov ebx,dword ptr [esp+34h] ; output buffer
0fb769e4 8b6c2438 mov ebp,dword ptr [esp+38h] ; length
0fb769e8 8b742440 mov esi,dword ptr [esp+40h]
0fb769ec 0f848f000000 je IPSLIBEAY32!AES_cbc_encrypt+0xb1 (0fb76a81)
...
IPSLIBEAY32!AES_cbc_encrypt+0xc0:
0fb76a90 8a0432 mov al,byte ptr [edx+esi]
0fb76a93 32043a xor al,byte ptr [edx+edi]
0fb76a96 88041a mov byte ptr [edx+ebx],al
0fb76a99 42 inc edx
0fb76a9a 83fa0f cmp edx,0Fh
0fb76a9d 76f1 jbe IPSLIBEAY32!AES_cbc_encrypt+0xc0 (0fb76a90)

IPSLIBEAY32!AES_cbc_encrypt+0xcf:
0fb76a9f ebc5 jmp IPSLIBEAY32!AES_cbc_encrypt+0x96 (0fb76a66)
...

0:018> r
eax=00000058 ebx=02f18f04 ecx=7ae6d4c1 edx=0000000c esi=02f18ef4 edi=041429f8
eip=0fb76a99 esp=09e6e604 ebp=00000010 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
IPSLIBEAY32!AES_cbc_encrypt+0xc9:
0fb76a99 42 inc edx


From the assembly, EDX was the byte counter, EDI was the input buffer, and EBX was the output buffer. The data breakpoint fires after the write completes, so the instruction that triggered it was the store at 0fb76a96, which wrote to (EDX+EBX) = (0000000c+02f18f04) = 02f18f10. And 02f18f10 happened to be the address of m_ReadStream. So during the encryption, the output buffer had been written past its end.

With that information, I traced the code again and found that during AES CBC mode encryption, the output ciphertext could sometimes be longer than the input plaintext. However, the code assumed that the input plaintext and the output ciphertext would always have the same length. So when the ciphertext was longer than the plaintext, the write ran past the end of the output buffer.
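The address arithmetic can be double-checked with a few lines of Python. All values are copied from the register dump and the !do output above. The last computation is an extra observation of mine, offered as a reading of the dump rather than something established above: the distance from m_buffer to m_ReadStream would fit a 35000-byte array plus a 12-byte array header on 32-bit CLR, assuming m_buffer was also allocated with 35000 elements.

```python
ebx = 0x02F18F04            # output pointer for the current 16-byte block
edx = 0x0000000C            # byte index inside the block
m_read_stream = 0x02F18F10  # from !do 02efae04
m_buffer = 0x02F1064C       # from the same !do output

# Destination of `mov byte ptr [edx+ebx],al`
target = ebx + edx
print(f"{target:08x}")          # 02f18f10
print(target == m_read_stream)  # True: the store hit the method table pointer

# Distance between the two objects, in bytes
gap = m_read_stream - m_buffer
print(gap, hex(gap))            # 35012 0x88c4
```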
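As a rough guide for sizing the output buffer, here is a hedged Python sketch. The first function reflects the bound given in the OpenSSL documentation for EVP_EncryptUpdate, which may write up to inl + cipher_block_size - 1 bytes. The second assumes PKCS#7-style padding, which may or may not be what the server's crypto wrapper used. Both function names are hypothetical.

```python
AES_BLOCK = 16  # AES block size in bytes

def max_update_output(input_len, block_size=AES_BLOCK):
    """Upper bound on bytes EVP_EncryptUpdate may write for input_len bytes."""
    return input_len + block_size - 1

def padded_length(input_len, block_size=AES_BLOCK):
    """Total ciphertext length with PKCS#7-style padding (assumed here).

    Padding always adds between 1 and block_size bytes, so an input that is
    already block-aligned still grows by one full block.
    """
    return (input_len // block_size + 1) * block_size

for n in (16, 34992, 35000):
    print(n, max_update_output(n), padded_length(n))
```

An output buffer sized from either bound, rather than from the plaintext length, would leave room for the extra bytes.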

Friday, October 16, 2009

Debug a child process

Sometimes the debuggee is a child process that can only be launched by its parent process.

The following two cases need to be considered.
First, the parent process can be launched directly, as with the command prompt. In that case, windbg -o processname can be used to debug all of its child processes. For instance, "Why did DllRegisterServer return 80070006?" provides an example of how to debug a specific child process launched by the command prompt.

Second, the parent process cannot be launched directly, as with services. In that case, the WinDbg meta-command .childdbg (Debug Child Processes) can be used.

For instance, to debug aspnet_wp.exe (the ASP.NET worker process under IIS) from its startup, the following steps can be used.
1. iisreset
It will kill aspnet_wp.exe.
2. attach windbg to inetinfo.exe
3. .childdbg 1
It will enable child process debugging in WinDbg.
4. send an ASP.NET request to IIS
IIS will launch aspnet_wp.exe when the first ASP.NET request comes in.

Thursday, October 8, 2009

Pseudo-registers for Visual Studio IDE

The post by Gregg, "Whidbey Debugger pseudo-register - $user", introduces a new pseudo-register, $user. The pseudo-register provides "loads of information about the debuggee user" and is helpful when debugging security-related issues. In addition, '@err,hr' prints the last Win32 error formatted as an error message, which is also very handy.

The post by Kenny Kerr, "X64 Debugging With Pseudo Variables And Format Specifiers", provides a more comprehensive list of pseudo-registers.