Wednesday, February 22, 2012
A helpful post about C++ projects that always rebuild after upgrading to VS2010
When the CLR JIT meets floating-point overflow exceptions
Last year, one of my colleagues hit a weird problem: the managed DLLs he developed threw an exception when integrated into one version of a CAD system, MicroStation, while they worked well with all other versions. The exception reported a floating-point overflow.
After investigating the root cause, we learned that the CLR JIT needs the floating-point overflow exception to be masked, while some applications unmask it. This can cause trouble especially when an SDK written in C# is integrated into an older host application.
The thread at http://social.msdn.microsoft.com/Forums/en/clr/thread/b3505262-4e01-4e21-bea6-ce897caf4186 also discusses a similar case.
To fix the problem, we can simply call
_control87(MCW_EM, MCW_EM);
to mask the related exceptions at the entry point of our SDK DLL.
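As a rough sketch of where that call might live, the snippet below wraps it in a helper that an SDK entry point could call before any managed code runs. The helper names are hypothetical, and the non-MSVC branch (std::feholdexcept) is only a portable stand-in so that the sketch builds elsewhere; it is not what we used.

```cpp
#include <cfenv>
#include <cfloat>
#include <cmath>

// Hypothetical helper: call once at the SDK entry point, before any
// managed code is compiled or run on this thread.
inline void MaskFloatingPointExceptions() {
#if defined(_MSC_VER)
    // Set every bit covered by the exception mask, i.e. mask all
    // floating-point exceptions (same effect as the call above).
    _control87(_MCW_EM, _MCW_EM);
#else
    // Portable stand-in (assumption): switch to non-stop mode, so that
    // floating-point exceptions only set flags instead of trapping.
    std::fenv_t saved;
    std::feholdexcept(&saved);
#endif
}

// Hypothetical check: with exceptions masked, an overflow quietly
// produces infinity instead of raising an exception.
inline bool OverflowYieldsInfinity() {
    MaskFloatingPointExceptions();
    volatile double big = DBL_MAX;
    const double result = big * 2.0;
    return std::isinf(result);
}
```

Keep in mind that the floating-point control word is per thread, so the helper has to run on the thread that later executes the managed code.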
Who causes the failure of mt.exe during the linking stage of Visual Studio 2010
Recently I have been migrating our products from VS2008 to Visual Studio 2010, involving both C++/C# and extensive interop, and I have seen many more link failures caused by mt.exe failing to access the target exe.
Some people think most such cases are caused by anti-virus products. With VS2008 I also got a few failures, and after analyzing the logs, some of them turned out to be caused by devenv.exe rather than anti-virus software. Since it did not happen often, I did not dig into it. Recently it has happened far too often, so I investigated.
My investigation shows that the culprit is ReSharper. Troubleshooting with procmon shows that just before mt.exe fails to open the target, devenv.exe is holding it open; the call stack shows that ReSharper is responsible, and its open operation lasts more than 2 seconds. I suspect that the VS2008 failures were also caused by ReSharper.
To disable ReSharper, choose Tools - Options - ReSharper - Suspend. After disabling it, the failures became much less frequent.
The call stack in my case is shown below for your reference:
Method instance: (BEGIN=0250fc90)(MD=06001353)[JetBrains.Util.CollectionUtil.ForEach[[System.__Canon, mscorlib]](System.Collections.Generic.IEnumerable`1, System.Action`1)]
Method instance: (BEGIN=29bc5910)(MD=06001838)[JetBrains.ReSharper.Psi.Impl.Caches2.CacheUpdater.ExecuteMultiCore[[System.__Canon, mscorlib]](System.Collections.Generic.ICollection`1, System.String, System.Action`1)]
Method instance: (BEGIN=29bc54c0)(MD=06001837)[JetBrains.ReSharper.Psi.Impl.Caches2.CacheUpdater.ExecuteMulticoreWithInterrupt[[System.__Canon, mscorlib]](System.Collections.Generic.ICollection`1, System.String, System.Action`1)]
Method instance: (BEGIN=29bc4fd0)(MD=06001853)[JetBrains.ReSharper.Psi.Impl.Caches2.CacheUpdater+AddAssembliesJob.Do(JetBrains.Application.Progress.IProgressIndicator)]
Method instance: (BEGIN=29bc3130)(MD=06001932)[JetBrains.ReSharper.Psi.Impl.Caches2.CacheUpdateThread.Run_ExecJob(JetBrains.ReSharper.Psi.Caches.Job, JetBrains.ReSharper.Psi.Impl.Caches2.CacheWorkItemSubprogress)]
Method instance: (BEGIN=1a2b3480)(MD=06001931)[JetBrains.ReSharper.Psi.Impl.Caches2.CacheUpdateThread.Run()]
Method instance: (BEGIN=1120f350)(MD=06000aeb)[JetBrains.Util.Logger.Catch(System.Action)]
Method instance: (BEGIN=1a2b33d0)(MD=06001935)[JetBrains.ReSharper.Psi.Impl.Caches2.CacheUpdateThread.b__1()]
Tuesday, February 21, 2012
One possible solution to a C++ static object destruction problem
I just found an interesting post at http://www.missdeer.com/articles/1560 (sorry, it is written in Chinese, but you can google-translate it).
The post describes a problem in the authors' product: when the process enters the _exit() call, the CLR cleanup code calls into a global object that has already been destroyed, and the process crashes.
I don’t know their exact situation, but as far as I know, we can use the “#pragma init_seg()” family to control the construction and destruction order of global static objects. For instance, “#pragma init_seg(compiler)” guarantees that the objects defined in that translation unit are constructed before any other objects and destroyed after all others. cin and cout in the CRT are implemented with that option.
The init_seg family also includes:
#pragma init_seg(lib)
#pragma init_seg(user)
#pragma init_seg("user defined segment")
Construction follows the order above: the compiler group is constructed before the lib group, which is constructed before the user group. As far as I know, objects in a user-defined segment are not constructed automatically, so that form needs extra work.
Destruction happens in the reverse order of construction.
I have not tried this on their case, but it may be a solution.
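A minimal sketch of how this could look with MSVC, assuming the global that the CLR cleanup code still needs lives in a translation unit of its own. SdkRegistry and g_registry are hypothetical names; the pragma applies to the whole file and is guarded so that other compilers simply skip it.

```cpp
// Assumption: this file is the translation unit that owns the global
// which cleanup code may still call while the process is exiting.
#if defined(_MSC_VER)
// MSVC-specific: place this file's static initializers in the "lib"
// group. They run before the default "user" group of the other
// translation units, and the objects are destroyed after them.
#pragma init_seg(lib)
#endif

// Hypothetical type standing in for the object that died too early.
struct SdkRegistry {
    SdkRegistry()  { ++liveCount; }
    ~SdkRegistry() { --liveCount; }
    static int liveCount;
};

// Constant-initialized, so it is ready before any constructor runs.
int SdkRegistry::liveCount = 0;

// Constructed early and destroyed late, thanks to the pragma above.
SdkRegistry g_registry;
```

Within a single translation unit the order is still plain declaration order; the pragma only changes how this file is ordered relative to the others.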
Assembly loading failure due to the Zone.Identifier alternate data stream
Recently, one of my colleagues sent some experimental SDK assemblies to internal developers in China, who found that the assemblies could not be loaded at all. After collecting logs and a dump taken when the exception was thrown, we finally found that the cause was the attached alternate data stream named “Zone.Identifier”.
Browsers such as IE and Firefox attach an alternate data stream to downloaded DLLs and documents. Just as a downloaded CHM file will not open properly until it is unblocked in the file properties dialog, the same applies to CLR assemblies. After manually unblocking the assembly, everything was fine.
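When many files are affected, the unblocking could also be done in code. The sketch below is only an illustration under the assumption that deleting the Zone.Identifier stream has the same effect as the Unblock button; the helper names are hypothetical, and the non-Windows branch exists only so that the snippet compiles everywhere.

```cpp
#include <string>
#if defined(_WIN32)
#include <windows.h>
#endif

// Build the name of a file's Zone.Identifier alternate data stream.
// NTFS addresses a named stream as "<file path>:<stream name>".
inline std::string ZoneIdentifierStream(const std::string& filePath) {
    return filePath + ":Zone.Identifier";
}

// Hypothetical helper: delete only the stream and leave the file itself
// untouched. Returns true if the stream was removed.
inline bool UnblockFile(const std::string& filePath) {
#if defined(_WIN32)
    return DeleteFileA(ZoneIdentifierStream(filePath).c_str()) != 0;
#else
    (void)filePath;
    return false;  // no NTFS alternate data streams on this platform
#endif
}
```

A call such as UnblockFile("Sdk.dll") (the file name is made up) would then replace the manual step in the properties dialog.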
Friday, February 17, 2012
Find a COM call’s target process/thread info
Last month, one of my colleagues found his program stuck in an out-of-proc COM call, without knowing which process/thread was the target. After getting a dump, we tried sieextpub!comcalls, and the output was:
0:000> !comcalls
Thread 0 - STA
Target Process ID: 24b83444 = 616051780
Target Thread ID: 1591da22 (STA - Possible junk values)
This output is obviously unreasonable. The extension is quite old and may no longer be correct, so we need to find another way.
The original call stack is:
ChildEBP RetAddr Args to Child
0084f054 758e0bdd 00000002 0084f0a4 00000001 ntdll!NtWaitForMultipleObjects+0x15
0084f0f0 76c21a2c 0084f0a4 0084f118 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x100
0084f138 76b3086a 00000002 fffde000 00000000 kernel32!WaitForMultipleObjectsExImplementation+0xe0
0084f18c 755d2bf1 00000048 0084f1d8 000003e8 user32!RealMsgWaitForMultipleObjectsEx+0x14d
0084f1b8 755d2d31 0084f1d8 000003e8 0084f1e8 ole32!CCliModalLoop::BlockFn+0xa1
0084f1e0 756ed2f6 ffffffff 19eab9d0 0ec2c48c ole32!ModalLoop+0x5b
0084f1fc 756ed098 00000000 0084f304 00000000 ole32!ThreadSendReceive+0x12d
0084f228 756ecef0 0084f2f0 0eca2670 0084f34c ole32!CRpcChannelBuffer::SwitchAptAndDispatchCall+0x1a7
0084f308 755d2cba 0eca2670 0084f434 0084f41c ole32!CRpcChannelBuffer::SendReceive2+0xef
0084f324 755e9aa1 0084f434 0084f41c 0eca2670 ole32!CCliModalLoop::SendReceive+0x1e
0084f3a0 755e9b24 0eca2670 0084f434 0084f41c ole32!CAptRpcChnl::SendReceive+0x73
The first argument of CRpcChannelBuffer::SendReceive2 (0eca2670 here) can be used to find the target process/thread info.
0:000> dd 0eca2670
0eca2670 75607c08 755e92c0 00000003 0000000a
0eca2680 00000000 00000000 0027a960 0027c340
0eca2690 0ec2c488 0b4ede60 75606e70 00070005
0eca26a0 00000000 000024b8 00002c28 00000000
0eca26b0 75607c08 755e92c0 00000001 00000001
0eca26c0 00000000 00000000 0027a960 00000000
0eca26d0 00000000 0b4edf50 75606e70 00070005
0eca26e0 00000000 000024b8 00002c28 00000000
0:000> dd 0027a960
0027a960 0ef674f0 0027a8e0 00003444 000024b8
0027a970 744021e9 eb1dc45b e6e85a30 79543058
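The second dump follows the pointer 0027a960 found at offset 0x18 of the first one. In it, the values “00003444 000024b8” at offsets 0x8 and 0xc are the target process ID and thread ID (in hex).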
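The lookup can be written down as a small sketch. The offsets are simply what I observed in this 32-bit dump, not a documented ole32 layout, so treat them as assumptions; all the names below are made up for the illustration, and the arrays are copied from the dd output above.

```cpp
#include <cstddef>
#include <cstdint>

// Offsets observed in this particular 32-bit dump (assumptions, not a
// documented ole32 structure layout).
constexpr std::size_t kChannelToTargetPtr = 0x18;  // in the SendReceive2 argument
constexpr std::size_t kTargetProcessId    = 0x08;  // in the pointed-to block
constexpr std::size_t kTargetThreadId     = 0x0c;

struct ComCallTarget {
    std::uint32_t processId;
    std::uint32_t threadId;
};

// Read the DWORD at a byte offset from a block printed by "dd".
inline std::uint32_t DwordAt(const std::uint32_t* block, std::size_t byteOffset) {
    return block[byteOffset / sizeof(std::uint32_t)];
}

// First two rows of "dd 0eca2670" and first row of "dd 0027a960".
const std::uint32_t kChannelDump[] = {0x75607c08, 0x755e92c0, 0x00000003, 0x0000000a,
                                      0x00000000, 0x00000000, 0x0027a960, 0x0027c340};
const std::uint32_t kTargetDump[]  = {0x0ef674f0, 0x0027a8e0, 0x00003444, 0x000024b8};

// Pick the process and thread IDs out of the pointed-to block.
inline ComCallTarget DecodeTarget(const std::uint32_t* targetBlock) {
    return {DwordAt(targetBlock, kTargetProcessId),
            DwordAt(targetBlock, kTargetThreadId)};
}
```

In decimal, 0x3444 is 13380 and 0x24b8 is 9400, which are the values to look for in Task Manager or in another debugger session.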
Wednesday, February 15, 2012
Make ILDASM open assemblies encrypted by .NET Reactor
Our product assemblies are encrypted with EZIRIZ’s .NET Reactor (http://www.eziriz.com/). Normally ILDASM cannot open them because of the SuppressIldasmAttribute added to each assembly, so we usually use ILSpy, which ignores the attribute and is just as convenient. Today, however, one of the product assemblies seemed to have a problem after encryption. Although we can check whatever we want from windbg, I still prefer to open it in a disassembler because that is easier. This time ILSpy failed with an exception when trying to display the code of some functions.
Since it is only the SuppressIldasmAttribute that prevents ILDASM from opening the assembly, we can simply take a hex editor, search for the “SuppressIldasmAttribute” string, and zero it out; ILDASM can then open the file. So the traditional tool can still help us, and it seems to be somewhat more robust than some newer tools.
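The hex-editor step could also be scripted. Below is a minimal sketch of just the patching part, assuming the attribute name is stored as plain single-byte text in the file; ZeroOutString is a hypothetical helper, and loading and saving the file (on a copy of the assembly) is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Overwrite every occurrence of `needle` in `image` with zero bytes and
// return how many occurrences were patched. The buffer size never
// changes, so all offsets inside the file stay valid.
inline std::size_t ZeroOutString(std::vector<char>& image, const std::string& needle) {
    std::size_t patched = 0;
    if (needle.empty()) {
        return patched;
    }
    const auto length = static_cast<std::ptrdiff_t>(needle.size());
    auto it = image.begin();
    while ((it = std::search(it, image.end(), needle.begin(), needle.end())) != image.end()) {
        std::fill(it, it + length, '\0');  // blank out this occurrence
        it += length;                      // continue after it
        ++patched;
    }
    return patched;
}
```

Calling ZeroOutString(bytes, "SuppressIldasmAttribute") on the loaded file contents and writing the buffer back is the equivalent of the manual edit.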
Sunday, February 12, 2012
Unexpected changes in VS 2005’s *proj files like <Service Include="{B4F97281-0DBD-4835-9ED8-7DFB966E87FF}" />
Recently, one of my colleagues found unexpected changes in the *proj files of his local VS2005 code bases. Since he was not sure what they meant, he could not commit them to the server.
After googling it, I learned that the VS2005 SDK with the DSL tools has a bug in the Text Template service; it has existed for a long time and does not seem to have been fixed yet.
One solution is to disable the TT service by changing the registry. Since we are not using TT, that should be fine.
Please refer to the link below:
http://social.msdn.microsoft.com/forums/en-US/vsx/thread/aba82b76-2d7c-4de8-9f61-19938976bdbd/
Saturday, February 11, 2012
StingrayStudio 10.4 has a 64-bit bug
When upgrading our product to 64-bit at the end of last year, I found a bug in StingrayStudio 10.4 that truncates a 64-bit pointer when converting it to DWORD.
Refer to “Toolkit/trcore.inl” and search for _WIN64; you will find the code below:
#ifdef _WIN64 //RW64
// Possible pointer truncation.
VERIFY(((SEC_TREEBASE*)this)->SetItemData(nIndex++, reinterpret_cast<DWORD>(pNodeLoop)));
#else
VERIFY(((SEC_TREEBASE*)this)->SetItemData(nIndex++, (DWORD)pNodeLoop));
#endif //_WIN64
The comment shows that the problem was already known, but no fix was applied. Changing the _WIN64 case to reinterpret_cast<DWORD_PTR> solves the problem; StingrayStudio then needs to be rebuilt.
The bug does not show up while the memory addresses are below 4 GB. To expose it during testing, either make sure the addresses are high enough, or enable the top-down memory allocation strategy for 64-bit applications by updating the registry and restarting the machine.
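The effect is easy to reproduce in isolation. In the sketch below, Dword and DwordPtr64 are stand-ins I chose for DWORD and for DWORD_PTR on a 64-bit build, and the addresses are made-up values, so the snippet builds without the Windows headers.

```cpp
#include <cstdint>

// Stand-ins (assumptions): DWORD is always 32 bits wide, while
// DWORD_PTR is 64 bits wide on a 64-bit build.
using Dword      = std::uint32_t;
using DwordPtr64 = std::uint64_t;

// What the shipped _WIN64 branch effectively does to an address.
inline std::uint64_t ThroughDword(std::uint64_t address) {
    return static_cast<Dword>(address);  // drops the upper 32 bits
}

// What the corrected cast does: the full address survives.
inline std::uint64_t ThroughDwordPtr(std::uint64_t address) {
    return static_cast<DwordPtr64>(address);
}
```

An address such as 0x0000000123456780 comes back from ThroughDword as 0x23456780, while anything below 4 GB (for example 0x7FFE0000) survives unchanged, which is why the bug can stay hidden for a long time.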
TestComplete 7.5 hangs after starting
I am upgrading our product line to .NET 4.0 and VC10. To make sure there are no big problems, I installed TestComplete 7.5 to run QA’s test scripts against the new build. During the earlier 64-bit upgrade this worked well and caught most of the bugs.
For some unknown reason, my TestComplete hung right after starting, although it worked well on QA’s machine, which was quite a headache. I attached windbg to it and also analyzed it with spyxx, and finally found that TC was blocked in the NtUserGetMessage function. After dumping the memory block that the first argument points to, I found the window handle; spyxx showed that it is an “internet_explorer_hidden” window, which seems to belong to the splash screen.
Checking the command-line options, I found that “/ns” starts TC without the splash screen. I tried it and it works.
I am recording this for others who may face the same problem.