Designing .NET Class Libraries: CLR Performance Tips (March 09, 2005)

Article
01/21/2022

Posted: June 28, 2005

Please note: Portions of this transcript have been edited for clarity

Introduction

frankred [MS] (Moderator):
Hello & welcome to today’s Designing .NET Class Libraries chat on CLR Performance.

Krzysztof Cwalina (Expert):
Hi, this is Krzysztof Cwalina. I am Program Manager on the CLR team responsible for the Design Guidelines. I work very closely with Rico on ensuring that the guidelines promote good performance practices.

Rico Mariani (Expert):
Hello everyone. I'm Rico Mariani, an architect on the CLR Performance team, and I'm here to try to answer as many questions as I can about managed code performance and performance culture. Thank you all for coming; it's great to see so many of you here today. I'm going to start answering questions from the list now... yikes, I'm behind already!

Start of Chat

Rico Mariani (Expert):
Q: Hey Rico. How much of a performance difference is there between boxing and parsing? For instance: return (bool)ViewState["myvar"]; or return bool.Parse(ViewState["myvar"].ToString());
A: Parsing is a total disaster compared to boxing. ToString will make a string then it will be parsed. Actually in that example there's just unboxing which is really cheap. Very few cases of boxing would ever be more expensive than making a temp string -- it's not *that* bad to box
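
A minimal sketch of the two approaches being compared, assuming a plain Dictionary as a hypothetical stand-in for ViewState (the real ViewState needs an ASP.NET page). The cast only unboxes the stored value, while the round trip allocates a temporary string and then parses it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for ViewState: a bag of boxed object values.
var state = new Dictionary<string, object> { ["myvar"] = true };

// Cast: a type check plus a copy of the boxed value (unboxing), no allocation.
bool viaCast = (bool)state["myvar"];

// Round trip: ToString allocates a temporary string, which is then parsed.
bool viaParse = bool.Parse(state["myvar"].ToString());

Console.WriteLine($"cast: {viaCast}, parse: {viaParse}");
```

Both lines produce the same value; the difference is only in the work done to get it, so a measurement in your own scenario is still the deciding factor.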

Rico Mariani (Expert):
Q: In theory, if written properly, can a winform app run as fast as an MFC app? We need to be able to run on Citrix with 40+ users on one box.
A: Yes. Most MFC applications I wrote weren't compute bound anyway, and they had complex memory management issues. You could come out ahead using winforms if you're careful. Not to mention it's much easier to do cool things like background worker threads in managed code.

BradA [MS] (Expert):
Q: In theory, if written properly, can a winform app run as fast as an MFC app? We need to be able to run on Citrix with 40+ users on one box.
A: Hello! Welcome to the chat today... I am Brad Abrams, a PM on the CLR team...

Rico Mariani (Expert):
Q: What is the best practice for storing/returning collections of objects/arrays in immutable structs? Should we clone data on construction and/or retrieval?
A: I can't say that there's one best practice -- it's going to depend on the semantics you need for the collection. Even immutable structs might have embedded object references for instance -- should they be copied? Deep copied? It's all going to vary depending on your needs. But, generally, if you want things immutable what you do is copy everything you need at construction time and leave it that way.

Rico Mariani (Expert):
Q: At what point in the development process should I start measuring? Or are there indicators to look for to decide when to start?
A: Never too soon to start. You'll need measurements to help you make design decisions... make the necessary prototypes to make good choices. Temper your measurement vigor with attentiveness to risk. At each stage you should be measuring to help control risk. Little risk, little need to measure.

Krzysztof Cwalina (Expert):
Q: What is the preferred method for testing to see if an object is contained in a small collection? Should I try to retrieve the item from the collection using its key inside a try...catch block, or just enumerate the collection items and search that way?
A: It depends on the data structure (collection). Are you talking about any particular collection in the Framework?

Rico Mariani (Expert):
Q: Is there a way to really force the CLR to release unused memory after garbage collection? Our app can go from > 150 MB to ~16 MB when the CLR releases the memory. But this only happens as the CLR wishes.
A: You might be able to do this with the hosting APIs if you host the CLR, but otherwise I don't know a way of doing this. Funny you should ask though, because the same issue came up just recently with an internal group. Maybe this is a more prevalent problem than we were aware of. Please get an issue into the ladybug database; that will help it to get the attention it needs. Get your friends to vote too :)

BradA [MS] (Expert):
Q: In a number of places, the framework appears to encourage unnecessary creation of new instances. Examples: iterators return a new instance; C# compound assignment operators deceptively create a new instance. Doesn't this violate your 'pit of success' idea?
A: Like Rico says, you need to measure… In many, many cases there are no performance issues with creating new instances… in fact memory allocation is WAY cheaper in the managed world than the unmanaged world. The result is that it is often a very good deal for you to use the productivity enhancements in the languages even though they create new instances. That said, you should of course measure and see where your issues are.

Rico Mariani (Expert):
Q: What are good rules of thumb when deciding how often I should measure?
A: Measurements are one of the ways you control risk. You should be measuring in detail where there is much risk and roughly where there is little risk. Make sure you do enough homework to know which is which.

Joe Duffy [MS] (Expert):
Hey everyone. My name's Joe Duffy, and I'm also a PM on the CLR team. Looking forward to some tough questions! :)

Krzysztof Cwalina (Expert):
Q: I would like to know which programming language is more powerful compared with C# and VB.NET.
A: What do you mean by "powerful"? I think most languages are more powerful than other languages at *something*, otherwise they would not exist. VB.NET for example is optimized for productivity, C++ for expressiveness and interop with native code, etc.

Rico Mariani (Expert):
Q: Can you talk about improvements in warm start-up time in 2.0 vs 1.1?
A: We've done tons of work to increase the number of shared pages in our ngen'd images. That was a major focus. That directly translates to improvements in startup time. It's hard to quantify that generally, it will depend on your test case but that was in the cross-hairs of our rifle.

BradA [MS] (Expert):
Q: If I implement only Dispose() on all of my objects (no Finalize implemented), what is the impact on the over all performance, if I call GC.SuppressFinalize(this) in Dispose() method?
A: First, you should not depend on Dispose always being called… that is a good way to create a resource leak. But if you for some reason call GC.SuppressFinalize() on an instance that is not in fact finalizable, that is basically a no-op… is that your question?

Joe Duffy [MS] (Expert):
Q: What is the difference between GC.GetTotalMemory and Environment.WorkingSet? What is different in attempting to minimize each one?
A: GC.GetTotalMemory() returns the amount of memory currently thought to be managed by the CLR's garbage collector, while Environment.WorkingSet is the total memory consumption of the process. In most cases, these will be very close, but the working set will also reflect unmanaged memory outside of the GC's control. For example, the CLR itself uses memory that wouldn't be reflected by the GC's estimation.
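
A small sketch that prints both numbers side by side, so the gap between the GC's view and the process's view can be seen for a given application. The variable names are illustrative only.

```csharp
using System;

// Approximate bytes currently allocated on the managed (GC) heap.
// Passing false avoids forcing a collection before the estimate is taken.
long managedBytes = GC.GetTotalMemory(forceFullCollection: false);

// Physical memory mapped into the whole process: the runtime itself,
// loaded libraries, native allocations, and the managed heap.
long workingSetBytes = Environment.WorkingSet;

Console.WriteLine($"GC heap:     {managedBytes / 1024} KB");
Console.WriteLine($"Working set: {workingSetBytes / 1024} KB");
```

How far apart the two values are depends on how much unmanaged memory and library code the process has loaded.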

BradA [MS] (Expert):
Q: Does anyone know of an excellent book on the topic of C# and .NET or perhaps have a favorite book they could not live without? I am a beginner to .NET and have Java, C/C++ experience.
A: Well, I will certainly give you my plug on my blog https://blogs.msdn.com/brada/archive/2005/02/21/377628.aspx

Rico Mariani (Expert):
Q: How to compute the size of a large composite object (like say, an XslTransform) ? Is there a better way than measure heap, collect, initialize object, collect, measure heap, subtract ?
A: You can use some of the features in the Call Tree view (especially the advanced filters) to get the total cost of allocations including temporary allocations. Set a start point in the call graph and a stop point and then the right pane can show you detailed decompositions between those two. I should write a tutorial on doing that :) But generally what you're doing is the way to do it.

Rico Mariani (Expert):
Q: Why does the CLR not inline struct method calls?
A: There's no good reason for this... it's high on the list of weaknesses in the JIT. If I had 3 wishes for the JIT team they would be "Better inlining" "Better inlining" and "Better inlining". I believe I've been very clear with them on this point :)

Joe Duffy [MS] (Expert):
Q: Is there any control that is equivalent to the data repeater control in C#?
A: What do you want to repeat? If you want to repeat the execution of some code for each item in (say) a collection, take a look at the new List's ForEach method. You can pass an Action delegate that gets called for every item in the collection. For example, myList.ForEach(delegate (Customer c) { Console.WriteLine(c.ToString()); });... I hope this is helpful?
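
The ForEach call mentioned above, as a self-contained sketch. It uses a list of strings instead of the Customer type from the answer (which is not defined in this chat), and the anonymous-delegate syntax matches the C# 2.0 style shown.

```csharp
using System;
using System.Collections.Generic;

var names = new List<string> { "Ada", "Grace" };
var seen = new List<string>();

// List<T>.ForEach takes an Action<T> and invokes it once per element, in order.
names.ForEach(delegate (string name)
{
    seen.Add(name.ToUpperInvariant());
    Console.WriteLine(name);
});
```

In current C# the same call is usually written with a lambda, for example names.ForEach(n => Console.WriteLine(n)).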

BradA [MS] (Expert):
Q: Thanks Nick. Say I implemented Finalize also and made sure that my code will call Dispose on all the objects it created, will it have a performance improvement?
A: I assume this is about calling GC.SuppressFinalize(). That call just sets a bit on the instance... it is a very cheap call itself...

BradA [MS] (Expert):
Q: Do you also take questions about compact framework?
A: While none of us work on the .NET Compact Framework and the implementation is different, many of the concepts are the same... so, if it is a concept kind of question, then yeah, shoot.

Krzysztof Cwalina (Expert):
Q: In a WindowsForms app in which I create lots of controls dynamically, should I be adding them to the components collection so they get disposed?
A: I am not 100% sure about that (you would have to ask WindowsForms experts) but, if the controls need to live for the whole lifetime of the application, then yes. If not, then you should just dispose them manually when you don't need them anymore.

Rico Mariani (Expert):
Hey guys, I'm madly answering questions as fast as I can but I thought I'd mention a little something. You have to remember I'm a "CLR Guy" which means I tend to work in the very lowest levels of the managed code system. That means I'm not an especially good person to ask about specific features in, say, Winforms (you'd want a winforms guy for that) or other higher level components. I'll do my best on any question I'm asked but sometimes I just gotta shrug and wish I knew more. OK, back to the questions...

Rico Mariani (Expert):
Q: Are there any significant changes in the JIT that would cause us to want to change our 1.1 code when moving to 2.0, such as when something may get inlined?
A: Not too many changes to the inliner that I recall but there were other improvements. A lot of them had to do with getting more/better code/data sharing which can be very helpful. Better client application performance was a key theme.

BradA [MS] (Expert):
Q: Regarding the first question about parsing vs. boxing. What about using the Convert class? I find myself using Convert.ToBlah quite a bit. Is it better to cast instead?
A: As long as you are sure you are not boxing (you call the strongly typed overloads), these methods are highly likely to get inlined and thus there are no real perf issues between them and casting... but you should check with your language of choice as there may be differences between how your cast works compared to Convert.ToXxx()

Joe Duffy [MS] (Expert):
Q: In Java try catch blocks are slow. Is the same true in C#?
A: Brad just posted a blog entry that in part covers this... https://blogs.msdn.com/brada/. The short answer is: yes, they tend to be slow enough that you should only use them for truly exceptional cases. For example, we introduced new TryParse APIs on all of the primitive types so that you wouldn't have to call Parse, generate an exception, and catch it anytime somebody passed in an invalidly formatted string. You should try to take the same approach in your own code.
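
A minimal sketch of the two styles for handling a badly formatted string. The input value and the -1 fallback are arbitrary choices for illustration.

```csharp
using System;

string input = "not a number";

// Exception-based: Parse throws FormatException on bad input,
// and the throw/catch is the expensive part.
int viaCatch;
try
{
    viaCatch = int.Parse(input);
}
catch (FormatException)
{
    viaCatch = -1;
}

// TryParse: reports failure through its return value, no exception involved.
int viaTry = int.TryParse(input, out int parsed) ? parsed : -1;

Console.WriteLine($"catch: {viaCatch}, TryParse: {viaTry}");
```

Both paths end with the same fallback value; the TryParse path simply never pays for an exception when invalid input is an expected case.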

Rico Mariani (Expert):
Q: Referring back to my question on forcing the CLR to release memory, someone suggested using SetProcessWorkingSetSize(). I know this is not the preferred way, but is this actually possible?
A: People have done this with mixed results. Generally I discourage the use of this API as it tends to be hard to predict the overall outcome. Consider that sort of approach if you have a case where your managed application is going to go dormant for a while and it's a good idea for the OS to swap out your state.

Joe Duffy [MS] (Expert):
Q: When returning a SqlConnection object to the connection pool, when should .Dispose be called, or is .Close enough?
A: As a general rule of thumb, anytime a type exposes a Dispose method and you are certain you're done with an instance of it, call Dispose. Most of the time Close is the same as Dispose, but with SqlConnection, there are actually subtle differences. I think they're subtle enough for you not to worry too much about it, but this is still a good habit to get into.

Rico Mariani (Expert):
Q: Has there been any work done on reducing the memory footprint of the CLR? I have a rather simple app (1 form, no db access) and it uses over 25 MB of RAM. That's pretty high for something so simple.
A: Tons of work was done in this area. Better layout of the image files to reduce working set for instance and of course more sharable pages across multiple running applications. This was a big focus area. Of course you can still burn tons of memory if you allocate like crazy but we've worked hard on reducing the cost associated with the raw code.

Joe Duffy [MS] (Expert):
Q: (doh!) Back to my question on the diff. between GC.GetTotalMemory and Environment.WorkingSet, Joe said they shouldn't be that different. Well, my WinForms app returns 2 MB for GetTotalMemory and 45 MB for Environment.WorkingSet. Do we have a huge problem?
A: Sorry. I guess that wasn't entirely accurate. For applications that don't allocate huge amounts of unmanaged memory and mostly use the managed GC heap, they shouldn't be terribly different. In the case of WinForms, there's a lot of overhead, so you should expect them to be. 45 MB sounds about right for the WinForms libraries and related allocations.

Maoni [MS] (Expert):
Q: Are there any plans to make improvements to the workstation garbage collector, specifically to take advantage of multiple processors? This has been somewhat of an issue for developing large winform based applications on citrix\terminal server environments.
A: Well, finally I could post something. We already have server GC, which specifically takes advantage of multiple processors. Take a look at this: https://blogs.msdn.com/maoni/archive/2004/09/25/234273.aspx

BradA [MS] (Expert):
Q: Brad - Thanks! I use C#...and I was thinking specifically about ViewState["someInt"] and using Convert.ToInt32 on it.
A: Rico discussed that at the top of the chat... "Parsing is a total disaster compared to boxing. ToString will make a string then it will be parsed. Actually in that example there's just unboxing which is really cheap. Very few cases of boxing would ever be more expensive than making a temp string -- it's not *that* bad to box"

Rico Mariani (Expert):
Q: What is the cost of having virtual methods? Will the CLR do any hacks if it finds that nobody in the current/running assemblies overrides those methods?
A: It's hard to answer this question generally. If the method is fairly large, the virtual call overhead is just not interesting, but for smaller methods it can be more so. The true costs tend to be indirect ones -- see https://blogs.msdn.com/ricom/archive/2004/08/24/219751.aspx for more details -- but in short, being virtual tends to prevent inlining and that can be a burden on the small functions. It's hard to "hack" away virtualness because even if it looks like the class in question has no subtypes, it's possible to dynamically load a subtype later or even create a subtype with Reflection.Emit, so we can't assume. On the other hand, if you seal functions or classes then that helps us. There are things that can be done even in the most general cases but it's not an easy problem by any means. See the article and the feedback; there was good discussion there.

Joe Duffy [MS] (Expert):
Q: For ASP.NET, when doing a customized viewstate, is it wise to use an object array (which only contains text & bool values) as the return value? I have used this so far and been able to gain a lot of control over viewstate size and slim my pages down.
A: An array or small struct is the best approach. Truthfully, if trimming your viewstate is super-important and you're storing large amounts of information, you should consider relying on session variables. You trade memory footprint on the server for smaller page-size/responsiveness for the user. The dial here depends entirely on your target audience and the capabilities/scaling of your web environment.

Rico Mariani (Expert):
Q: I suppose the call tree view you are referring to is in CLR Profiler. I've tried to use it on my app but it slows things down so much the app takes hours to startup. Any plans for supporting an "attach" mode for the profiler ?
A: Attach mode is a super popular request but a hard one because we profile by instrumenting the code as it is jitted. However consider this: you can start with profiling disabled, then use the CLR Profiler API to programmatically turn it on just at the perfect spot, and then turn it off again, that limits the log size and reduces the performance hit significantly. Most people with big call trees use CLR Profiler in this way.

Maoni [MS] (Expert):
Q: Is there a way to limit the amount of memory the CLR allocates for the process? - something like the -X min and max options for Java. If there is, what will happen when the max is reached?
A: You can use the hosting API: IGCHostControl::RequestVirtualMemLimit to communicate the memory situation to GC but this will simply throw OOM if the memory limit is reached. We don't want to give you option to specify the memory quota for standalone processes because we believe it's not a good thing for users to do - you don't have control over what kind of machines your app will be running on but hosts usually have a pretty good idea of that.

Rico Mariani (Expert):
Q: Thanks but I have read an article about C#, Java and a new language called D that claims that C# try catch blocks actually are faster than normal C# code.
A: C# try blocks are faster than normal C# code? I don't see how that's possible. I guess having trys is faster than having lots of if statements for error checking (see https://blogs.msdn.com/ricom/archive/2005/03/01/382756.aspx). If exceptions are used for truly exceptional cases then you do get a perf win there.

Krzysztof Cwalina (Expert):
Q: To what degree was performance of numerical applications considered in designing the CLR and the BCL? The assumptions that work for general line-of-business applications (eg. allocs are cheap) seem to break down often for computing-intensive code.
A: You are right. The majority of the BCL APIs are optimized to be a general purpose base library. Once you get to very, I would say, specific applications (rather than "numerical applications") the APIs may not be optimal. Having said that, we would love to get feedback on any issues with the BCL APIs that relate even to very specialized scenarios. In lots of cases it's possible to improve some specialized code path without hurting the general purpose scenarios.

Rico Mariani (Expert):
Q: Rico, if inlining is your #1, #2 and #3 performance issue with the JIT, what are #4 and #5?
A: Have a look at the last section of this article https://blogs.msdn.com/ricom/archive/2005/02/22/378335.aspx where current JIT weaknesses are discussed. Handling of valuetypes generally (beyond inlining) would be #4 and then probably registration in general after that.

Krzysztof Cwalina (Expert):
Q: Can we have a better perfmon counter for monitoring managed/unmanaged transitions in .NET 2.0 than "# of marshalling", which is only an indirect reflection of transition count as not all params need marshalling?
A: Nick, I take this as a feature request. It would be great if you could file a formal request at https://connect.microsoft.com/Main/content/content.aspx?ContentID=2220. Thanks.

Maoni [MS] (Expert):
Q: Do you guys use the CLR Profiler or do you have a different internal tool you use for profiling managed code?
A: Yes we do use the CLR Profiler internally for what CLRProfiler is good for.

Rico Mariani (Expert):
Q: Why does .Net seem to have so many page faults?
A: Hmmm... well there's page faults and then there's page faults. Some of what you see has to do with faulting in code as it is touched. Those faults tend to happen because you might access code a bit here and a bit there and it might be scattered such that it isn't amenable to pre-loading. That's something we've worked on pretty hard in this last release. On the other hand some of the faults, maybe even most, have to do with the heap. If that's the case then look to your object lifetimes -- make sure that most things are dying in gen0 and that there are few collects involving all the objects. See this article on "Mid Life Crisis" that's often a source of page faults. https://weblogs.asp.net/ricom/archive/2003/12/04/41281.aspx

Krzysztof Cwalina (Expert):
Q: How expensive are exceptions that are thrown and caught in the same method? Does the JIT do any optimizations there?
A: I don't think we have any special optimization for this case. In general exceptions, when thrown, are very expensive. You should not explicitly use them for flow control in a single method.

Maoni [MS] (Expert):
Q: Are there any improvements in speed for large object allocations? E.g. byte arrays of 200 KB in size. Currently, they are very slow, even slower than C/C++ malloc.
A: I don't think it's slower than C/C++ malloc, but if you can show me an example that would be good. People have complained to me about the fragmentation problems with LOH but so far no one has been able to actually show me a repro. And yes, we do have perf improvements for LOH in Whidbey.

Rico Mariani (Expert):
Q: Anyone have specific optimisation tips for Managed DirectX? Or just treat it like a really hungry system resource? (Specifically in regards to a hungry app like World Wind - worldwind.arc.nasa.gov)
A: My biggest advice here is to watch the IDisposable objects very very carefully. At high frame rates you could have a lot of churn there so anything that needs Disposing make sure it's getting Disposed. You don't want load on your finalizer thread and you don't want objects aging. Outside of that, budget, plan, verify :)

Joe Duffy [MS] (Expert):
Q: Is there a way to set the number of threads in a thread pool, or eliminate it completely?
A: You can set the max # of threads using the ThreadPool.SetMaxThreads API, yes, but unfortunately you can't eliminate it entirely.
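
A short sketch of reading and setting the limits. The values are simply read back and re-applied here; in a real application you would pass the limits you actually want.

```csharp
using System;
using System.Threading;

// Read the current upper limits for worker and I/O completion threads.
ThreadPool.GetMaxThreads(out int workerMax, out int ioMax);
Console.WriteLine($"max worker threads: {workerMax}, max I/O threads: {ioMax}");

// SetMaxThreads returns false (and changes nothing) if the request is rejected,
// for example a value below the processor count or below the current minimum.
bool accepted = ThreadPool.SetMaxThreads(workerMax, ioMax);
Console.WriteLine($"request accepted: {accepted}");
```

Because the call can be rejected, it is worth checking the return value rather than assuming the new limit took effect.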

Rico Mariani (Expert):
Q: If I create all my database connections in a using block, am I working around connection pooling? Are these connections being pooled? Maybe I'm stupid, but it is hard to tell whether a specific connection is pooled.
A: The using block is just going to do the Disposing of the resource for you automatically. When you dispose a connection it goes back to the pool. So you should be just fine with "using." No problems I can think of there.
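
A sketch of what the using block does, with a MemoryStream as a stand-in for the connection so the example runs without a database. The pooling behavior itself belongs to the data provider and is not shown here.

```csharp
using System;
using System.IO;

MemoryStream captured;

using (var stream = new MemoryStream(new byte[] { 1, 2, 3 }))
{
    captured = stream;                      // keep a reference only to inspect it afterwards
    Console.WriteLine(stream.ReadByte());   // the resource is usable inside the block
}

// Dispose ran when the block exited (it would also run if an exception escaped),
// so the stream now reports that it can no longer be read.
Console.WriteLine(captured.CanRead);
```

With a pooled connection type, that same automatic Dispose call is the point at which the connection is handed back.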

Maoni [MS] (Expert):
Q: Are there any hacks in the CLR (like a separate GC generation) for objects that implement IDisposable to finalize them as early as possible?
A: No. IDisposable is just a language thing. Inside of GC itself we just recognize finalizable objects and we collect them following their rules.

Joe Duffy [MS] (Expert):
Q: Joe - DataSet doesn't implement Dispose, but it is derived from an object that does. Is it a waste to wrap it in a using? And... is the rule *really* call Dispose on anything IDisposable?
A: Yes, anything that implements IDisposable (or which has a class in its base hierarchy that does) should be disposed of when you're done with it. Otherwise, you're potentially hanging on to precious resources that should otherwise have been released earlier. These will normally get cleaned up inside their finalizers, but there's a cost with keeping an object around on the finalizer queue too. DataSet, for example, derives from MarshalByValueComponent... this base type does some work in its Dispose method, so calling it on DataSet is still worthwhile.

Rico Mariani (Expert):
Q: When refactoring code - are there performance penalties for over-'functionizing' an application?
A: There sure can be. One of the guys that works on profiling tools here says that he sees -- in object oriented languages -- this tendency to write what he calls "work averse functions." What he means by this is that everything is so factored that there are many, many functions, each of which does very little and then passes it on to the next function. In straight C, programmers don't tend to code that way. The work-averse functions translate directly into much deeper callstacks and much more function call and return overhead. Partly because they're in the worst of all worlds... too big to be inlined yet not big enough to be doing meaty amounts of work. So be careful. Factoring is good but don't go crazy -- you get oophalism that way :)

Maoni [MS] (Expert):
Q: Are there ways, as coders, to prevent fragmentation, or do we rely completely on the CLR for this? I know pinning objects may cause unavoidable fragmentation, but are there other things that may cause it that we have control over?
A: Yes, take a look at the blog entry where I talk about pinning and what good techniques to use if you must pin objects. Another thing you can do is avoid creating temporary large objects. We recommend using a large object pool where you can recycle the buffers.
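
One way to recycle large buffers in current .NET is the shared ArrayPool from System.Buffers. This is a sketch under that assumption: the API did not exist at the time of this chat, and the 200 KB size is just borrowed from the earlier large-object question.

```csharp
using System;
using System.Buffers;

const int size = 200 * 1024;

// Rent returns an array of at least the requested length, reusing a
// previously returned one when available instead of allocating a new large object.
byte[] buffer = ArrayPool<byte>.Shared.Rent(size);
try
{
    // Only the first 'size' bytes are ours to use; the array may be longer
    // and may contain stale data from an earlier renter.
    buffer[0] = 42;
    Console.WriteLine($"rented {buffer.Length} bytes for a {size}-byte request");
}
finally
{
    // Hand the buffer back so the next Rent can reuse it.
    ArrayPool<byte>.Shared.Return(buffer);
}
```

A hand-rolled pool (for example, a stack of preallocated byte arrays) follows the same rent/return discipline.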

Krzysztof Cwalina (Expert):
Q: When refactoring code - are there performance penalties for over-'functionizing' an application?
A: Yes. Definitely. Too many short methods that do little work called deep on the callstack can significantly affect performance. You should be careful about over-factoring methods that get called often (in deep callstack loops).

Rico Mariani (Expert):
Q: Thx Rico: So it would appear that there really isn't an optimal garbage collector for a multiprocessor/multi-user client environment? (Workstation GC = high memory/high GC time/improved UI responsiveness; server GC = improved GC and memory/poor UI response.)
A: We haven't found one GC that fits all yet but we're still working on it. :)

Rico Mariani (Expert):
Q: What is the best CLR tip everybody must know to improve their application performance?
A: Measure your application's performance regularly. Learn to use the CLR Profiler. Think about object lifetimes as part of your design.

Rico Mariani (Expert):
Q: We're converting our VistaDB 2.0 core engine from unmanaged code to fully managed C# with CLR hosting for 3.0, and my guys have found string handling under .NET 1.1 to be ~2x slower than the unmanaged code. Any good web links to shed light on this?
A: Unmanaged systems often do in-place operations on strings. Make sure you're using stringbuilders in the right places so that you get the mutability you need. Have a look at these articles:

Maoni [MS] (Expert):
Q: How about a separate/additional GC heap for pinnable objects (marked as such) to avoid main heap fragmentation?
A: Pinned objects could be holding onto non-pinned objects and vice versa. If you have a separate heap for pinned objects you'll still need to scan the rest of the heap, so it's not really helping.

Rico Mariani (Expert):
I hit enter too soon there, I have some articles on strings coming right up

Rico Mariani (Expert):
Look here https://blogs.msdn.com/ricom/archive/2003/12/02/40778.aspx and here https://blogs.msdn.com/ricom/archive/2003/12/15/43628.aspx for string articles
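
A minimal sketch of the StringBuilder advice from the answer above. The loop and separator are arbitrary, and whether this beats plain concatenation in a given case is something to measure, as the linked articles discuss.

```csharp
using System;
using System.Text;

// StringBuilder keeps a mutable buffer, so each Append writes in place
// instead of allocating a new string the way repeated concatenation does.
var builder = new StringBuilder();
for (int i = 0; i < 5; i++)
{
    builder.Append(i).Append(',');
}

// One final allocation produces the immutable result.
string result = builder.ToString();
Console.WriteLine(result);
```

For a handful of short pieces, simple concatenation is often fine; the builder pays off when many appends happen in a loop.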

Maoni [MS] (Expert):
Q: I'm seeing a strange behavior with mscorwks.dll. Somehow sometimes when I start our application or let it run for a while, the mscorwks.dll CoInitializeCor appears to use huge amount of CPU time (50%). Once that thread is in that state,
A: You mean after you run your app for a while, then you call CoInitializeCor and then you see the CPU time suddenly going up?

Maoni [MS] (Expert):
Q: Does an LOH sweep occur every time a Gen0 sweep occurs?
A: No. It's swept every time a Gen2 GC occurs.

Rico Mariani (Expert):
Q: What is the reason why all .NET 2.0 applications hog the CPU and the entire PC when closed (Dec CTP at least)?
A: I guess the only advice I can give you is to try to file some specific bugs. There isn't supposed to be any "CPU Hogging" going on -- certainly I don't see it and I measure stuff every day. But that doesn't mean there isn't some problem that you've got there. Use ladybug to get us the most specific feedback you can for the best chance of getting whatever the problem is addressed. I wish I could be more helpful than that... sorry

Joe Duffy [MS] (Expert):
Q: If you are passing a portion of a byte buffer into an unmanaged function, is it worse to pin the object and pass an IntPtr into the unmanaged function or to allocate a new partial buffer array and pass that into the unmanaged function?
A: This depends on whether you want to pin the entire byte buffer or just the segment you are dealing with at the moment. I suspect it's the former (otherwise you'd never know how much of your buffer was still around), in which case you'd be better off using an IntPtr & GC.KeepAlive or SafeHandle to account for your outstanding managed references. SafeHandles are preferable, but either will work.
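
A sketch of the pin-and-pass-a-pointer option from the question, using GCHandle rather than the SafeHandle approach the answer prefers. Marshal.ReadByte stands in for the unmanaged function, which is not part of this chat; the buffer contents and offset are arbitrary.

```csharp
using System;
using System.Runtime.InteropServices;

byte[] buffer = { 10, 20, 30, 40, 50 };
int offset = 2;   // start of the portion the native code should see

// Pin the whole array so the GC cannot move it while native code holds the pointer.
GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
byte first;
try
{
    // Address of buffer[offset]; no copy of the segment is made.
    IntPtr segment = Marshal.UnsafeAddrOfPinnedArrayElement(buffer, offset);

    // Stand-in for the unmanaged call: read the first byte through the pointer.
    first = Marshal.ReadByte(segment);
}
finally
{
    // Unpin promptly; long-lived pins are what fragment the heap.
    handle.Free();
}

Console.WriteLine(first);
```

Copying the segment into a new array avoids the pin but costs an allocation, so which is cheaper depends on segment size and call frequency.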

frankred [MS] (Moderator):
We're just about out of time. So get your last questions in.

Rico Mariani (Expert):
Q: Does the current CLR eliminate useless null pointer comparisons used in parameter validation logic when several nested methods check the same variable?
A: It's not so easy to do this because of course you want to share the code for the nested functions whether they are called directly or not. Generally there's a lot more than average argument validation that happens in the CLR Base Class Libraries because we want to give robust error messages. We get beat up if we give poor messages (like access violations) so we have checks at all the levels that might be called. But they come at some cost. I often haggle with that team about which ones are really worth the perf and which aren't. Sometimes I win sometimes I lose :)

Krzysztof Cwalina (Expert):
Q: I’d like to understand the CLR performance cost of using Interop assemblies in relation to porting my VB COM to .NET assemblies.
A: If the interaction of the .NET application with the native library through the interop layer is not chatty (no frequent calls), then it should not be a problem. On the other hand, if the interaction is chatty (lots of calls that do little work), then you should evaluate whether moving the interaction boundary (either way) would help performance.

Rico Mariani (Expert):
Q: More CLR tutorials would be supercalifragilistic. Particularly about profiling large applications - it always seems simple on a toy project...
A: I'll put it on my blog list. You're not the first to ask. In fact there's an internal workshop on "large managed applications" next week and they want a CLRProfiler tutorial... go figure :)

Maoni [MS] (Expert):
Q: Thanks for the links Maoni. FYI we have resorted to byte arrays and found them to be faster than StringBuilder. I'll check out those links.
A: StringBuilder has its weaknesses. Rico has a couple of blog entries that talk about StringBuilder that perhaps you'll find helpful: https://blogs.msdn.com/ricom/archive/2003/12/02/40778.aspx; https://blogs.msdn.com/ricom/archive/2003/12/15/43628.aspx

Rico Mariani (Expert):
Q: Rico, on a clean XP Pro install, the VS 2005 December CTP devenv.exe takes 10-20 seconds to close, and it does so at regular thread priority, so other apps don't get any chance at the CPU :-(
A: Ah hah, devenv shutdown issues huh. Well I can get that feedback to the right people.

Krzysztof Cwalina (Expert):
Q: Thanks Krzysztof Cwalina!
A: You are welcome.

Maoni [MS] (Expert):
Q: My question is what is going on inside CoInitializeCor? What could be causing it to grab the CPU? Is there anything I can do to avoid the problem, or could it be a bug in the CLR?
A: That's hard to say without a callstack or anything. Off the top of my head I can't think of anything that it might be doing that needs much CPU but without more info I can't give you a definite answer.

Joe Duffy [MS] (Expert):
Q: What advice do you have for tracking down code that should call IDisposable.Dispose but doesn't? One thing that comes to mind is having the profiler grab and show a stack trace of the creation of a (later) finalized object.
A: You should check out FxCop--it looks for at least some of this. It warns you about types which have disposable fields, but which you don't dispose of yourself. This of course doesn't find cases where you just let an object go out of scope. This is harder to detect because you might have handed out a reference to your object. The new managed C++/CLI actually does a lot of this intelligently for you, injecting calls to Dispose when objects just fall out. Certainly doing profiling or asserts inside your finalizers is a great test/debug option to find objects which somehow slipped past getting dispose called, but you need to make sure that's only on debug builds... Relying on assert/profiling infrastructure from a finalizer is dangerous. In general, you should avoid touching any managed objects (other than "this") from your finalizer. Hope this helps.
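
As an illustration of the debug-only finalizer check mentioned above (not taken from the chat), here is a minimal C# sketch. It assumes a hypothetical `TrackedResource` type that records its creation stack trace in debug builds and reports it if the object is finalized without `Dispose` having been called. Writing to debug output from a finalizer carries the risks the answer warns about, so the whole check is compiled out of release builds.

```csharp
using System;
using System.Diagnostics;

// Hypothetical disposable type, used only to illustrate the technique.
public sealed class TrackedResource : IDisposable
{
    private bool _disposed;

#if DEBUG
    // Debug builds only: remember where this instance was created.
    private readonly string _creationStack = Environment.StackTrace;
#endif

    public bool IsDisposed => _disposed;

    public void Dispose()
    {
        _disposed = true;
        // A disposed object has nothing left to report, so skip finalization.
        GC.SuppressFinalize(this);
    }

#if DEBUG
    // Debug builds only: this runs only when Dispose was never called.
    // It touches nothing except fields of this instance.
    ~TrackedResource()
    {
        Debug.WriteLine("TrackedResource finalized without Dispose. Created at: " + _creationStack);
    }
#endif
}
```

Capturing a stack trace on every construction is expensive, which is another reason to keep this kind of check out of release builds.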

frankred [MS] (Moderator):
Although we're almost out of time, a few of our experts have volunteered to stay and continue answering questions! :)

Maoni [MS] (Expert):
Q: Any plans in Whidbey to get rid of the double DLL load for NGen'ed and GAC'ed DLLs?
A: It's already done.

Rico Mariani (Expert):
Q: What are your favorite books or articles on how or what to profile?
A: My favorite is Peter Sollich's video on Using the CLR Profiler -- see this posting for the links https://weblogs.asp.net/ricom/archive/2003/12/02/40782.aspx

Rico Mariani (Expert):
Q: I have read (in MSDN Magazine) that NGEN will be much more effective in 2.0 than in 1.1, to the point where you might actually want to use it. Can you talk about this?
A: We've done TONS of work in ngen -- specifically to help reduce private pages in managed modules. More sharing means faster startup in warm cases. Better overall density means less overall I/O in cold cases. It's a good thing :)

Rico Mariani (Expert):
Q: Thanks for the chat, very useful info. Same with the internal training videos. Love em!
A: You're welcome

Joe Duffy [MS] (Expert):
Q: Any tips on inheritance best practices? How deep can the tree grow? Also, do any of your tips apply to CF NET as well?
A: You should check out one of our recent talks, which covers mostly this very topic: https://msdn2.microsoft.com/netframework/aa497256.aspx. General advice is to keep your inheritance hierarchies as compact as possible, without sacrificing the extensibility/richness of your framework. CF is more resource constrained, so you should magnify your weighting in the direction of compact hierarchies versus richness.

Rico Mariani (Expert):
Q: I've seen a problem where an application has locked up because of a deadlock in the message processing in the framework. I was wondering if this has been reported before?
A: It might have been, but it's not the sort of thing I would be familiar with anyway. Look for entries in the "ladybug" database to see. If it's a common case it's likely that someone has run into it before, but it's good to check and stress the importance. Ladybug is a great way to help us prioritize the right bugs.

Rico Mariani (Expert):
Q: Rico, I love your new sections on Channel 9. Maybe a 'languages with comments' section would be great too - for those costly complexities the compilers make look sooo easy.
A: Good idea... some comments on language structures would be fun to write.

Rico Mariani (Expert):
Q: Are there any plans for a more configurable ThreadPool?
A: There were improvements in the ThreadPool in version 2.0 which should help. But of course it depends on what sort of configuration you need. A lot of the issues revolve around throttling the threads so that they deal with peak loads better and then quiet down again. That was a common theme in complaints.

Joe Duffy [MS] (Expert):
Q: Is there any way to deal with thread priority inversion for object locks in .NET? I.e., when a high-priority thread blocks on a lock owned by a low-priority thread, will the low-priority thread get a temporary boost?
A: We generally don't make any guarantees around how locks held by threads interact amongst each other, especially with regards to the scheduling of them. Priority-based grants/acquisitions/boosts, lock leveling, reentrancy are all things we'd like to look at further in the future.

Rico Mariani (Expert):
Q: Cheers, Rico
A: :) thanks for coming

Krzysztof Cwalina (Expert):
I have to run. It was nice seeing so many people. Please join us at the next session.

Rico Mariani (Expert):
Q: If "caspol off" is done on a machine, are all security checks eliminated? I'm trying to speed up a Citrix server in a trusted environment.
A: The same runtime is used in all the cases so no they can't be *eliminated* but they can be a lot cheaper.

Rico Mariani (Expert):
Q: Does FullTrust automatically eliminate all code access checks?
A: Same answer as with the other security question: the tests can't be eliminated because the same code is used for all the cases but of course full trust tests are much faster ("Q: does this code have access? A: heck yah") :)

Maoni [MS] (Expert):
Bye everyone. Thanks for the questions. Nice chatting.

Rico Mariani (Expert):
Q: Rico, in the latest blog entry you pointed to, I see no mention of a video by Peter Sollich, only his CLR doc?
A: Gah! That's the wrong video. Tell you what I'll dig it up and post a link to it on my blog soon. Sorry about that. I do hope we made his video public -- I know I have it internally. If not, the video is just a nice walk through of the CLR Profiler documentation. The docs are a thorough tutorial of every window and view. Check them out, it's great info on the GC even if you never use the tool.

Rico Mariani (Expert):
Q: Rico: never mind, Google found the video. https://msdn.microsoft.com/msdntv/episode.aspx?xml=episodes/en/20050217CLRPS/manifest.xml
A: *writes that down*

Rico Mariani (Expert):
Q: Can we get a way to save the chat content in a structured manner? Instead of the copy/paste?
A: We will post a transcript probably within a week or so. Check back on the msdn site. I'm sure I'll post a link when it is available. Thanks for coming!

Rico Mariani (Expert):
Q: Any chance there is support for multiple ThreadPools? If I have 3 or 4 distinct operations, it would be nice not to force all of them to work from the same pool.
A: You might want this less with the new ThreadPool logic, it's better about growing to the threads it needs on demand and then shrinking. Really you don't *want* lots of pools do you? I mean it would be better if one pool just did the right thing already without more hints to prevent starvation. Well that's what I think anyway, who needs the grief of many pools :)

Rico Mariani (Expert):
Q: Thanks for your time.
A: You're welcome, thanks for coming :)

Rico Mariani (Expert):
Q: Thanks! you've been most helpful!
A: You're welcome :)

Rico Mariani (Expert):
Q: Thanks a lot Rico, see you on your blog !
A: Thanks for reading :)

Rico Mariani (Expert):
Whew

Rico Mariani (Expert):
ok I think that's it

Rico Mariani (Expert):
I think there's a few questions that are still being answered

Rico Mariani (Expert):
Maoni has one or two but that's mostly it

Rico Mariani (Expert):
thanks for coming!

Rico Mariani (Expert):
my fingers are sore :)

Rico Mariani (Expert):
thank you

Rico Mariani (Expert):
have a great day everyone
