Thursday, November 12, 2009

XmlSerializers, ModuleVersionId, ILMerge, and You

I solved a minor mystery today.

Gallio internally uses the .Net XmlSerializer class to load plug-in metadata and save reports.  You probably know all about XmlSerializer already, but maybe you don’t know just how badly using it can impact application start-up performance.

About XmlSerializer Code Generation

The heart of the issue is that XmlSerializer uses code generation to perform serialization and deserialization.  Specifically, XmlSerializer generates some fresh C# code, compiles it out of process with csc.exe, then loads the resulting assembly.  This work takes time.  Seconds… feels like forever to a user.  The generated assemblies are not cached across runs so there is a significant performance penalty on every launch.

There is a way to avoid this cost: pre-generate serializers assemblies and distribute them with the application.

To pre-generate a serializers assembly, you are supposed to use SGen.exe a bit like this:

SGen.exe /assembly:MyAssembly /type:MyRootXmlType

This command will emit MyAssembly.XmlSerializers.dll.  All you need to do then is copy it next to MyAssembly.dll and you’re done.  The cost of code generation is gone from every launch.

Two problems:

  1. SGen only lets you specify one root type.  If you use multiple Xml document types in your assembly then you will need to write a custom tool like this: Gallio.BuildTools.SGen.
  2. Make absolutely sure that you keep MyAssembly.dll and MyAssembly.XmlSerializers.dll in sync.  If you change MyAssembly.dll in any way then you must regenerate the serializers assembly.  This is easy to set up once in your build scripts and forget about.


So let’s say you do all of this work and you run your program and you still see csc.exe starting up while your program runs.

To find out why, add the following to your application’s config file.  (eg. MyApp.exe.config)


            <add name=”XmlSerialization.PregenEventLog” value=”1” />


Then run your program again and look at the Windows Event Log.  There should be some information in there to help you out.  (Another thing to try is fuslogvw.exe.)

Here’s the message that I saw:

Pre-generated serializer ‘Gallio.XmlSerializers’ has expired. You need to re-generate serializer for ‘Gallio.Runtime.Extensibility.Schema.Cache’.

This message is telling me that I violated rule #2 above: the serializers assembly must always be kept in sync with its parent assembly!

Why?  I’ve got fancy build scripts…


XmlSerializer verifies that the serializer assembly is in sync with its parent assembly by comparing the module version id of the parent assembly with an id that it previously cached when it generated the serializer assembly.

Let’s check whether the assemblies are in sync using Reflector.

Gallio.dll is the parent assembly, it has a module version id of “fbe34432-817a-46dd-832f-3e5bc679ecff”.


Gallio.XmlSerializers.dll is the serializers assembly, it has a special assembly-level attribute called XmlSerializerVersion that was emitted during code generation.  It expects the parent assembly id to be “3c420916-3f3c-45cb-a79c-9a4bbc81014a".

Pedantic note: An assembly can have multiple modules each and each module has its own id.  So the XmlSerializerVersion attribute actually contains a comma-delimited list of all module version ids of the parent asssembly sorted in increasing order.


Ok, so XmlSerializer thinks the pre-generated serializers assembly is out of sync because the module id is out of sync.  That seems bananas because my build scripts always regenerate Gallio.XmlSerializers.dll whenever Gallio.dll is recompiled.


Well, my build scripts do one more thing: they run ILMerge on Gallio.dll to internalize certain dependencies that might otherwise get in the way of running unit tests.  (Bad things would happen if your unit testing tool exposed a dependency on one version of Mono.Cecil.dll whereas your tests depended on a different version.)

When ILMerge internalizes dependencies, it has to regenerate the assembly.  As a result, the new freshly merged Gallio.dll gets a brand new ModuleVersionId.  Unfortunately we pre-generated Gallio.XmlSerializers.dll before running ILMerge so it is now out of sync.

All we need to do is regenerate Gallio.XmlSerializers.dll after the ILMerge.  Problem solved.


Unknown said...

Hi, I am having problem migrating my code from MbUnit 2 to MbUnit 3. Obviously, MbUnit 3 does not have exception handler like
catch(MbUnit.Core.Exceptions.AssertionException e){throw e;}
what's the corresponding exception assertion in MbUnit3?

Jeff Brown said...

Please use the mbunit-user mailing list for questions like this.

The new type in MbUnit v3 is Gallio.Framework.Assertions.AssertionFailureException.

However, it is generally a bad idea to catch assertion exceptions. In some cases the assertion may already have been logged so catching the exception still results in a failure being seen in the report. What are you trying to test? (Please reply on mbunit-user.)

markell said...

Hi Jeff. Do you know about the PostSharp project? It is a very cool thing. Anyway, the reason I mention is because the PostSharp author - Gael Fraiteur has created a custom build task GenerateXmlSerializers which I find much more convenient then invoking SGen. We have the source code of PostSharp so I took the liberty of reusing this task for our purposes. Naturally, when the source code be licensed I will have to obtain a suitable license from Gael. Anyway, I would like to post an msbuild sample code to show how the task is used - where can I do so?

Jeff Brown said...

PostSharp is indeed very cool. :)

Why not start your own blog and post the sample there? I'll bet you have more things you'd like to talk about. Alternately, you could post the sample to or a similar site.

markell said...

Well, I do have many things I would like to share, but alas! My time does not belong to me. Not now, at least...