April 2012

Volume 27 Number 04

T4 Templates - Lowering the Barriers to Code Generation with T4

By Peter Vogel | April 2012

The Microsoft .NET Framework makes extensive use of code generation both at design time (when dragging a control onto a design surface generates code) and at run time (when LINQ generates the SQL statements that retrieve data). Code generation obviously makes developers more productive by reducing the amount of code a developer has to write, but it can be especially useful when more-or-less identical code is used in many solutions. As you implement this similar (but not identical) code in a new application, it’s all too easy to introduce new errors.

Even though developers can take advantage of all of the code-generation tools the .NET Framework uses in creating applications, very few make extensive use of code generation in their day-to-day development practice. There are a number of reasons for this: they worry that incorporating code generation requires a new toolset they don’t have the time to learn; they lack experience in recognizing problems that code generation will solve and in designing solutions that integrate generated code with “handwritten” code; and they understand that some code-generation solutions might require more time to maintain (or even to use) than will be saved by applying the solution.

The Microsoft Text Template Transformation Toolkit (T4) addresses many of these issues, providing a simple way to implement code-generation solutions that leverage tools and techniques developers are already comfortable with. T4 does this by recognizing that a typical solution involves two types of code: boilerplate code that doesn’t change from one code generation to another, and dynamic code that does change. T4 simplifies code generation by allowing developers to simply type the boilerplate code portion of a solution into a file. In T4, the dynamic code (typically a small part of a code-generation solution) is generated using a set of tags that look very similar to the tags an ASP.NET MVC developer would use in creating a View, or that an ASP.NET developer would use to embed server-side code in an .aspx file.

Using a T4-based solution leverages the skills you already have by letting you specify the inputs for generating code in whatever programming language you’re already using. T4 doesn’t generate code-neutral solutions (the code generated by a T4 solution is always in some specific programming language), but most developers don’t need code-neutral solutions.

Defining Code-Generation Solutions

While T4 makes it easier to create code-generation solutions, it doesn’t address the problems developers have in recognizing when a code-generation solution would be useful and in actually designing a solution. Problems that code generation solves generally share three characteristics:

  1. The generated code goes into a separate file from the developer’s code. This ensures the code-generation process won’t interfere with the developer’s code. Typically, this means the generated code goes into a new class that the developer will use from his “handwritten” code. It might make sense to generate a partial class that the developer can not only call but extend; even then, the developer’s code and the generated code are kept in separate files.
  2. The generated code is repetitive: The code in the solution is a template that can be repeated many times, often with small variations. This ensures the code generation is simpler and easier to maintain than the equivalent handwritten code.
  3. Compared to the handwritten solution, the generated solution requires far fewer inputs (ideally, no inputs at all—the code-generation solution determines what needs to be done from the environment). If the number of inputs is large or hard to determine, then developers might regard the handwritten solution as simpler than the generated solution.

Given these characteristics, three types of scenarios are worth investigating for code generation. Developers tend to focus on the first scenario (“the ultimate scenario”), in which just a few inputs are used to generate a lot of code that’s used frequently (think of the Entity Framework, for instance). In fact, the most fruitful code-generation opportunities fall into two other scenarios.

The second scenario is the most obvious one: When a great deal of code needs to be generated. In that case, avoiding writing multiple lines of repetitive code is clearly beneficial. While most code-generation solutions are used in multiple apps, solutions in this category can be worthwhile even if used in only a single application. Rather than write generalized code with many If statements to handle each situation—each If statement doubling the logical complexity of the code—a code-generation solution can generate the specific code required for each set of conditions. Instead of hand coding many classes that share an interface, code generation can be used to create the individual classes (assuming the classes share a common structure).

But the third type of scenario is the most common: When a few inputs will generate a little bit of code to be used in many applications. In this scenario, the amount of repeated code in any particular application is small but the activity the code supports is so common that the solution ends up generating a great deal of code—just not in any one application.

For example, here’s some typical code that ADO.NET developers write all the time:

string conString =
  System.Configuration.ConfigurationManager.
    ConnectionStrings["Northwind"].ConnectionString;

Though this is a trivial amount of code, it’s code that’s repeated—with just the name of the connection string changing—in application after application. Moreover, in the absence of any IntelliSense support for the connection string name, the code is error-prone: it’s open to “spelling counts” errors that will be discovered only at run time (probably when someone who has input to your appraisal is looking over your shoulder). Another example is implementing INotifyPropertyChanged, which leaves the developer open to “spelling counts” errors on each property.
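To make the INotifyPropertyChanged case concrete, here’s a minimal sketch of the handwritten pattern. The Customer class and its CompanyName property are hypothetical names chosen only for illustration; the point is that the property name travels as a string, so the compiler can’t check it:

```csharp
using System;
using System.ComponentModel;

// Hypothetical class used only to illustrate the pattern.
public class Customer : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private string companyName;

    public string CompanyName
    {
        get { return companyName; }
        set
        {
            companyName = value;
            // The name is a plain string: a typo here still compiles,
            // and listeners are simply notified about the wrong property.
            OnPropertyChanged("CompanyName");
        }
    }

    protected void OnPropertyChanged(string propertyName)
    {
        PropertyChangedEventHandler handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Customer customer = new Customer();
        // Report each notification so the string that was raised is visible.
        customer.PropertyChanged += (sender, e) => Console.WriteLine(e.PropertyName);
        customer.CompanyName = "PH&V";
    }
}
```

Every property in the class repeats the same few lines with only the name changing, which is the kind of repetition a code-generation solution can take over.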

This code for retrieving a connection string named Northwind would be more useful if some code-generation solution existed for creating a ConnectionManager class for each connection string, like this:

string conString = ConnectionManager.Northwind;

Once you recognize an opportunity for a code-generation solution, the next step is to write out a sample of the code that the solution would generate. In this case, the ConnectionManager class might look like this:

public partial class ConnectionManager
{
  public static string Northwind
  {
    get
    {
      return System.Configuration.ConfigurationManager.
        ConnectionStrings["Northwind"].ConnectionString;
    }
  }
}

This code meets the criteria for a code-generation solution: it’s repetitive (the property code repeats for each connection string) with only small changes (the name of the connection string and the property name) and the number of inputs is small (just the names of the connection strings).

Your First Code-Generation Template

A T4 solution can consist of a “code-generation package”: a file where the developer inserts the inputs to the code-generation process and a template file that generates code from those inputs. Both files are T4 template files and are created using the same programming tools you use to write your applications. This design allows you to separate your code-generation template from the file the developer uses to provide the inputs to the process.

To begin creating your code-generation solution, you must add the T4 template file that will generate your code to an application where you can test it—preferably, an application similar to the ones you expect your solution to be used in. To add the T4 template file, in the Add New Item dialog in Visual Studio, shown in Figure 1, add a Text Template specifying a name appropriate for your code-generation template (for example, ConnectionManagerGenerator). If your version of Visual Studio doesn’t have the Text Template option, add a new Text File (also found in the General section), giving the file the extension “.tt” to trigger T4 processing. If you do add a Text file you’ll get a warning message that you can safely ignore.

Figure 1 Adding a T4 Template

If you examine the properties for your new template file, you’ll find that its Custom Tool property has been set to TextTemplatingFileGenerator. This custom tool is run automatically by Visual Studio and is the host that manages the code-generation process. In T4, the contents of the template file are passed to the code-generation host, which puts the resulting generated code in the template file’s nested child file.

If you’ve added a Text file to your project, your template file will be empty; if you were able to add a Template file, it will contain two T4 directives, marked with <#@...#> delimiters (if you added a Text file, you’ll need to add these directives). These directives specify the language that the template will be written in (not the language for the generated code) and the extension for the child file. In this example, the two directives set the programming language for the template to Visual Basic and the file extension for the child file containing the generated code to .generated.cs:

<#@ template language="VB" #>
<#@ output extension=".generated.cs" #>

To create the traditional “Hello, World” application, just add the code to the template file (notice that while the language the template is being written in is Visual Basic, the template is generating C# code):

public class HelloWorld
{
  public static string SayHello(string value)
  {
    return "Hello, " + value;
  }
}

This sample uses only boilerplate code. In T4, boilerplate code is copied straight from the template into the code file. In Visual Studio 2010, that should happen when you switch away from the template file or save it. You can also trigger code generation either by right-clicking on the template file in Solution Explorer and selecting Run Custom Tool from its context menu or by clicking the Transform All Templates button at the top of Solution Explorer.

After triggering generation, if you open the template’s code file (which will now have the extension specified in the template’s output directive), you’ll find it contains the code specified in your template. Visual Studio will also have done a background compile of your new code, so you’ll find you can use the generated code from the rest of your application.

Generating Code

Boilerplate code isn’t sufficient, however. The ConnectionManager solution must dynamically generate a property for each connection string the application requires. To generate that code, you must add control code to manage the code-generation process and some variables that will hold the inputs from the developer using your code-generation solution.

The ConnectionManager uses an ArrayList (which I’ve called Connections) from the System.Collections namespace to hold the list of connection strings that form the input to the code-generation process. To import that namespace for use by code within your template, you use the T4 Import directive:

<#@ Import Namespace="System.Collections" #>

Now you can add any static code that begins your generated class. Because I’m generating C# code, the initial code for the ConnectionManager looks like this:

public partial class ConnectionManager
{

I must now add the control code that will dynamically generate the output code. Code that controls generation (code that is to be executed, rather than copied to the child file) must be enclosed in the <#...#> delimiters. In this example, to make it easy to distinguish between the control code and the code being generated, I’ve written the control code in Visual Basic (this is not a requirement of the code-generation process). The control code for the ConnectionManager solution loops through the Connections collection for each connection string:

<#
  For Each conName As String in Connections
#>

In addition to any control code in your template, you’ll also need to include any expressions whose values are to be incorporated into your dynamically generated code. In the ConnectionManager solution, the name of the connection string has to be incorporated into the property’s declaration and into the parameter passed to the ConnectionStrings collection. To evaluate an expression and have its value inserted into the boilerplate code, the expression must be enclosed in the <#=…#> delimiters. This example dynamically inserts the value from the conName variable into two places in the static code inside the For Each loop:

public static string <#= conName #>
{
  get
  {
    return System.Configuration.ConfigurationManager.
      ConnectionStrings["<#= conName #>"].ConnectionString;
  }
}
<#
  Next
#>
}

All that’s left is to define the ArrayList that will hold the list of connection string names. For this, I’ll use a T4 class feature. T4 class features are typically used to define helper functions but can also be used to define fields or any other class-level items that will be used during the code-generation process. Class features must appear at the end of a template, as this one does:

<#+
  Dim Connections As New ArrayList()
#>

This T4 template forms the first part of the ConnectionManager solution—the code-generation template. You now need to create the second part of the solution: The input file that the developer will use to provide the inputs to the code-generation process.

Using the Code-Generation Package

To provide a place for the developer to enter the inputs to the code-generation process, you add a second T4 template to the application in which you’re testing your code-generation solution. This template must have an Include directive that copies your code-generation template into this template. Because I named my code-generation template file ConnectionManagerGenerator, the input template file for the ConnectionManager solution looks like this:

<#@ template language="VB" #>
<#@ output extension=".generated.cs" #>
<#@ Import Namespace="System.Collections" #>
<#
#>
<#@ Include file="ConnectionManagerGenerator.tt" #>

When code generation is performed, the host process actually assembles an intermediary .NET program from the directives, control code and static code specified in your templates, and then executes the resulting program. It’s the output from that intermediary program that’s poured into your template’s child file. The result of using the Include directive is to merge your code-generation template (with its declaration of the Connections ArrayList) with the contents of this file to create that intermediary program. All the developer using your solution has to do is add the code to this template that will set the variables used by your code-generation template. This process allows developers to specify the inputs to the code generation using the programming language they’re used to.

For the ConnectionManager solution, the developer needs to add the names of the connection strings listed in the application’s app.config or web.config file to the Connections ArrayList. Because these settings are part of the control code that needs to be executed, that code must be enclosed within the <#...#> delimiters. The developer’s code must also precede the Include directive so that the variables are set before your code executes.

To generate a ConnectionManager for two connection strings called Northwind and PHVIS, the developer would add this code to the input template before the Include directive:

<#
  Me.Connections.Add("Northwind")
  Me.Connections.Add("PHVIS")
#>
<#@ Include file="ConnectionManagerGenerator.tt" #>
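The intermediary program that the host assembles from the two templates can be pictured roughly as follows. This is only a sketch under stated assumptions: it’s written in C# to match the other listings (with the template language set to VB, the real intermediary would be Visual Basic), the class name IntermediateProgramSketch is hypothetical, and a StringBuilder stands in for the host’s own output buffer. What it’s meant to show is the order of events: the developer’s control code runs first, boilerplate becomes write statements, and each <#= conName #> expression becomes a value spliced into the output.

```csharp
using System;
using System.Collections;
using System.Text;

// Hypothetical stand-in for the program the T4 host builds from the
// input template and the included generator template.
public class IntermediateProgramSketch
{
    // Comes from the class feature block (<#+ ... #>) in the generator template.
    private readonly ArrayList Connections = new ArrayList();

    // Assumption: a StringBuilder stands in for the host's output buffer.
    private readonly StringBuilder output = new StringBuilder();

    public string TransformText()
    {
        // 1. Control code from the input template runs first.
        Connections.Add("Northwind");
        Connections.Add("PHVIS");

        // 2. Boilerplate from the generator template becomes literal writes.
        output.AppendLine("public partial class ConnectionManager");
        output.AppendLine("{");

        // 3. The For Each control block becomes an ordinary loop.
        foreach (string conName in Connections)
        {
            // Each <#= conName #> expression is spliced into the text.
            output.AppendLine("  public static string " + conName);
            output.AppendLine("  {");
            output.AppendLine("    get");
            output.AppendLine("    {");
            output.AppendLine("      return System.Configuration.ConfigurationManager.");
            output.AppendLine("        ConnectionStrings[\"" + conName + "\"].ConnectionString;");
            output.AppendLine("    }");
            output.AppendLine("  }");
        }

        output.AppendLine("}");
        return output.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        // Prints the text that would be poured into the child file.
        Console.Write(new IntermediateProgramSketch().TransformText());
    }
}
```

Running the sketch prints a ConnectionManager class with one static property per connection string name, which is the shape of code you should expect to find in the input template’s child file (the exact whitespace in real T4 output depends on where line breaks fall in the template).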

You now have a code-generation package that consists of the code-generation template file and the input template file. Developers using your solution must copy both files into their application, set the variables in the input file, and close or save the input file to generate the solution code. An enterprising code-generation developer could set up the code-generation package as a Visual Studio item template that includes both template files. While not appropriate to the ConnectionManager solution, if a developer needs to generate another set of code based on different inputs, he would just need to make a second copy of the input file to hold the second set of inputs.

There’s one wrinkle in this solution’s structure: Any application that uses this solution will have both the input template and the code-generation template. In the ConnectionManager solution, if both templates generate code that Visual Studio compiles, the two resulting code files will both define a class called ConnectionManager and the application won’t compile. There are a number of ways of preventing this, but the simplest way is to alter your code-generation template so that its generated code file has an extension that Visual Studio won’t recognize. Changing the output directive in the code-generation template file does the trick:

<#@ output extension=".ttinclude" #>

Code-Generation Tools and Resources

Besides the MSDN Library pages, your best source for information on using T4 is Oleg Sych’s blog at olegsych.com. I’ve certainly come to depend on his insights (and tools) in developing my own code-generation solutions. His T4 Toolbox includes several templates for developing T4 solutions (including a template for generating multiple output files from a single T4 template and other tools for managing the code-generation process). Sych’s toolkit also includes packages for several code-generation scenarios.

Visual Studio essentially treats T4 templates as text files—which means you don’t get IntelliSense support or highlighting or, really, anything that developers expect from an editor. In Visual Studio Extension Manager, you’ll find several tools that will enhance your T4 development experience. Both Visual T4 from Clarius Consulting (bit.ly/maZFLm) and T4 Editor from Devart (bit.ly/wEVEVa) will give you many of the features you take for granted in an editor. Alternatively, you can get the T4 Editor (either the free or PRO EDITION) from Tangible Engineering (see Figure 2) at bit.ly/16jvGY, which includes a visual designer you can use to create code-generation packages from Unified Modeling Language (UML)-like diagrams.

Figure 2 The Default Editor for T4 Template Files (Left) Isn’t Much Better than NotePad—Adding the Tangible Editor (Right) Gives You the Kind of Features You Expect in Visual Studio

As with any other code-based solution, it’s unlikely your solution will work the first time. Compile errors in your template control code are reported in the Errors list after you select Run Custom Tool from a template file’s context menu. However, even if your control code compiles, you might find that the template’s child file is empty except for the word ErrorGeneratingOutput. This indicates that the control code in your code-generation package is generating an error when it executes. Unless your error is obvious, you’re going to need to debug that control code.

To debug your generation package, you must first set the debug attribute on the template directive to True, like this:

<#@ template language="VB" debug="True" #>

You can now set a breakpoint in your control code and have Visual Studio respect it. Then, the most reliable way to debug your template is to start a second instance of Visual Studio and, from the Debug menu, select Attach to Process. In the resulting dialog, you select the other running copy of devenv.exe and click the Attach button. You can now return to your original copy of Visual Studio and use Run Custom Tool from your input file’s context menu to start executing your code.

If that process doesn’t work, you can trigger debugging by inserting this line into your template’s control code:

System.Diagnostics.Debugger.Break()

With Visual Studio 2010 on Windows 7, you should add this line before your call to the Break method:

System.Diagnostics.Debugger.Launch()

When you execute your template code, this line brings up a dialog box that lets you restart Visual Studio or debug it. Selecting the debug option will start a second copy of Visual Studio already attached to the process running your template code. Your initial instance of Visual Studio will be disabled while you’re debugging your code. Unfortunately, when your debugging session is over, Visual Studio will stay in that disabled mode. To prevent this, you’ll need to alter one of the Visual Studio settings in the Windows Registry (see Sych’s post on debugging T4 at bit.ly/aXJwPx for the details). You’ll also need to remember to delete this statement once you’ve fixed your problem.

Of course, this solution still counts on the developer entering the names of the connection strings correctly into the input file. A better solution would have ConnectionManager include code that reads the connection strings from the application’s config file, eliminating the need for the developer to enter any inputs. Unfortunately, because code is being generated at design time rather than at run time, you can’t use the ConfigurationManager to read the config file and will need to use the System.Xml classes to process the config file. You’ll also need to add an assembly directive to reference the assembly that holds those classes, along with an Import directive like the one I used earlier to get ArrayList out of System.Collections. You can also add references to your own custom libraries by setting the assembly directive’s name attribute to the full path to your DLL:

<#@ assembly name="C:\PHVIS\GenerationUtilities.dll" #>
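The config-reading approach described above might look something like the following sketch. It’s shown as standalone C# rather than as template control code, and it makes several assumptions: the ConnectionNameReader class and ReadNames method are hypothetical names, the config file uses the usual configuration/connectionStrings/add layout with no XML namespace on the root element, and locating and loading the project’s config file at design time is left outside the sketch (the sample XML is loaded from a string instead).

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper; similar logic could live in a template's class feature block.
public static class ConnectionNameReader
{
    // Returns the name attribute of every <add> element under
    // <configuration>/<connectionStrings>.
    public static List<string> ReadNames(XmlDocument config)
    {
        List<string> names = new List<string>();
        // Assumption: the config root carries no XML namespace.
        XmlNodeList nodes = config.SelectNodes("/configuration/connectionStrings/add");
        foreach (XmlNode node in nodes)
        {
            XmlAttribute name = node.Attributes["name"];
            if (name != null)
            {
                names.Add(name.Value);
            }
        }
        return names;
    }
}

public static class Program
{
    public static void Main()
    {
        // Sample config content standing in for app.config or web.config.
        XmlDocument config = new XmlDocument();
        config.LoadXml(
            "<configuration><connectionStrings>" +
            "<add name=\"Northwind\" connectionString=\"placeholder\" />" +
            "<add name=\"PHVIS\" connectionString=\"placeholder\" />" +
            "</connectionStrings></configuration>");

        foreach (string name in ConnectionNameReader.ReadNames(config))
        {
            Console.WriteLine(name);
        }
    }
}
```

With the names coming from the config file, the list that drives the For Each loop no longer depends on anything the developer types, which removes the last “spelling counts” risk from the package.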

These are the essentials for adding code generation to your toolkit and increasing your productivity—along with your code’s quality and reliability. T4 makes it easy for you to get started by letting you create code-generation solutions using a familiar toolset. The biggest problem you’ll face in using T4 is learning to recognize opportunities for applying these tools.


Peter Vogel is a principal in PH&V Information Services. His last book was “Practical Code Generation in .NET” (Addison-Wesley Professional, 2010). PH&V Information Services specializes in facilitating the design of service-based architectures and in integrating .NET technologies into those architectures. In addition to his consulting practice, Vogel wrote Learning Tree International’s SOA design course, which is taught worldwide.

Thanks to the following technical expert for reviewing this article: Gareth Jones