The black box
Compilers used to be a black box: we had no access to the information inside them. That came with drawbacks. Companies providing tools had to write their own version of the compiler, simulating the exact behavior of the real one, to offer IDE features such as colorization, go to definition, and refactoring. Furthermore, developers building frameworks or tools had no proper way, or at least no easy way, to extend the compiler with meaningful diagnostics or code suggestions for the users of their framework or library (other developers).
To compile our source code, the compiler runs a pipeline of several phases, each of which produces output for the next. The Parser tokenizes the source code and parses it into a syntax tree that follows the grammar of the language. The Declaration phase analyzes source and imported metadata to form named symbols. The Binder matches identifiers in the code to symbols. Last but not least, the Emitter emits an assembly with all the information the compiler has accumulated across the phases.

Roslyn opened the box by providing several APIs (Compiler API, Diagnostic API, Scripting API and Workspace API). They give us the capability not only to access the information collected in each phase, but also to enrich each phase based on our own requirements and analysis: providing specific diagnostics (warnings or errors), offering code suggestions, or even generating source code during a compilation and adding it to the program. The following shows how the Compiler API, one of Roslyn's four APIs, relates to each phase of the compiler pipeline.

The Syntax Tree API provides access to the syntax tree, the Symbol API exposes a hierarchical symbol table, the Binding and Flow Analysis APIs expose the result of the compiler's semantic analysis, and finally the Emit API produces IL byte code. Next, we will use these APIs to create a program that accepts text as input and compiles it to a .NET assembly.
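To make the mapping a little more concrete, here is a minimal sketch touching the first three layers: syntax tree, symbols, and binding. The `Calculator` sample and all variable names are hypothetical and not part of the article's program; the snippet assumes the Microsoft.CodeAnalysis.CSharp package is referenced.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical sample input, chosen so that only the core library is needed.
var demoTree = CSharpSyntaxTree.ParseText(
    "class Calculator { int Twice(int value) => value + value; }");

var demoCompilation = CSharpCompilation.Create(
    "PipelineDemo",
    syntaxTrees: new[] { demoTree },
    references: new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });

// Syntax Tree API: find the class declaration node in the parsed tree.
var classNode = demoTree.GetRoot().DescendantNodes().OfType<ClassDeclarationSyntax>().First();

// Symbol API: ask the semantic model for the symbol that the node declares.
var semanticModel = demoCompilation.GetSemanticModel(demoTree);
var classSymbol = semanticModel.GetDeclaredSymbol(classNode);

// Binding: resolve the first use of 'value' in the method body to its symbol.
var firstUse = classNode.DescendantNodes().OfType<IdentifierNameSyntax>().First();
var boundSymbol = semanticModel.GetSymbolInfo(firstUse).Symbol;

Console.WriteLine($"Declared type: {classSymbol?.Name}");
Console.WriteLine($"'{firstUse.Identifier.Text}' binds to a {boundSymbol?.Kind}");
```

The semantic model is requested per syntax tree, which is why the tree is passed to `GetSemanticModel`.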
Build Code with Code
Needless to say, the first step is adding a reference to Microsoft.CodeAnalysis.CSharp, the NuGet package that gives us access to the Compiler API:
<PackageReference Include="Microsoft.CodeAnalysis.CSharp" Version="4.12.0"/>
The next step is to create a syntax tree. Given a string variable holding our code, we can pass it to SyntaxFactory.ParseSyntaxTree; an alternative is CSharpSyntaxTree.ParseText:
using System.Reflection;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

const string sourceCode = """
    using System;
    namespace BuildingCode;
    public class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello Sir Ta Piaz!");
        }
    }
    """;
// 1. Create a Syntax Tree
var syntaxTree = SyntaxFactory.ParseSyntaxTree(sourceCode);
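As a small illustration of the alternative mentioned above, the following sketch parses with CSharpSyntaxTree.ParseText and then inspects the resulting tree. The sample source and the variable names are assumptions made for this example only.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical input, kept deliberately small.
const string sampleCode = """
    public class Program
    {
        static void Main(string[] args) { }
    }
    """;

// Alternative to SyntaxFactory.ParseSyntaxTree.
var parsedTree = CSharpSyntaxTree.ParseText(sampleCode);
var root = parsedTree.GetCompilationUnitRoot();

// Walk the tree and collect the names of all declared methods.
var methodNames = root.DescendantNodes()
    .OfType<MethodDeclarationSyntax>()
    .Select(method => method.Identifier.Text)
    .ToList();

// Parsing never throws on bad input; syntax errors surface as diagnostics on the tree.
var syntaxErrors = parsedTree.GetDiagnostics().ToList();

Console.WriteLine($"Methods: {string.Join(", ", methodNames)}");
Console.WriteLine($"Syntax errors: {syntaxErrors.Count}");
```

Checking the tree's diagnostics at this point can catch malformed input before a compilation is even created.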
After creating a syntax tree, we can create a compilation. Keep in mind that a compilation very likely needs to reference other assemblies to succeed; in our case System.Private.CoreLib, System.Console and System.Runtime are required:
var coreLib = typeof(object).Assembly;
var console = typeof(Console).Assembly;
// Match the simple name exactly so that assemblies such as System.Runtime.Extensions are not picked up by accident
var systemRuntime = AppDomain.CurrentDomain.GetAssemblies().First(a => a.GetName().Name == "System.Runtime");
// 2. Create a Compilation
var compilation = CSharpCompilation.Create(
    assemblyName: "BuildingCode",
    options: new CSharpCompilationOptions(OutputKind.ConsoleApplication),
    syntaxTrees: [syntaxTree],
    references:
    [
        MetadataReference.CreateFromFile(coreLib.Location),
        MetadataReference.CreateFromFile(console.Location),
        MetadataReference.CreateFromFile(systemRuntime.Location)
    ]
);
Now that we have formed the compilation, it is time to tell the last phase to emit the assembly. This usually produces a file (via a FileStream), but for the sake of simplicity our example keeps it in memory, using a MemoryStream:
// 3. Emit the Assembly
using var ms = new MemoryStream();
var result = compilation.Emit(ms);
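For the file-based variant mentioned above, a minimal sketch could look like the following. The `Answer` sample, the assembly name, and the output location in the temp folder are all assumptions for illustration; the sketch compiles a small library rather than the article's console program so that a single reference is enough.

```csharp
using System;
using System.IO;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Emit;

// Hypothetical library source that only depends on the core library.
var libraryTree = CSharpSyntaxTree.ParseText(
    "public static class Answer { public static int Value() => 42; }");

var libraryCompilation = CSharpCompilation.Create(
    "EmitToFileDemo",
    syntaxTrees: new[] { libraryTree },
    references: new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
    options: new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

// Assumed output location; any writable path would do.
var outputPath = Path.Combine(Path.GetTempPath(), "EmitToFileDemo.dll");

EmitResult emitResult;
using (var file = File.Create(outputPath))
{
    // Emit writes the assembly into whatever stream it is given.
    emitResult = libraryCompilation.Emit(file);
}

if (emitResult.Success)
{
    Console.WriteLine($"Assembly written to {outputPath}");
}
else
{
    // A failed emit can leave an incomplete file behind, so clean it up.
    File.Delete(outputPath);
    Console.WriteLine("Emit failed; see emitResult.Diagnostics for details.");
}
```

Checking `Success` before using the output matters in the in-memory case as well, because a failed emit does not throw.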
If the whole pipeline succeeds, we have our assembly! To check that our program compiled successfully and produced an executable, let's load the compiled assembly and invoke its entry point (since it is a console application):
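// ToArray returns only the bytes written; GetBuffer would also include unused buffer capacity
var assembly = Assembly.Load(ms.ToArray());
// Main takes a string[] parameter, so pass an empty argument list
assembly.EntryPoint?.Invoke(null, new object[] { Array.Empty<string>() });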
Diagnostic API
The Diagnostic API is another API of the platform that works hand in hand with the Compiler API. Its focus is on warnings, errors, and code suggestions: we can use it to report additional ones of our own, for example to ensure that code adheres to the coding standards and other rules defined by the development team, and we can also use it to access the diagnostics produced by the compiler itself. The following code snippet shows how to access that information:
foreach (var diagnostic in result.Diagnostics)
{
    Console.ForegroundColor = diagnostic.Severity == DiagnosticSeverity.Error
        ? ConsoleColor.Red
        : ConsoleColor.Yellow;
    Console.Error.WriteLine(diagnostic);
    Console.ResetColor();
}
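Each diagnostic also carries structured data, not just a printable message. The sketch below, built around a deliberately broken and entirely hypothetical input, reads the diagnostic id, position, and message individually, and obtains the diagnostics from the compilation without emitting anything.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical input: 'missing' is never declared, so binding reports CS0103.
var brokenTree = CSharpSyntaxTree.ParseText("class Broken { int Get() => missing; }");

var brokenCompilation = CSharpCompilation.Create(
    "DiagnosticsDemo",
    syntaxTrees: new[] { brokenTree },
    references: new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
    options: new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

// GetDiagnostics runs the analysis phases without producing an assembly.
var errors = brokenCompilation.GetDiagnostics()
    .Where(d => d.Severity == DiagnosticSeverity.Error)
    .ToList();

foreach (var error in errors)
{
    // Line and character positions are zero-based, so add 1 for display.
    var start = error.Location.GetLineSpan().StartLinePosition;
    Console.WriteLine($"{error.Id} at ({start.Line + 1},{start.Character + 1}): {error.GetMessage()}");
}
```

Filtering on `Id` or `Severity` like this is one way a tool could decide which diagnostics to surface to its users.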
Hint: Remove one of the referenced assemblies from the compilation and run the application again; you should see some errors:
(9,17): error CS0012: The type 'Object' is defined in an assembly that is not referenced. You must add a reference to assembly 'System.Runtime, Version=9.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'.
(9,9): error CS0012: The type 'Decimal' is defined in an assembly that is not referenced. You must add a reference to assembly 'System.Runtime, Version=9.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'.
Conclusion
Roslyn provides a set of APIs (Compiler API, Diagnostic API, Scripting API and Workspace API) that help us interact more meaningfully with the compiler and let us use the same APIs the compiler itself uses to create our own tools and analyzers.
Thanks for reading so far, enjoy coding and Dametoon Garm [1] 😊
Resources
- You can find the source code on GitHub
- Read more about The .NET Compiler Platform SDK
- Understand the .NET Compiler Platform SDK model