Write your own Code Analyzer with Roslyn

Introduction

We are all familiar with the diagnostics the compiler provides while we develop an application; they come in the form of warnings, errors, or code suggestions. A diagnostic, or code analyzer, inspects our open files for various metrics such as style, maintainability, and design. Sometimes, however, we need a tailor-made analysis for our specific situation, tool, or project. Writing our own analyzer shortens the feedback loop for our developers to typing time 😃

Roslyn provides us with tools to create custom diagnostics, called analyzers, that target the C# language. An analyzer can report any violation it detects in our code, whether it concerns code style or a specific way of using a library. This inspection happens at design time, while the files are open. There are generally two types of analyzers: syntax analyzers and semantic analyzers. They are implemented the same way; the only difference is which actions we subscribe to in order to be notified. We will talk about them later on, but for now be aware that the example in this post is a semantic analyzer.

The Problem Space

In our team we decided to use the Maybe type wherever a method returns either a value or nothing. This is a special type from functional programming that represents either the existence of a value or nothing; you might have heard of it under other names such as Optional or Option. What we observed in some of our projects is that, instead of returning None, developers sometimes threw an exception, which was something we wanted to avoid. Since it happened more often than not, we decided to write a code analyzer to prevent this at design time (while the developer is writing the code). Consider the following code snippet as an example:

public Maybe<Customer> Find(Guid customerId)
{
    var valueExists = Customers.TryGetValue(customerId, out var customer);
    if (!valueExists)
        throw new InvalidOperationException($"Customer does not exist, CustomerId:{customerId}");

    return Maybe.Some(customer!);
}

We want our analyzer to draw a red or orange squiggle under the throw statement, indicating that this pattern is not valid and that the following code should be used instead. In this post we focus on the former part (the diagnostic); another post explores Code Fix Providers, which cover the latter.

public Maybe<Customer> Find(Guid customerId)
{
    var valueExists = Customers.TryGetValue(customerId, out var customer);
    if (!valueExists)
        return Maybe.None;
    return Maybe.Some(customer!);
}
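For readers who have not met a Maybe type before, here is a minimal sketch of what such a type could look like. It is not our team's actual implementation: NoneType, HasValue, and Match are hypothetical names, and only the namespace Sample.Fx, Maybe.Some, and Maybe.None are taken from the snippets in this post.

```csharp
using System;

namespace Sample.Fx
{
    // Hypothetical marker type: lets `return Maybe.None;` convert to any Maybe<T>.
    public readonly struct NoneType { }

    public static class Maybe
    {
        public static NoneType None => default;

        public static Maybe<T> Some<T>(T value) => new Maybe<T>(value);
    }

    public sealed class Maybe<T>
    {
        private readonly T _value;

        public bool HasValue { get; }

        // Represents "nothing".
        private Maybe()
        {
            _value = default!;
            HasValue = false;
        }

        // Represents an existing value; null is rejected so that Some always carries a value.
        internal Maybe(T value)
        {
            if (value == null) throw new ArgumentNullException(nameof(value));
            _value = value;
            HasValue = true;
        }

        // Allows `Maybe.None` to be returned from any method declared as Maybe<T>.
        public static implicit operator Maybe<T>(NoneType none) => new Maybe<T>();

        // Forces the caller to handle both cases explicitly.
        public TResult Match<TResult>(Func<T, TResult> some, Func<TResult> none) =>
            HasValue ? some(_value) : none();
    }
}
```

With a shape like this, the caller decides what "nothing" means at the call site (for example through Match), which is the reason a throw inside a Maybe-returning method defeats the purpose of the type.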

Understanding the solution

To understand how to implement our diagnostic, it is good to first look at the syntax tree and see what kind of information we can gain from it. If you are not familiar with sharplab.io, check out my post about creating syntax trees, or watch this video.

If we copy the above-mentioned code into sharplab.io and click on the throw keyword, the right-side panel should show the SyntaxNode representing this statement: ThrowStatement

The yellow-green selected area shows the syntax node we are interested in, and the purple area is the IOperation to which we need to subscribe an action that runs once its semantic analysis completes. We will see it in action later.

Next, we need to traverse the tree upward until we reach a syntax node of type MethodDeclarationSyntax and examine its return type. It should be a Maybe<T>, where T could be anything, as in Maybe<Customer>; what matters to us is its original definition, Maybe<T>.

After getting the return type, we just need to compare its ITypeSymbol with the type symbol (INamedTypeSymbol) representing the original Maybe<T>; if they are the same type, we tell the Diagnostic API to report a warning.

INamedTypeSymbol inherits from ITypeSymbol and represents types other than arrays, pointers, and type parameters [1]

Now that we have some understanding of what we want to do, let's create our analyzer. 😃

Creating the Analyzer

The first step, as always, is to create a project to host the analyzer. Depending on your IDE of choice there might be different templates; in the end, the .csproj file should look similar to the following, in which the TargetFramework should be netstandard2.0 and <IsRoslynComponent> should be true:

<Project Sdk="Microsoft.NET.Sdk">

    <PropertyGroup>
        <TargetFramework>netstandard2.0</TargetFramework>

    <!-- skipped for clarity  -->

        <IsRoslynComponent>true</IsRoslynComponent>
        <EnforceExtendedAnalyzerRules>true</EnforceExtendedAnalyzerRules>
    </PropertyGroup>

    <ItemGroup>
        <PackageReference Include="Microsoft.CodeAnalysis.Analyzers" Version="3.11.0">
            <PrivateAssets>all</PrivateAssets>
            <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
        </PackageReference>
        <PackageReference Include="Microsoft.CodeAnalysis.CSharp" Version="4.12.0"/>
        <PackageReference Include="Microsoft.CodeAnalysis.CSharp.Workspaces" Version="4.12.0"/>
    </ItemGroup>

    <!-- skipped for clarity  -->
</Project>

Afterwards, create a C# class that inherits from the DiagnosticAnalyzer class and is marked with the [DiagnosticAnalyzer] attribute. It is important to define a unique diagnostic id; it can be used later, when implementing a code fix, to relate that code fix to this analyzer. Then define a DiagnosticDescriptor, which describes the characteristics of this diagnostic, such as its usage, the message to show, and its default severity level.

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public class MaybeSemanticAnalyzer : DiagnosticAnalyzer
{
    // Preferred format of DiagnosticId is Your Prefix + Number, e.g. CA1234.
    public const string DiagnosticId = "SHG001";

    // omitted for clarity	
		
    private static readonly DiagnosticDescriptor Rule = new(DiagnosticId, Title, MessageFormat, Category,
        DiagnosticSeverity.Warning, isEnabledByDefault: true, description: Description
    );
		
    // omitted for clarity	
}

By inheriting from DiagnosticAnalyzer, we are required to override two members. The first is the Initialize method, in which we subscribe to compile-time actions (this is where syntax analyzers and semantic analyzers differ, because they subscribe to different actions). The second is the SupportedDiagnostics array: the rules returned by this property are the ones this analyzer can report, which means an analyzer could potentially report more than one diagnostic.

    // omitted for clarity

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics { get; } = [Rule];

    public override void Initialize(AnalysisContext context)
    {
        // You must call this method to avoid analyzing generated code.
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);

        // You must call this method to enable the Concurrent Execution.
        context.EnableConcurrentExecution();

        // Subscribe to a semantic (compile-time) operation action, here for throw operations.
        context.RegisterOperationAction(AnalyzeThrowStatements, OperationKind.Throw);
    }

   // omitted for clarity

The second parameter of RegisterOperationAction accepts an OperationKind; because our diagnostic analyzes throw statements, we register only for operations of kind Throw. The first argument is a method accepting an OperationAnalysisContext, which gives us access to the operation that is the subject of the analysis and to its semantic model. There is also a Compilation property of type Compilation, which represents a single invocation of the compiler; this is the compilation that includes the subject operation. Through the Operation property we also have access to the syntax node under inspection, which is the node that was analyzed to produce the operation.

As mentioned in the solution section, we first want information about the return type of the method. To get it, we navigate up to the MethodDeclarationSyntax and then get the IMethodSymbol (the symbol representing a method or a method-like member such as a constructor) to access its return type.
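To make the difference between the two analyzer types concrete, the following sketch places a syntax registration next to a semantic (operation) registration. RegistrationSketch and its two methods are hypothetical names used only for illustration; the callbacks do nothing useful and only show what each kind of action receives.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;
using Microsoft.CodeAnalysis.Operations;

// Hypothetical helper class, only here to contrast the two registration styles.
public static class RegistrationSketch
{
    // Syntax analyzer style: the callback receives a raw syntax node.
    public static void RegisterSyntaxVariant(AnalysisContext context)
    {
        context.RegisterSyntaxNodeAction(nodeContext =>
        {
            var throwStatement = (ThrowStatementSyntax)nodeContext.Node;

            // Only the shape of the code is known here. Type information has to be
            // requested explicitly from nodeContext.SemanticModel.
            _ = throwStatement.Expression;
        }, SyntaxKind.ThrowStatement);
    }

    // Semantic analyzer style: the callback receives an IOperation.
    public static void RegisterSemanticVariant(AnalysisContext context)
    {
        context.RegisterOperationAction(operationContext =>
        {
            var throwOperation = (IThrowOperation)operationContext.Operation;

            // Exception is null for a bare rethrow (`throw;`).
            var thrownType = throwOperation.Exception?.Type;
            _ = thrownType;
        }, OperationKind.Throw);
    }
}
```

One practical difference worth noting: SyntaxKind.ThrowStatement matches only throw statements, while throw expressions have their own syntax kind (SyntaxKind.ThrowExpression). Registering for OperationKind.Throw should cover both forms, which is one reason the operation-based approach is convenient here.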


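private void AnalyzeThrowStatements(OperationAnalysisContext context)
{
    context.CancellationToken.ThrowIfCancellationRequested();

    // A throw can also live outside a method declaration (e.g. in a property accessor),
    // so bail out when no containing method or semantic model is available.
    var containingMethodSyntax = GetContainingMethod(context.Operation.Syntax);
    var semanticModel = context.Operation.SemanticModel;
    if (containingMethodSyntax is null || semanticModel is null)
        return;

    var containingMethodSymbol = semanticModel.GetDeclaredSymbol(containingMethodSyntax, context.CancellationToken);

    var returnType = containingMethodSymbol?.ReturnType;

    // omitted for clarity
}

private static MethodDeclarationSyntax? GetContainingMethod(SyntaxNode syntax)
{
    // Walk up the parent chain; stop at the first method declaration or at the root.
    for (var node = syntax.Parent; node is not null; node = node.Parent)
    {
        if (node is MethodDeclarationSyntax methodDeclarationSyntax)
            return methodDeclarationSyntax;
    }

    return null;
}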

GetContainingMethod walks up the parent chain and stops when it finds an ancestor of type MethodDeclarationSyntax. Once we have this syntax node, we can ask the SemanticModel that was used to generate the operation for the symbol (IMethodSymbol) that represents it, which gives us access to the ReturnType of the method.

PS: To get the symbol for any declaration node, such as MethodDeclarationSyntax or ClassDeclarationSyntax, we should use GetDeclaredSymbol from the SemanticModel.

The next step is to get the type symbol for the Maybe<T> class. Since we do not have access to its syntax node, as it lives in another file, we can ask the compilation to provide its type metadata by passing its fully qualified metadata name: Sample.Fx.Maybe`1. The backtick marks a generic type, and the number that follows is its number of type parameters:
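As a possible alternative to walking the syntax tree by hand, OperationAnalysisContext also exposes a ContainingSymbol property, which is the symbol of the declaration that contains the operation. The sketch below is not the approach used in this post; ContainingSymbolSketch and GetContainingReturnType are hypothetical names, and how ContainingSymbol behaves for lambdas, local functions, and accessors is an assumption you should verify with tests before relying on it.

```csharp
#nullable enable
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;

public static class ContainingSymbolSketch
{
    // Hypothetical helper: returns the return type of the method that contains
    // the analyzed operation, or null when the containing symbol is not a method.
    public static ITypeSymbol? GetContainingReturnType(OperationAnalysisContext context)
    {
        return context.ContainingSymbol is IMethodSymbol method
            ? method.ReturnType
            : null;
    }
}
```

The trade-off is that this version needs neither the syntax walk nor a GetDeclaredSymbol call, at the cost of being less explicit about which kinds of declarations are being considered.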

// GetTypeByMetadataName returns null when the type cannot be resolved in this compilation.
INamedTypeSymbol? maybeType = context.Compilation.GetTypeByMetadataName("Sample.Fx.Maybe`1");

The last step in implementing our diagnostic is to compare these two types. If they differ, we simply return; otherwise we report the diagnostic at the location of the throw's SyntaxNode in the file, which is accessible via the node's GetLocation method.

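// Bail out when a symbol could not be resolved or the method does not return Maybe<T>.
if (returnType is null || maybeType is null ||
    !SymbolEqualityComparer.Default.Equals(returnType.OriginalDefinition, maybeType))
    return;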

var diagnostic = Diagnostic.Create(Rule, context.Operation.Syntax.GetLocation());
context.ReportDiagnostic(diagnostic);
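The snippets above look up Maybe`1 every time a throw operation is analyzed. A possible refinement, sketched below under stated assumptions, is to resolve the type once per compilation with RegisterCompilationStartAction and to register the operation action only when the type is actually available. CompilationStartSketch and Register are hypothetical names, the rule parameter stands for the analyzer's own DiagnosticDescriptor (it must be one of the descriptors returned from SupportedDiagnostics), and the sketch uses ContainingSymbol in place of the syntax walk purely to stay short.

```csharp
#nullable enable
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;

public static class CompilationStartSketch
{
    // Hypothetical helper, meant to be called from Initialize in place of the
    // direct RegisterOperationAction call.
    public static void Register(AnalysisContext context, DiagnosticDescriptor rule)
    {
        context.RegisterCompilationStartAction(startContext =>
        {
            // Resolved once per compilation instead of once per throw operation.
            var maybeType = startContext.Compilation.GetTypeByMetadataName("Sample.Fx.Maybe`1");
            if (maybeType is null)
                return; // Maybe<T> is not available in this compilation, nothing to analyze.

            startContext.RegisterOperationAction(operationContext =>
            {
                // Assumption: ContainingSymbol is the method that encloses the throw.
                if (!(operationContext.ContainingSymbol is IMethodSymbol method))
                    return;

                // Compare the unbound generic definition, so Maybe<Customer> matches Maybe<T>.
                if (!SymbolEqualityComparer.Default.Equals(method.ReturnType.OriginalDefinition, maybeType))
                    return;

                operationContext.ReportDiagnostic(
                    Diagnostic.Create(rule, operationContext.Operation.Syntax.GetLocation()));
            }, OperationKind.Throw);
        });
    }
}
```

A side effect of this shape is that projects which do not reference the Maybe type pay almost nothing for the analyzer, because no operation action is registered for them.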

The final steps are to reference the analyzer project from the project that needs it and then build the solution/project. Keep in mind to mark the reference with OutputItemType="Analyzer" and ReferenceOutputAssembly="false".

<ItemGroup>
        <ProjectReference OutputItemType="Analyzer" ReferenceOutputAssembly="false"
                          Include="..\Sample.Analyzers\Sample.Analyzers\Sample.Analyzers.csproj"/>
</ItemGroup>

Now, if you navigate to the file containing the throw statement, you should see a yellow squiggle under the whole throw statement.

Conclusion

In software development we always want a short feedback loop, whether the feedback comes from the end users of our application or reaches us developers as the end users of the compiler. Often we want a custom code analyzer for specific scenarios that cannot be covered by other available tools, such as an .editorconfig file. In these cases, we can leverage the power of Roslyn analyzers. To create one, we just need to inherit from the DiagnosticAnalyzer class and override the Initialize method, where we subscribe to specific operations that are the outcome of the semantic analysis phase; when that analysis completes, the method we provided is executed.

Thanks for reading so far, enjoy coding and Dametoon Garm [2]

Resources

Create Syntax Trees using Roslyn APIs

Sometimes when we want to generate code whether it is source generators or code fixes for a code analyzer, it is required to know how a syntax tree could be created using the Roslyn Compiler API. There are two ways, that we will discuss them in this post.

Test Your Roslyn Code Analyzers

In the previous post we have seen how it is possible to create a diagnostic analyzer using Roslyn APIs. Being a TDD advocate it would be disappointing not to talk about how we could test our code analyzer! So let's take a look into it.

Roslyn Code Fix Providers to the Rescue

In the previous post in the Roslyn series we have explored how we could create code analyzers and how to test them, to reduce the cognitive load on our development team! However, what if we take an extra mile and create a Code Fix Provider to provide some suggestions for the developers! Shall we?