Giving semantic meaning to C# with Roslyn

Marcus • June 16, 2020


Now that we’ve had a quick peek at how syntax analysis with Roslyn works, we’ve established a baseline level of code understanding that, on a purely functional level, can tell you about as much as a good structural text editor.

We can go a step further though - by giving meaning to abstract expressions.

Understanding semantics

To attach meaning to the symbols found in the code, Roslyn provides us with the semantic model. If, in the last post, you couldn’t (yet) load the solution with its project or package references, that becomes important now: to give deeper meaning to the code, Roslyn needs to know what code could possibly be invoked at any point.
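For quick experiments that don’t need a whole solution, a semantic model can also come from a small in-memory compilation. The following is only a minimal sketch, assuming the Microsoft.CodeAnalysis.CSharp NuGet package is installed; the class name SemanticModelSketch and the assembly name are made up for illustration, and only the core library is referenced, so anything beyond built-in types would need more references.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper, not part of the original article's code.
public static class SemanticModelSketch
{
    // Builds a semantic model for a single in-memory source text.
    public static SemanticModel CreateModel(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);

        // Reference the core library so that built-in types
        // such as string and int can be resolved.
        MetadataReference coreLib =
            MetadataReference.CreateFromFile(typeof(object).Assembly.Location);

        CSharpCompilation compilation = CSharpCompilation.Create(
            "SemanticSketch", // assumed assembly name, any name works
            syntaxTrees: new[] { tree },
            references: new[] { coreLib });

        return compilation.GetSemanticModel(tree);
    }
}
```

When a solution is loaded through a workspace instead, the compilation (and with it the references) comes from the project, which is why resolving references matters there.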

Let’s try to look into it with the following example:

public class Demo
{
  public void DoSemanticStuff()
  {
    var obj = new B();
    string name = "X";
    Console.WriteLine(obj.SayHello(name));
  }

  class A
  {
    public virtual string SayHello(string name) => $"Hello, {name}";
    public string SayHello(int x) => $"Who, {x}?";
  }

  class B : A
  {
    public override string SayHello(string name)
      => $"Good Morning, {name}";
  }
}

That’s not exactly trivial to grasp, especially if the called method lives in a different file, a different project, or another class library. Here’s what I want to know:

Looking into our symbols

The easiest of the three to answer is provided through the symbol information available in our semantic model. Here’s what a simple implementation could look like for a first test:

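using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public class SemanticVisitor : CSharpSyntaxWalker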
{
  private readonly SemanticModel _model;

  public SemanticVisitor(SemanticModel model)
  {
    _model = model;
  }

  public override void VisitInvocationExpression
    (InvocationExpressionSyntax node)
  {
    if (node.ToString().StartsWith("obj.SayHello("))
    {
      Console.WriteLine(node);
      Console.WriteLine(_model.GetSymbolInfo(node).Symbol);
    }

    base.VisitInvocationExpression(node);
  }
}

Okay, maybe not the best code we’ve seen so far, since it checks, very literally, that the text of the call starts with obj.SayHello(. Refactoring your code later to rename that variable? You’ll never find the call again. But you could, if you evaluate the symbol instead.

In particular, we have an object invocation expression, which typically is a method call. The first ToString() call returns the text representation, as it is found in the source file: obj.SayHello(name).

And as part of the second step, we fetch the actual method that we’re about to invoke: RewritingWithRoslyn.Demo.B.SayHello(string). A-ha, that’s the actual meaning of our code.
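The rename problem mentioned above goes away when the check is made against the resolved symbol rather than the source text. Here’s a rough sketch of that idea, not the article’s actual implementation: the class name SymbolMatchingVisitor and the Matches list are invented for illustration, and matching on the simple names "SayHello" and "B" is an assumption that fits the demo class.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical variant of the visitor that matches on symbols, not on text.
public class SymbolMatchingVisitor : CSharpSyntaxWalker
{
    private readonly SemanticModel _model;

    // Collected method symbols of all matching invocations.
    public List<IMethodSymbol> Matches { get; } = new List<IMethodSymbol>();

    public SymbolMatchingVisitor(SemanticModel model)
    {
        _model = model;
    }

    public override void VisitInvocationExpression(InvocationExpressionSyntax node)
    {
        // Symbol is null when the call cannot be bound,
        // for example because references are missing.
        if (_model.GetSymbolInfo(node).Symbol is IMethodSymbol method
            && method.Name == "SayHello"
            && method.ContainingType.Name == "B")
        {
            Matches.Add(method);
        }

        base.VisitInvocationExpression(node);
    }
}
```

It is driven the same way as the visitor above: call Visit with the root node of the syntax tree, then print or inspect Matches. Renaming obj to anything else no longer affects the result, because the variable name never enters the comparison.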

The example is - very intentionally - using virtual/override calls, so you can experiment a bit:

Looking at the types used

One particularly interesting use case I’ve found so far is looking at the actual type involved in a particular expression. And that’s pretty straightforward, too, once you have a rough idea of what you’re looking for.

Simply put, with the code above, we can easily determine what arguments the call has:

public override void VisitInvocationExpression
  (InvocationExpressionSyntax node)
{
  if (node.ToString().StartsWith("obj.SayHello("))
  {
    Console.WriteLine(node.ArgumentList.Arguments[0]);
    Console.WriteLine(_model.GetTypeInfo(
      node.ArgumentList.Arguments[0].Expression).Type);
  }

  base.VisitInvocationExpression(node);
}

This outputs two things: the argument as it is written in the source (name) and its type (string).

Now if, instead of passing in a variable, we’d rather invoke obj.SayHello("X"), this output would change a little: instead of name, we’d see "X" (but it’s still a string).

And if you call obj.SayHello(3) instead, you’d have 3 and int on your console.

There are obvious limitations, however: You cannot determine a type more precisely than the C# compiler can. If you receive an object and pass it through to an underlying method, Roslyn doesn’t know more than “it’s an object”, and thus can’t provide you with more details.
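To make that limitation concrete, here’s a small sketch under the same assumptions as before (Microsoft.CodeAnalysis.CSharp installed, only the core library referenced). The snippet being analyzed and the names StaticTypeSketch, Outer, Inner and Caller are made up: a string is handed to a method that only declares an object parameter and passes it on.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical demo, not part of the original article's code.
public static class StaticTypeSketch
{
    // Returns the static type of the argument passed to Inner(...).
    public static string ArgumentType()
    {
        // A string travels through a parameter declared as object.
        const string source = @"
class C
{
    void Outer(object value) { Inner(value); }
    void Inner(object value) { }
    void Caller() { Outer(""X""); }
}";
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);
        CSharpCompilation compilation = CSharpCompilation.Create(
            "StaticTypeSketch",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        SemanticModel model = compilation.GetSemanticModel(tree);

        // In document order, the first invocation is Inner(value) inside Outer.
        InvocationExpressionSyntax call = tree.GetRoot()
            .DescendantNodes()
            .OfType<InvocationExpressionSyntax>()
            .First();

        ITypeSymbol type =
            model.GetTypeInfo(call.ArgumentList.Arguments[0].Expression).Type;
        return type?.ToString();
    }
}
```

Even though the only caller passes a string, the type reported for value inside Outer should be object, because that is all the declaration says.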

I mean, you could resolve it recursively…[1]

For my needs with one-off processes, that’s about it for how comprehensively I’ve had to understand Roslyn to work with it: Interpreting symbols, retrieving fully qualified names, and (if possible) rewriting them automatically.

You can view the full source code on GitHub.


  1. But that’s outside of the scope of this article. Good luck though!