MarkdownDeep Syntax Highlighting

Tuesday, March 29th, 2011     #markdowndeep #everything

MarkdownDeep now provides the hooks needed to inject your own syntax highlighting into rendered code blocks.

MarkdownDeep itself doesn't do syntax highlighting. In my case all I wanted was to insert the HTML attributes that enable Prettify, so the ability to hook the rendering of code blocks was all that was needed:

Func<Markdown, string, string> FormatCodeBlock
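A function delegate that, when set, is called to format each code block. It receives the Markdown instance and the text of the code block, and returns the HTML to emit.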

Here's the implementation I use in Jab to inject the required PrettyPrint attributes:

public static Regex rxExtractLanguage = new Regex("^({{(.+)}}[\r\n])", RegexOptions.Compiled);
public static string FormatCodePrettyPrint(MarkdownDeep.Markdown m, string code)
{
    // Try to extract the language from the first line
    var match = rxExtractLanguage.Match(code);
    string language = null;

    if (match.Success)
    {
        // Save the language (second capture group: the text between the braces)
        language = match.Groups[2].Value;

        // Remove the first line
        code = code.Substring(match.Groups[1].Length);
    }

    // If not specified, look for a link definition called "default_syntax" and
    // grab the language from its title
    if (language == null)
    {
        var d = m.GetLinkDefinition("default_syntax");
        if (d != null)
            language = d.title;
    }

    // Common replacements
    if (language == "C#")
        language = "csharp";
    if (language == "C++")
        language = "cpp";

    // Wrap code in pre/code tags and add PrettyPrint attributes if necessary
    if (string.IsNullOrEmpty(language))
        return string.Format("<pre><code>{0}</code></pre>\n", code);
    else
        return string.Format("<pre class=\"prettyprint lang-{0}\"><code>{1}</code></pre>\n", 
                            language.ToLowerInvariant(), code);
}

How does this work?

  1. MarkdownDeep calls out to this method whenever it renders a code block
  2. First we run a simple regex to look at the first line and try to extract the language name wrapped in double curly braces.
  3. If we don't find that, we look for a markdown link definition that specifies a default language to use. (see below)
  4. We do some simple manipulation for common language name variations
  5. Render the <pre><code> block with the required PrettyPrint attributes.

So to specify the language to use for syntax highlighting, just prefix the code block with the language name enclosed in double curly braces. (Note that any white space after the braces disables this functionality, which is how I escaped the example below.)

    {{C#}} 
    for (int i=0; i<100; i++)
    {
        Console.WriteLine("blah");
    }
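The extraction step and the whitespace escape can be tried in isolation, without MarkdownDeep. This is only a minimal sketch: it assumes the same regex as the implementation above, and `LanguagePrefixDemo` and `TryExtractLanguage` are hypothetical names made up for illustration, not part of the library.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LanguagePrefixDemo
{
    // Same pattern as rxExtractLanguage in the implementation above
    static readonly Regex rxExtractLanguage = new Regex("^({{(.+)}}[\r\n])");

    // Hypothetical helper: returns the language name, or null if the code
    // has no {{...}} prefix. 'rest' receives the code without the prefix line.
    public static string TryExtractLanguage(string code, out string rest)
    {
        var match = rxExtractLanguage.Match(code);
        if (!match.Success)
        {
            rest = code;
            return null;
        }

        rest = code.Substring(match.Groups[1].Length);
        return match.Groups[2].Value;
    }

    public static void Main()
    {
        string rest;

        // Braces immediately followed by a line break: the language is extracted
        var lang = TryExtractLanguage("{{C#}}\nint x = 1;\n", out rest);
        Console.WriteLine(lang ?? "(none)");

        // Trailing space after the braces: no match, the line stays in the code
        lang = TryExtractLanguage("{{C#}} \nint x = 1;\n", out rest);
        Console.WriteLine(lang ?? "(none)");
    }
}
```

In the second call the regex fails because the closing braces are followed by a space rather than a line break, so the prefix line is left in the code and rendered as-is.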

Alternatively, you can use a specially named link definition to declare the default language for all code blocks on a page:

    for (int i=0; i<100; i++)
    {
        Console.WriteLine("blah");
    }

[default_syntax]:. "C#"
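To see how the page default and an explicit prefix interact, here is a rough standalone sketch of the fallback logic. It assumes the same rules as the implementation above; `DefaultSyntaxDemo` and `Render` are hypothetical names, and the `defaultLanguage` parameter simply stands in for the title of the "default_syntax" link definition.

```csharp
using System;

public static class DefaultSyntaxDemo
{
    // Hypothetical helper mirroring the tail of FormatCodePrettyPrint.
    // 'explicitLanguage' is the name taken from a {{...}} prefix (or null);
    // 'defaultLanguage' stands in for the "default_syntax" link title (or null).
    public static string Render(string explicitLanguage, string defaultLanguage, string code)
    {
        // An explicit prefix wins; otherwise fall back to the page default
        var language = explicitLanguage ?? defaultLanguage;

        // Common replacements
        if (language == "C#")
            language = "csharp";
        if (language == "C++")
            language = "cpp";

        if (string.IsNullOrEmpty(language))
            return string.Format("<pre><code>{0}</code></pre>\n", code);

        return string.Format("<pre class=\"prettyprint lang-{0}\"><code>{1}</code></pre>\n",
                             language.ToLowerInvariant(), code);
    }

    public static void Main()
    {
        // No prefix on the block, page default is "C#"
        Console.Write(Render(null, "C#", "int x = 1;\n"));

        // An explicit prefix overrides the page default
        Console.Write(Render("C++", "C#", "int x = 1;\n"));

        // Neither is present: plain pre/code without the prettyprint class
        Console.Write(Render(null, null, "int x = 1;\n"));
    }
}
```

With a page default of "C#", a block without a prefix ends up with the class `prettyprint lang-csharp`, while a block prefixed with `{{C++}}` gets `prettyprint lang-cpp`.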

Pretty cool eh?

These improvements are available now on GitHub and in the latest NuGet packages.
