R Markdown Syntax Highlighting
Nov 11, 2017
Curtis Alexander

Prior art

In the past I explored using the highr package to enable syntax highlighting for languages that R Markdown does not support. I never loved that solution because it required downloading and setting up highlight. It also required manual steps at the command line to include the appropriate CSS file.

What is the problem again?

Repeating my original problem — I want to have syntax highlighting for SAS code within my R Markdown documents.

R Markdown relies on pandoc for its syntax highlighting. Specifically, the skylighting Haskell library is utilized by pandoc for its highlighting. The list of languages that are supported can be found here.

Courtesy RStudio


RStudio provides the nice illustration above describing how one begins with an R Markdown document and ends with an HTML document. pandoc is what ultimately performs syntax highlighting using the aforementioned skylighting library.

The answer…CodeMirror

Rather than relying on pandoc, I found a way to take advantage of the JavaScript text editor CodeMirror when producing HTML documents. I simply set each CodeMirror editor to read-only, so the editor itself serves as a form of syntax highlighting.

Aside: In the end the answer always seems to be JavaScript. For many reasons, I have avoided learning JavaScript. But for my use case here, I must admit that JavaScript is quite helpful!

R Markdown Example

To go straight to an example, take a look at either the R Markdown document or the rendered html. This repo will be utilized throughout as an example.

R Markdown Setup

There are quite a few steps required but once they are all stitched together, they allow for SAS (and other language) syntax highlighting that wasn’t possible before.

Requirements

The following R packages are required for the proposed solution.

  • stringr
  • htmltools

In addition, one will need access to the internet as the solution makes use of <script> tags to read in JavaScript libraries from a CDN.

CodeMirror JavaScript/CSS

CodeMirror provides a stylesheet and JavaScript files that need to be included. My preference is to create a header.html file and then make sure it is included within my YAML header.

Below is an example of the YAML header for rmarkdown-with-alt-langs. Specifically take note of the in_header key.

title: "R Markdown with Alternate Languages"
author: "Curtis Alexander"
output:
  html_document:
    includes:
      in_header: header.html
    mathjax: null
params:
  hilang:
    - sas
    - crystal

The file header.html contains the following. Note that I used links to a CDN rather than directly pointing to CodeMirror.

<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.31.0/codemirror.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.31.0/codemirror.min.js"></script>
<style>
  .CodeMirror {
    border: 1px solid #eee;
    height: auto;
  }
</style>

R Markdown Parameter

Within the YAML block above, the hilang parameter sets the language or languages to highlight within the document. For a full list of all languages that can be utilized, take a look at the CodeMirror languages page.

_hilang_setup.Rmd

Next, one needs to create a knitr source hook in order to wrap up the SAS code in <textarea> tags. The file _hilang_setup.Rmd contains two R code chunks, including the actual source hook.

The first R code chunk pulls in the JavaScript that CodeMirror needs for each language set within the YAML block. Note that this chunk requires the results='asis' option to be set.

# take a character vector of parameters and inject
#   the appropriate script tag for code mirror
# ensures that the script tags are only inserted once
for(i in seq_along(params$hilang)) {
  js_mode <- paste0("\n<script src=\"https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.31.0/mode/", tolower(params$hilang[i]), "/", tolower(params$hilang[i]), ".min.js\"></script>\n")
  cat(htmltools::htmlPreserve(js_mode))
}
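
To make the URL convention behind that loop explicit, here is a minimal JavaScript sketch of the same string building. The helper names `modeScriptTag` and `modeScriptTags` are hypothetical and not part of the repo; the base URL and the `mode/<name>/<name>.min.js` pattern are copied from the R chunk. The de-duplication step is my own assumption about what "inserted once" should mean if a language were listed twice.

```javascript
// Base URL copied from the R chunk above.
const CDN_BASE = "https://cdnjs.cloudflare.com/ajax/libs/codemirror/5.31.0";

// Hypothetical helper: build the <script> tag for one CodeMirror mode.
function modeScriptTag(lang) {
  const name = lang.toLowerCase();
  return `<script src="${CDN_BASE}/mode/${name}/${name}.min.js"></script>`;
}

// Hypothetical helper: one tag per distinct language, in first-seen order.
// (Assumption: duplicates in the list should yield a single tag.)
function modeScriptTags(langs) {
  const unique = [...new Set(langs.map((lang) => lang.toLowerCase()))];
  return unique.map(modeScriptTag);
}

// Prints the two tags that correspond to hilang: [sas, crystal].
console.log(modeScriptTags(["sas", "crystal"]).join("\n"));
```

The pattern assumes the mode's directory and file share the language name, which is the case for the `sas` and `crystal` modes used here.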

The second R code chunk sets the actual source hook.

knitr::knit_hooks$set(source = function(x, options) {
  if (!is.null(options$hilang)) {
    textarea_id <- paste(sample(LETTERS, 5), collapse = "")
    code_open <- paste0("\n\n<textarea id=\"", textarea_id, "\">\n")
    code_close <- "\n</textarea>"
    jscript_editor <- paste0("\n<script> var codeElement = document.getElementById(\"", textarea_id, "\"); var editor = null; if (null != codeElement) { editor = CodeMirror.fromTextArea(codeElement, { lineNumbers: true, readOnly: true, viewportMargin: Infinity, mode: 'text/x-", tolower(options$hilang), "' }); } </script>\n")

    # if the option from_file is set to true then assume that
    #   whatever is in the code chunk is a file path
    if (!is.null(options$from_file) && options$from_file) {
      code_body <- readLines(file.path(x))
    } else {
      code_body <- x
    }

    knitr::asis_output(
      htmltools::htmlPreserve(
        stringr::str_c(
          code_open,
          paste(code_body, collapse = "\n"),
          code_close,
          jscript_editor
        )
      )
    )
  } else {
    stringr::str_c("\n\n```", tolower(options$engine), "\n", paste0(x, collapse = "\n"), "\n```\n\n")
  }
})

The source hook doesn’t have to live in a separate file. However, I like to keep it in one so that each R Markdown document only needs the following chunk to bring the hook into scope.

```{r, child="_hilang_setup.Rmd"}
```

SAS Code

Next is the easy part — just drop the SAS code into an R Markdown code chunk. A code chunk needs the following options to be set — eval=FALSE and hilang="sas". Below is a simple example.

```{r, eval=FALSE, hilang="sas"}
data _null_;
  x = 1;
  put x=;
run;
```

The source hook established above wraps the SAS code in <textarea> tags and then applies CodeMirror’s JavaScript to the entire block of text within the <textarea>.
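
The script that the hook injects after each <textarea> is packed into a single R string, so it may be easier to follow when expanded. The sketch below restates that logic in plain JavaScript. The function names `mimeFor`, `readOnlyOptions`, and `attachEditor` are hypothetical, and passing `doc` and `cm` as arguments is my own choice so that the logic can be exercised outside a browser; the option values themselves are the ones from the hook.

```javascript
// Hypothetical helper: the MIME type the hook passes as `mode`.
function mimeFor(lang) {
  return "text/x-" + lang.toLowerCase();
}

// Hypothetical helper: the options object used by the hook's inline script.
function readOnlyOptions(lang) {
  return {
    lineNumbers: true,        // show a line-number gutter
    readOnly: true,           // display only; the reader cannot edit the code
    viewportMargin: Infinity, // render the whole document, not just the visible part
    mode: mimeFor(lang),
  };
}

// Same steps as the inline script: look up the textarea, then hand it to
// CodeMirror. `doc` stands in for `document` and `cm` for the global CodeMirror.
function attachEditor(doc, cm, textareaId, lang) {
  const codeElement = doc.getElementById(textareaId);
  if (codeElement == null) {
    return null; // no matching <textarea>, so there is nothing to highlight
  }
  return cm.fromTextArea(codeElement, readOnlyOptions(lang));
}
```

In the browser this would be called as `attachEditor(document, CodeMirror, id, "sas")`, where `id` is whatever identifier the hook generated for that chunk. Note that the `text/x-` prefix is a convention that works for `sas` and `crystal`; some CodeMirror modes register MIME types that do not follow it (JavaScript uses `text/javascript`, for example), so those languages would need a lookup instead.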

Full Example

A full example follows — assuming one makes use of header.html and _hilang_setup.Rmd.

---
title: "R Markdown with Alternate Languages"
author: "Curtis Alexander"
output:
  html_document:
    includes:
      in_header: header.html
    mathjax: null
params:
  hilang:
    - sas
    - crystal
---

The purpose of this document is to demonstrate that an R Markdown document can display syntax highlighting for languages that are not supported natively by R Markdown. Highlighting is actually supported by [pandoc](https://pandoc.org/) and the [skylighting](https://hackage.haskell.org/package/skylighting) Haskell library. A full list of languages supported by pandoc can be found [here](https://github.com/jgm/skylighting/tree/master/xml).

The original reason for working through this was to enable [SAS](https://www.sas.com) syntax highlighting within an R Markdown html document.

For a full description of how this is accomplished and stitched together, refer to my [blog post](https://www.calex.org/blog/r-markdown-syntax-highlighting/).

## Requirements

The following R packages are required for the proposed solution.

* [stringr](https://cran.r-project.org/web/packages/stringr/index.html)
* [htmltools](https://cran.r-project.org/web/packages/htmltools/)

In addition, one will need access to the internet as the solution makes use of `<script>` tags to read in Javascript libraries from a [CDN](https://www.cloudflare.com/cdn/).

## Languages

### [SAS](https://www.sas.com)

Below is an example of highlighting SAS code. The code was taken from the [CodeMirror](https://codemirror.net/mode/sas/index.html) example page.

```{r, child="_hilang_setup.Rmd"}
```

```{r, eval=FALSE, hilang="sas"}
libname foo "/tmp/foobar";
%let count=1;
/* Multi line
Comment
*/
data _null_;
  x=ranuni();
  * single comment;
  x2=x**2;
  sx=sqrt(x);
  if x=x2 then put "x must be 1";
  else do;
    put x=;
  end;
run;
/* embedded comment
* comment;
*/
proc glm data=sashelp.class;
  class sex;
  model weight = height sex;
run;
proc sql;
  select count(*)
  from sashelp.class;
  create table foo as
  select * from sashelp.class;
  select *
  from foo;
quit;
```

<br/><br/>

### [Crystal](https://crystal-lang.org/)

Below is an example of highlighting Crystal code. The code was taken from the [CodeMirror](https://codemirror.net/mode/crystal/index.html) example page.

```{r, eval=FALSE, hilang="crystal"}
# Features of Crystal
# - Ruby-inspired syntax.
# - Statically type-checked but without having to specify the type of variables or method arguments.
# - Be able to call C code by writing bindings to it in Crystal.
# - Have compile-time evaluation and generation of code, to avoid boilerplate code.
# - Compile to efficient native code.
# A very basic HTTP server
require "http/server"
server = HTTP::Server.new(8080) do |request|
  HTTP::Response.ok "text/plain", "Hello world, got #{request.path}!"
end
puts "Listening on http://0.0.0.0:8080"
server.listen
module Foo
  abstract def abstract_method : String
  @[AlwaysInline]
  def with_foofoo
    with Foo.new(self) yield
  end
  struct Foo
    def initialize(@foo : ::Foo)
    end
    def hello_world
      @foo.abstract_method
    end
  end
end
class Bar
  include Foo
  @@foobar = 12345
  def initialize(@bar : Int32)
  end
  macro alias_method(name, method)
    def {{ name }}(*args)
      {{ method }}(*args)
    end
  end
  def a_method
    "Hello, World"
  end
  alias_method abstract_method, a_method
  def show_instance_vars : Nil
    {% for var in @type.instance_vars %}
      puts "@{{ var }} = #{ @{{ var }} }"
    {% end %}
  end
end
class Baz < Bar; end
lib LibC
  fun c_puts = "puts"(str : Char*) : Int
end
baz = Baz.new(100)
baz.show_instance_vars
baz.with_foofoo do
  LibC.c_puts hello_world
end
```
<br/><br/>

Future Work

In the future, I would like to consider:

  • Creating my own custom output format
  • Creating an actual package with everything bound up that can be conveniently installed
  • Further parameterizing the configuration options that are passed to the CodeMirror.fromTextArea JavaScript function
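
As a rough idea of what parameterizing the CodeMirror.fromTextArea options could look like on the JavaScript side, here is a minimal sketch. It is not something the repo implements: `DEFAULT_OPTIONS` and `editorOptions` are hypothetical names, and how per-chunk overrides would travel from a knitr chunk option into the page is left open. The default values are the ones the source hook currently hard-codes.

```javascript
// Defaults copied from the inline script in the source hook.
const DEFAULT_OPTIONS = Object.freeze({
  lineNumbers: true,
  readOnly: true,
  viewportMargin: Infinity,
});

// Hypothetical helper: merge defaults, the language mode, and overrides.
// Later spreads win, so an override replaces the default with the same key.
function editorOptions(lang, overrides = {}) {
  return {
    ...DEFAULT_OPTIONS,
    mode: "text/x-" + lang.toLowerCase(),
    ...overrides,
  };
}

// Example: hide line numbers and wrap long lines for one chunk.
console.log(editorOptions("sas", { lineNumbers: false, lineWrapping: true }));
```

The resulting object would be passed as the second argument to CodeMirror.fromTextArea in place of the fixed options. Because overrides are applied last, a chunk could even replace `mode`, which would help for languages whose MIME type does not follow the `text/x-` pattern.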