Custom grammar based on external tokenizer


Despite reading "Syntax highlighting using existing tokenizer" (14927) and similar posts, I’m still at a loss as to how to create an Atom grammar based on an external tokenizer.

Specifically, I’d like to use the same internal tooling that Xcode uses to highlight Swift code from within Atom, which should give more accurate results than are possible with CSON grammar specifications.

I’ve built a command-line tool to expose this tokenization, but now I’d like to hook it up to Atom. For reference, the command-line tool can be found here:

The tokens are returned in the simplest form necessary for syntax highlighting: offset, length & type.
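
To make that token shape concrete, here is a minimal sketch of turning offset/length/type triples into row/column ranges, which is the coordinate form Atom uses for buffer positions. The names Token, TokenRange, lineStarts, toPoint and toRanges are hypothetical and not part of the tool or of Atom. The sketch also assumes offsets are counted in the same units as JavaScript string indices; if the tool reports UTF-8 byte offsets, they would need converting first.

```typescript
// Hypothetical shape of one token as returned by the command-line tool.
interface Token {
  offset: number; // start position in the source text
  length: number; // number of units the token covers
  type: string;   // e.g. "keyword", "identifier"
}

interface Point { row: number; column: number; }
interface TokenRange { start: Point; end: Point; type: string; }

// Collect the offset at which each line of the source begins.
function lineStarts(source: string): number[] {
  const starts = [0];
  for (let i = 0; i < source.length; i++) {
    if (source[i] === "\n") starts.push(i + 1);
  }
  return starts;
}

// Binary search for the last line start that is <= offset.
function toPoint(starts: number[], offset: number): Point {
  let lo = 0;
  let hi = starts.length - 1;
  while (lo < hi) {
    const mid = (lo + hi + 1) >> 1;
    if (starts[mid] <= offset) lo = mid; else hi = mid - 1;
  }
  return { row: lo, column: offset - starts[lo] };
}

// Map every token to a start/end point pair, keeping its type.
function toRanges(source: string, tokens: Token[]): TokenRange[] {
  const starts = lineStarts(source);
  return tokens.map(t => ({
    start: toPoint(starts, t.offset),
    end: toPoint(starts, t.offset + t.length),
    type: t.type,
  }));
}
```

Ranges in this form could then be handed to whatever highlighting mechanism a package ends up using; how to feed them into Atom itself is the open question here.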

I’m sorry if this is too vague a request. It feels like it should be quite simple to do given better knowledge of the inner workings of Atom grammars, but it’s a bit outside my reach.

Thanks for reading,
A hopeful package builder


Atom really isn’t designed to support external tokenizers. There are a couple of packages that do it (the atom-typescript package comes to mind), but they have to do some fairly hacky things to get it done. You might want to look at the atom-typescript package to see how they do it and model your approach on theirs.


Atom really isn’t designed to support external tokenizers.

That’s definitely the vibe I’m getting as well. I dug through the atom-typescript package a bit yesterday. It does a lot more than tokenizing, and it’s written in TypeScript itself, so it’s a bit too dense for me to distill 1) how it’s doing its tokenizing and 2) how I’d even go about running it locally to debug it and adapt it to what I need.

But that’s probably my best bet at understanding the underlying grammar toolchain, so I’ll continue to attempt to grok it.

Thanks for your response!


Would anyone happen to know of other (hopefully simpler) Atom packages that generate grammars from external tokenizers? I’m still very interested in building something like this, should I come to understand how this works :grimacing: