Canonical list of syntax scope names?


#1

Is there an official / canonical list of syntax scope names for use in syntax highlighting?

What I mean: As an author of a syntax theme, is there a list of all the scope names I can expect to see mentioned within a language package? And, going the other way, as a language package author, is there a list of all the scope names I can expect syntax themes to bind to?

I can of course inspect the source of popular syntax themes and language packages [*] to get a rough idea. However, I’m hoping there’s a document somewhere that is “arm’s length” from both sides and could be consulted as an authoritative source of information.

Help? Thanks.

[*] And as a “historical” source, there’s always the salient TextMate doc, http://manual.macromates.com/en/language_grammars.


#2

Even the TextMate manual was more like guidelines than a canonical list. It would be good for the Atom community to publish some guidelines though, to be sure.


#3

No argument! I didn’t mean to suggest that TextMate had the be-all end-all of docs in this regard.


#4

Many, if not most, of the Atom language packages (both official and third-party) are generated directly from TextMate grammars too. (Including my language-r package :grinning:)

Maybe we could start the discussion here? Is there anything you would like to add/subtract/change to/from/about the TextMate doc?


#5

As context: I have very limited experience building language grammars — I’ve built one very hacky grammar for SublimeText (which I’ve mechanically converted for Atom) — and I have effectively zero experience building syntax themes. It’s my desire to up my game on both fronts that led me to ask the question. In particular, I’m simultaneously trying to build a language grammar along with a syntax theme which I hope will be compatible with it as well as a decent-sized handful of the usual suspects. I’m hoping that I don’t have to spend an inordinate amount of my time looking at dozens of themes and grammars in order to figure out what tags to emit (for the grammar) and expect (for the theme). I especially hope that I don’t have to do this, only to find out later that there is an as-yet hidden consensus among the core developers that something major is or is about to change, negating all that time I just spent trying to figure out the status quo.

Is there anything you would like to add/subtract/change to/from/about the TextMate doc?

The TextMate doc is okay to impart a flavor for what to expect, however it’s far from complete. I have no specific suggestions in terms of actual detail; I don’t know the territory well enough yet. What I think I’d suggest is just: More detail, and fewer weasel words.

It’s great that the syntax markup system is set up in the classic “mechanism, not policy” style. But in terms of Atom as a community, policy is essential, to maximize the impact of both syntax themes (which will be able to succeed in styling more languages) and language grammars (which will be compatible with more syntax themes).


#6

Well, that’s the thing. Atom is inheriting a legacy from TextMate of a huge number of language grammars that had what you cited (or less) to guide them. So the question is, do we make things more rigorous … with the cost of attempting to force grammars that already exist to conform? Or do we keep things open like they are … and put the burden on the folks like you in the form of a large learning curve?


#7

I understand that Atom has inherited a poorly-specified legacy, filled with the evidence of many ad-hoc decisions, a fair amount of cargo-cult code, and so on. And despite the mess, it all has value worth nurturing. I am not trying to suggest that anyone gratuitously break old grammars or syntax themes.

What I’m suggesting is that, as a way to move forward, it would be beneficial to define a more rigorous policy / standard / API for syntax markup. This policy would have a lot in the way of SHOULDs and MUSTs and MUST NOTs, which could guide both new development (new themes, new grammars) and also help inform authors as they evolve legacy packages towards a more explicit common consensus.

But details notwithstanding, what I really want to understand at this point is whether this kind of documentation is something that would be embraced by the core Atom team or just be considered pointless noise. I’m happy to contribute some of my time trying to codify whatever seem to be the dominant existing practices / the de facto API, if it wouldn’t be wasted effort. (To be clear, I would have no problem if this isn’t something the core Atom team wants; different folks have different priorities and all that.)

More concretely, if I wanted to put together an inaugural PR to start this document, where (what path on what repo) is the best place for it to go?


#8

I think the answers to your questions are:


#9

If you are still interested, you can find here an exhaustive list of scope names extracted from the current core modules’ grammars.


#10

Interesting. May I ask - how did you get to this list? Is there a command which can be executed within the console?

Then… this list would be as accurate as the syntax groups installed with Atom on someone’s PC.

Thanks for the share.


#11

Very nice.


#12

This is not as simple as a keypress, but not too complicated either: you will need the copy-find-result package.

  1. Open the core node_modules folder with Atom
  2. Find all in project (Ctrl+Shift+F) RegEx "name":"[^"]*" under directory pattern */grammars/*.json
  3. Copy-find-result:Text into a new editor pane
  4. Replace all in buffer (Ctrl+F) RegEx ^.*"name":"([^"]*)".*$ with $1
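  5. Replace all in buffer (Ctrl+F) Case Sensitive RegEx ^(.(?!$))*[^0-9a-z.-].*$ with empty string (this blanks out every line containing a character other than lowercase letters, digits, dots, and hyphens, i.e. anything that is not a plain scope name).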
  6. Edit / Lines / Sort
  7. Edit / Lines / Unique
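
For anyone who would rather script it, the steps above can be sketched roughly in Python. This is a minimal sketch under some assumptions, not the method actually used to build the list: the function names `iter_names` and `collect_scope_names` are made up for illustration, and it assumes the grammar files are plain JSON. Instead of running a regex over the raw text, it parses each file and collects every "name" value, then keeps only names made of lowercase letters, digits, dots, and hyphens, as in step 5.

```python
import json
import re
from pathlib import Path

# Same character set as step 5: lowercase letters, digits, dots, hyphens.
SCOPE_RE = re.compile(r"[0-9a-z.-]+")


def iter_names(node):
    """Yield every string stored under a "name" key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "name" and isinstance(value, str):
                yield value
            else:
                yield from iter_names(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_names(item)


def collect_scope_names(node_modules):
    """Return sorted, de-duplicated scope names found under */grammars/*.json."""
    names = set()
    for path in Path(node_modules).glob("*/grammars/*.json"):
        try:
            grammar = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # skip unreadable or malformed files
        # Drop anything that is not a plain scope name (e.g. "JavaScript").
        names.update(n for n in iter_names(grammar) if SCOPE_RE.fullmatch(n))
    return sorted(names)
```

Calling `collect_scope_names(...)` with the path to the core node_modules folder returns the list; print it one name per line to get the same shape as the editor-based result. As with the manual steps, a "name" value that combines several scopes separated by spaces is dropped by the filter.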

#13

Where do you find this?


#14

Windows: %localappdata%\Atom\app-[VERSION]\resources\app\node_modules
(the folder is named "Atom x64" instead of "Atom" if you are using the x64 version)
I assume it’s similar on other platforms.

This used to be stored inside an asar but we got rid of that in 1.17.