Getting started creating an XML-based language superset and editing interface



I am a programmer who writes a lot of ARM assembly, and am very accustomed to the interface provided by a program called The Interactive Disassembler (aka IDA). This is what it looks like:

If you couldn’t tell from the picture, all of the comments (text preceded by an @ symbol) are auto-generated by the program. In fact, the instructions are too; everything that is displayed in the viewport is stored in a superset of GNU assembly and is generated at runtime. This provides incredible malleability over what is rendered to the output.

What I want to do is create an XML spec for my own superset of assembly that does some of the things IDA does, and then create an Atom package that not only renders that output to a TextEditor instance, but also provides different submenus in Atom’s right-click context menu, depending on which item you clicked. In other words, I do not want to see any XML in the TextEditor instance, only dynamically rendered ASM.

I have a few questions about this:

  1. How can I get started with creating an Atom package? I have worked with themes before, but have no experience with functional mods.
  2. What parts of Atom’s API do I need to manipulate to execute this? I’ve browsed enough to find the `TextEditor` class, and know that it is probably what I wish to be changing in my package.
  3. How difficult will it be to bypass Atom’s behaviour of displaying raw text? What about creating my additions to the default context menu?
  4. Will I need to do this in CoffeeScript, or may I use JavaScript instead?


The Flight Manual and the API documentation should give you a general idea, and either Package Generator: Generate Package or apm init --package will create the basic files needed for a package. You can also look at code from other packages to see how things should be done (it would be difficult to explain everything in one comment :wink:).

You can use Javascript, if you want.

I never worked with this IDA thingy, so correct me if I’m wrong here: You want to create a package that …

… provides a command (ida:render-assembly or something) that …
… opens a read-only(?) text editor, …
… fills it with the text I see in your screenshot by parsing an XML file(?), …
… highlights the rendered assembly code properly and …
… provides a few other commands in the context menu.

Well … many parts.

  • atom.commands.add provides a command
  • and atom.workspace.addOpener displays the text editor.
  • For syntax highlighting, you’d have to roll your own grammar (maybe someone already created one for assembly?).
  • The ROM:ADDRESS part of each line should be moved into the editor’s gutter.
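
As a rough illustration of the first bullet, here is a minimal sketch of registering a command with atom.commands.add. The command name `asmx:render-assembly` and the `renderAssembly` callback are hypothetical placeholders, not anything defined in this thread. The `atomEnv` parameter stands for Atom's global `atom` object; it is passed in explicitly only so the function can be tried outside Atom.

```javascript
// Sketch only: 'asmx:render-assembly' and renderAssembly are hypothetical names.
// `atomEnv` is Atom's global `atom` object, passed in so this can run outside Atom.
function registerRenderCommand(atomEnv, renderAssembly) {
  // atom.commands.add(target, commandName, callback) returns a Disposable
  // that unregisters the command when disposed.
  return atomEnv.commands.add('atom-workspace', 'asmx:render-assembly', () => {
    const editor = atomEnv.workspace.getActiveTextEditor();
    if (editor) {
      renderAssembly(editor);
    }
  });
}
```

Inside a package's `activate()` you would typically call `registerRenderCommand(atom, ...)` and keep the returned Disposable so it can be disposed in `deactivate()`.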

There are many ways to do this.
You could use atom.workspace.addOpener to prepare a TextEditor before displaying it. The only problem is that some parts are undocumented.

See here. The default packages generated by package-generator and apm init have a menu sub-folder with a CSON file that adds items to the context menu. If you need to create context menu entries dynamically, use atom.contextMenu.add.
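
For the dynamic case, a minimal sketch of atom.contextMenu.add might look like the following. The labels and the `asmx:*` command names are made up for illustration, and the commands would still have to be registered separately. The top-level keys are CSS selectors, so a narrower selector than `atom-text-editor` could in principle give different menus for different elements, though I have not verified which selectors work well inside the editor.

```javascript
// Sketch only: labels and 'asmx:*' command names are hypothetical.
// `atomEnv` is Atom's global `atom` object, passed in so this can run outside Atom.
function addAsmxContextMenu(atomEnv) {
  // atom.contextMenu.add takes an object keyed by CSS selector and
  // returns a Disposable that removes the added items.
  return atomEnv.contextMenu.add({
    'atom-text-editor': [
      {
        label: 'ASMX',
        submenu: [
          { label: 'Rename Subroutine', command: 'asmx:rename-subroutine' },
          { label: 'Insert Item', command: 'asmx:insert-item' },
        ],
      },
    ],
  });
}
```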

If there are any questions (and I am sure there are), feel free to ask them. Keep in mind that this is going to be a bit more complicated than the word count package in the flight manual and I can only give you a few links and API calls or maybe a code snippet.


I’m a noob and I can’t really be very helpful, but this is along my interests regarding Atom (both in terms of package design and the bits that aren’t sufficiently documented yet), so I’d like to follow along.


Sorry that I wasn’t more clear about this, I will expand.

Rather than providing a disassembly like IDA does, I want to provide an assembly that is much nicer to look at and work with than plain text. I will note some differences as I go along. The reason I brought IDA into the discussion was to illustrate how beautifully it renders ASM code, which is what I would like to do with this package.

Some parts would be read-only (such as comment headers and pieces that should be edited carefully) while other parts should be open to typing, such as blocks of code. Upon hitting return, each line of code could also be checked to verify that it makes sense. I may not implement that right away, though, as it requires creating checks for more than just dialect-agnostic grammar. I would like this package to be able to work with more than just a set of specific dialects of assembly, and possibly even with other assemblers besides as. That said, I do want to implement validation for what dialect-agnostic syntax is there, so it can be parsed properly into the XML file.

Yes, put simply. Commands could be issued to change the names of subroutines (aka functions), add new items into the assembly, etc.

I’m the maintainer of probably the least-broken language grammar package for ARM assembly, which I created mostly because the only other ARM ASM package was riddled with holes and bugs with highlighting, and I didn’t feel like opening a pullreq and waiting for its maintainer to approve it and whatnot. :stuck_out_tongue:

Integrating this package with that should be easy enough.

There will be a suite of commands that this package provides for writing ASM, actually. I want to provide a couple submenus inside the context menu that have a slew of things to do, and also… I should provide keyboard shortcuts as well. To avoid usage collisions as best I can, I was thinking of having my own namespace of keyboard shortcuts accessed by pressing a certain combination, followed by whatever I want. Can that be done without triggering anything while inside the state of my “namespaced” master keyboard shortcut?

I would like to trigger this special rendering system only when I open files with the .asmx extension. What part of the API should I look into to do that?

As I mentioned before, I maintain a package for ARM assembly. For non-ARM ASM, such as x86, MIPS, or PPC… I presume the other grammar packages that exist for those dialects should work.

Oh, that would only exist in a disassembly, just by the nature of the thing. Since this package works with regular assemblies, there’s no way for the editor to know what address things are going to be fitted into during compilation.

That said, it would be helpful to provide function offsets in the gutter instead of absolute addresses. Knowing how many bytes you are from some relative landmark can be helpful, so I may need to look into how to manipulate the gutter anyway. I don’t want to chuck the line number, though!

Oh, and one more thing: How difficult will it be to research what’s undocumented in Atom? Its developers comment well, right?


Yes. You can have your package’s keys stored in a separate keymap file and pull it in when you want the user to use those bindings. Only files in the keymaps directory get loaded automatically, so a keymap located somewhere else could be called and dismissed on demand without affecting any other part of the package.

You could use an activationHook in your package.json, as described in the Word Count example in the Flight Manual.


Atom’s TextEditor, by default, is WYSIWYG and writeable. A partially read-only text editor with comments that should not exist on disk is not officially supported (but it’s not impossible either).

I assume the assembler reports compilation errors with line numbers, so you could work with the Linter package to display those errors.

You could move dialect-specific code to other packages via services, but, for the time being, you should focus on creating a working version for as first.

Atom supports multi-key bindings. See the Flight Manual.
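
To make the "namespaced" shortcut idea concrete, here is a small sketch using atom.keymaps.add with multi-keystroke bindings, where keystrokes in a sequence are separated by a space. The prefix `ctrl-alt-a`, the source name `asmx-keymap`, and the `asmx:*` commands are all assumptions picked for illustration; check that the prefix does not collide with bindings you already use.

```javascript
// Sketch only: the 'ctrl-alt-a' prefix and 'asmx:*' commands are hypothetical.
// `atomEnv` is Atom's global `atom` object, passed in so this can run outside Atom.
function addAsmxKeymap(atomEnv) {
  // atom.keymaps.add(source, bindingsBySelector) returns a Disposable;
  // disposing it removes these bindings again.
  return atomEnv.keymaps.add('asmx-keymap', {
    'atom-text-editor': {
      // Press ctrl-alt-a, release, then press the second key.
      'ctrl-alt-a r': 'asmx:rename-subroutine',
      'ctrl-alt-a i': 'asmx:insert-item',
    },
  });
}
```

Keeping the returned Disposable around gives you the "call and dismiss on demand" behaviour: add the bindings when an .asmx editor becomes active and dispose them when it is no longer needed.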

I take back my idea to use atom.workspace.addOpener (this solution only made sense in my original, read-only approach). Instead, use atom.workspace.observeTextEditors to filter out TextEditor instances of asmx files. Something like

atom.workspace.observeTextEditors (editor) ->
  if editor.getPath()?.endsWith '.asmx'
    console.log "asmx editor: #{editor.getPath()}"  # placeholder; set up the custom rendering here

You can use Try CoffeeScript to get the JavaScript version of this code (but these 3 lines are probably easy to understand).
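
For reference, a hand-written JavaScript rendering of the same idea could look like this. The `onAsmxEditor` callback is a hypothetical hook for whatever setup the package needs, and `atomEnv` again stands for the global `atom` object.

```javascript
// Sketch only: onAsmxEditor is a hypothetical callback supplied by the package.
// `atomEnv` is Atom's global `atom` object, passed in so this can run outside Atom.
function observeAsmxEditors(atomEnv, onAsmxEditor) {
  // observeTextEditors calls back for every existing and future TextEditor
  // and returns a Disposable that stops the observation.
  return atomEnv.workspace.observeTextEditors((editor) => {
    const path = editor.getPath(); // undefined for unsaved buffers
    if (path && path.endsWith('.asmx')) {
      onAsmxEditor(editor);
    }
  });
}
```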

You can create a gutter that is separate from the gutter used for line numbers.
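
A minimal sketch of such an extra gutter, assuming for simplicity that every instruction is 4 bytes wide (true for ARM mode, not for Thumb or x86) and that the function body is one instruction per line starting at `firstRow`. The gutter name `asmx-offsets` and both function names are made up; `doc` is the DOM `document`, passed in so the code can be exercised outside Atom.

```javascript
// Sketch only: assumes fixed-width 4-byte instructions, one per line.
function formatOffset(instructionIndex, bytesPerInstruction = 4) {
  const bytes = instructionIndex * bytesPerInstruction;
  return '+0x' + bytes.toString(16).toUpperCase();
}

// `editor` is a TextEditor, `doc` is the DOM document.
// 'asmx-offsets' is a hypothetical gutter name.
function addOffsetGutter(editor, doc, firstRow, instructionCount) {
  // The line-number gutter is left alone; this adds a second gutter.
  const gutter = editor.addGutter({ name: 'asmx-offsets' });
  for (let i = 0; i < instructionCount; i++) {
    // One marker per row, decorated with a small label element.
    const marker = editor.markBufferPosition([firstRow + i, 0]);
    const label = doc.createElement('span');
    label.textContent = formatOffset(i);
    gutter.decorateMarker(marker, { item: label });
  }
  return gutter;
}
```

A real package would need to compute offsets from actual instruction sizes and refresh the labels when the buffer changes; this only shows where the gutter API fits in.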

Most of Atom’s core is written in CoffeeScript, but it’s very readable.