Getting an abstract syntax tree of Java code


#1

Hello,

I have been trying to get an abstract syntax tree of some Java code because I want to build a Java IDE within Atom or, at least, a jump-to-definition feature. I’ll deal with hooking up a build system later.

Anyway, since a grammar is recursive, I am guessing that Atom must build a tree behind the scenes.

After searching for a while, all I have found is that TokenizedBuffer parses the text using a grammar and builds an array of TokenizedLine objects. There’s no tree…

Looking at the code below:

class Hello {
  private static <T> void changeMethods(CompilationUnit cu) {
     List<TypeDeclaration> types = cu.getTypes();
     for (TypeDeclaration type : types) {
         List<BodyDeclaration> members = type.getMembers();
         for (BodyDeclaration member : members) {
             if (member instanceof MethodDeclaration) {
                 MethodDeclaration method = (MethodDeclaration) member;
                 changeMethod(method);
                 changeMethod(method);
             }
         }
     }
  }
}

When I click on “MethodDeclaration”, I can get the Token instance, but I cannot get its parent.

I wonder if anyone can help me. Thanks.


#2

As far as I know, Atom uses pattern matching for its syntax highlighting, not full-blown parsing or AST traversal. I guess this is for performance reasons, and also so that badly formed documents can still be highlighted.


#3

This is the biggest reason why editors use pattern matching instead of full parsers. Most parsers can’t resume their parsing rules from an arbitrary point in the document.
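
To make that concrete, here is a minimal sketch of line-by-line pattern matching in Java. It is not Atom's tokenizer: the LineTokenizer class, its Token type and the tiny rule set are all made up for illustration, and a real grammar has many more rules and carries some state from one line to the next. The point is only that each line can be matched on its own, even when it is cut out of the middle of a method, and that the result is a flat list of tokens with no parent links.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LineTokenizer {

    /** A flat token: it knows its kind, text and column, but has no parent. */
    static final class Token {
        final String kind;
        final String text;
        final int column;

        Token(String kind, String text, int column) {
            this.kind = kind;
            this.text = text;
            this.column = column;
        }

        @Override
        public String toString() {
            return kind + "(" + text + ")@" + column;
        }
    }

    // Hypothetical, deliberately tiny rule set (illustration only).
    private static final Pattern RULES = Pattern.compile(
            "(?<keyword>\\b(?:class|private|static|void|for|if|instanceof)\\b)"
            + "|(?<identifier>[A-Za-z_][A-Za-z0-9_]*)"
            + "|(?<punctuation>[{}()<>;:=.,])");

    /** Matches the rules against one line, independently of every other line. */
    static List<Token> tokenizeLine(String line) {
        List<Token> tokens = new ArrayList<>();
        Matcher matcher = RULES.matcher(line);
        while (matcher.find()) {
            String kind;
            if (matcher.group("keyword") != null) {
                kind = "keyword";
            } else if (matcher.group("identifier") != null) {
                kind = "identifier";
            } else {
                kind = "punctuation";
            }
            tokens.add(new Token(kind, matcher.group(), matcher.start()));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // A lone line with an unclosed brace still tokenizes without complaint.
        for (Token token : tokenizeLine("if (member instanceof MethodDeclaration) {")) {
            System.out.println(token);
        }
    }
}
```

A parser, by contrast, would need to know that this line sits inside a for loop inside a method inside a class before it could make sense of it, which is why re-parsing from an arbitrary edit position is harder.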


#4

Got it. This means I will have to build a syntax tree and somehow integrate it into Token myself.
I’m not sure how to start yet. If you guys can point me to something, that’d be great.
Thanks!
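
One possible starting point, sketched under assumptions: outside Atom, the JDK's own Compiler Tree API (com.sun.source, shipped with the JDK) can parse Java source into a tree and report the parent of any node. The ParentLookup class and parentKindsOf method below are hypothetical names, and how to call this from an Atom package (most likely as a separate process) is left open. It only parses, so unresolved types such as MethodDeclaration are not a problem.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.IdentifierTree;
import com.sun.source.tree.Tree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePath;
import com.sun.source.util.TreePathScanner;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class ParentLookup {

    /** In-memory source file, so the sketch needs nothing on disk. */
    private static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String code) {
            super(URI.create("string:///Snippet.java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Returns the kind of the parent node of every identifier called name. */
    static List<Tree.Kind> parentKindsOf(String source, String name) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler; run on a JDK");
        }
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null,
                Collections.singletonList(new StringSource(source)));
        List<Tree.Kind> parents = new ArrayList<>();
        try {
            // parse() builds the syntax trees only; no type resolution happens.
            for (CompilationUnitTree unit : task.parse()) {
                new TreePathScanner<Void, Void>() {
                    @Override
                    public Void visitIdentifier(IdentifierTree node, Void unused) {
                        if (node.getName().contentEquals(name)) {
                            // The scanner tracks the path from the root to this node.
                            TreePath parent = getCurrentPath().getParentPath();
                            parents.add(parent.getLeaf().getKind());
                        }
                        return super.visitIdentifier(node, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return parents;
    }

    public static void main(String[] args) {
        String source = "class A { void f(Object o) { if (o instanceof String) { } } }";
        System.out.println(parentKindsOf(source, "String"));
    }
}
```

With a tree like this, the "MethodDeclaration" in "member instanceof MethodDeclaration" has an instanceof node as its parent, which is the information a flat Token cannot give. Mapping a click position in the editor to a tree node would still need source positions, which this sketch does not cover.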


#5

There have been several topics talking about dynamic grammars. You can find the links to all of them here: