(open) Making a custom Assembler type grammar work comfortably in Atom


#1

Hello.

I am requesting a kickstart. I wish to augment Atom’s behaviour when editing a specific file extension.
I am asking the community to point me in the right direction on what to study to make me capable of doing the work.

The first phase is to have the following “indentations” (?) -

NETWORK
TITLE = Clear the TIME value

      A     #inReset; 
      O(    ; 
      AN    #inEnable; 
      AN    #inHold;          // Value not held
      )     ;
      JCN   n001; 
// Load zero value     
      L     0.000000e+000;    // clear time value variables
      T     #timeValue; 
      T     #outTV; 
n001: R     #outDone;         // clear done output 
      BEC   ;                 // return if used reset

My thought is to create a language to bind the “AWL” file extension to the behaviour.
Atom should then be able to identify that the active editor has this file type open.
Another

I do not know how to create a new language, even one as lightweight as this.

So what you see in the code segment above is a type of assembler language.
Each segment is started with a network title (easily handled with a snippet).
The code is distributed in 4 columns.

  1. (optional) Jump label identified by :. Comment by //
  2. (required) Mnemonic or )
  3. (optional) Tag / number / label
  4. (optional) Comment.

So the idea is that the indenting is worked out each time space or enter is pressed.
But only if it is an AWL file.
a) Typing A then space, will bring A to column 2 and the cursor onto column 3.
b) inHold with space modifies the variable with prefix # and postfix ; (… only if needed).
c) The cursor moved to column 4 placing the // for the comment. space in this area will be ignored.
d) Pressing enter with an empty comment, removes the comment.
e) enter will always go to a new line but might add # and ; if and when needed.
f) left and right buttons should skip white spaces when navigating through the code.

Steps (a) to (f) are just to give an idea of how Atom should handle typing.
I think I can work out the behaviour myself from the API and regular expressions.
What I do not know (I am asking for study material: documentation references or documented examples):

  • Creating a lightweight language to link the AWL extension to the behaviour.
  • Where is the coding done… general package or grammar linked package?
  • How to handle: ( automatically adds ). Deleting (, deletes ).
  • Apply changes only to the file type
  • Live monitoring of user typing (space, enter, left, right, …) to trigger code for responses.

(future) Phase 2 will handle tag declarations…

VAR_INPUT
  inEnable      : BOOL ;	// Run timer
  inHold        : BOOL ;	// Timer type ... FALSE: TON | TRUE: Time accumulator
  inReset       : BOOL ;	// Clear time value
  inCycleTime   : INT ;	    // [ms] Value of last processor cycle
  inTarget      : REAL ;	// [s] Run time duration before DONE is switched on
END_VAR

VAR_OUTPUT
  outDone       : BOOL ;	// Target time value reached
  outTV         : REAL ;	// Current accumulated time value
END_VAR

VAR
  timeValue     : REAL ;	// [s] Current time value
END_VAR
VAR_TEMP
  _done         : BOOL ;	//Time target reached
END_VAR

(future) Phase 3 will try to make the tags available in code by autocomplete.
The idea is - when starting typing # suggestions are made for symbol tags.

A push in the right direction for my long term project will be appreciated.

Regards.
- Dan Padric


#2

This is a large multipart project, so I’m only going to address the first part right now: making a language package. I’ve written up a grammar boilerplate with annotations to help myself and others understand what’s going on, and I’ve made a few small language packages for different people. Language packages can include code to modify Atom’s behavior, but they should only do so if the additional functionality and the language are married to one another.


#4

Hello.

Feedback (no action required)


Yesterday I spent some time on this project. Your notes were very helpful - Thank you very much!

The ‘cheat notes’ contained in your notes were at first overwhelming - too much information. BUT working through them in parallel with the web page linked in your notes resolved this.

It is a bit tough getting the regular expressions right for some of the searches. Numerical constants can take any of the following forms:

  • B#2#0100_0011
    – word size: B / W / DW … (optional)
    – notation: 2 / 16
    – spacing: [_] … (optional)
  • REAL#0.5 vs 0.5
  • 8 vs +8 vs -8
  • S5T#20S500MS vs S5T#20.5S
    – Time notation: S5T / T
    – notation: D / H / M / S / MS
  • Some more exist, but they should not interfere with #variable

This part I should figure out on my own.
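
As a starting point, the constant forms listed above could be sketched as grammar patterns along these lines. The scope names and the regular expressions are assumptions for illustration only and have not been checked against real AWL sources. Order matters when two patterns can match at the same position, so the more specific forms are listed first.

```cson
# Hypothetical pattern fragments; scope names and regexes are illustrative assumptions.
patterns: [
  {
    # Radix constants: optional size prefix (B, W, DW), radix 2 or 16,
    # digits with optional "_" spacing. Examples: B#2#0100_0011, 16#FF
    match: '(?i)\\b(?:(?:B|W|DW)#)?(?:2#[01_]+|16#[0-9a-f_]+)\\b'
    name: 'constant.numeric.integer.radix.stl'
  }
  {
    # Time constants: S5T# or T#, then D / H / M / S / MS parts in that order.
    # Examples: S5T#20S500MS, S5T#20.5S
    match: '(?i)\\b(?:S5T|T)#(?:\\d+D)?(?:\\d+H)?(?:\\d+M(?!S))?(?:\\d+(?:\\.\\d+)?S)?(?:\\d+MS)?\\b'
    name: 'constant.numeric.time.stl'
  }
  {
    # Real constants with or without the REAL# prefix.
    # Examples: REAL#0.5, 0.5, 0.000000e+000
    match: '(?i)(?<![\\w#.])(?:REAL#)?[+-]?\\d+\\.\\d+(?:e[+-]?\\d+)?\\b'
    name: 'constant.numeric.float.stl'
  }
]
```

The lookbehind in the last pattern is there so that a plain real number is not picked out of the middle of another constant such as S5T#20.5S.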

Follow up question


The ability to fold code… how to ensure this?

At the moment indentation is making the code fold. I wish to remove this and place folding under syntax control. Defining folding in the syntax should by itself switch off this built-in behaviour.

Several places need folding.

VAR // also.. VAR_TEMP | VAR_INPUT | VAR_OUTPUT
  {declarations.. indentation}
END_VAR

BEGIN
{code, no standard indentation} 
END // variations like: END_FUNCTION_BLOCK

  A   #IN1
  A(         // follows with no indentation!
  O   #IN2
  O   #IN3   // more () may be included
  )
  =   #OUT1

Any hints on where to look?

Best regards,
  dP


#5

To showcase that something is actually happening in the project…

# CSON file allows for comments where JSON does not!
scopeName: 'source.AWL'
name: 'Siemens STL'
fileTypes: [
  'AWL'
  'awl'
]
# Folding capture
foldingStartMarker: '^(\\s*)?(BEGIN|VAR_)\\w*$'
foldingStopMarker: '^(\\s*)?(END_\\w*)$'

patterns: [
  
  {
    # Line commentary starting at line start
    include: '#line_comments'
  }
  {
    # Boolean: true
    match: '(?i)\\btrue\\b'
    name: 'constant.language.boolean.true.stl'
  }
  {
    # Boolean: false
    match: '(?i)\\bfalse\\b'
    name: 'constant.language.boolean.false.stl'
  }
  {
    # Numerical: Binary
    match: '(?i)(?:\\s)(([a-z]*[#])?(2)[#]([0-1_]+))(?:\\s|;|$)'
    captures:
      '1':
        name: 'constant.numeric.integer.binary.stl'
  }
  {
    # Numerical: Hexadecimal
    match: '(?i)(?:\\s)(([a-z]*[#])?(16)[#]([0-9a-f_]+))(?:\\s|;|$)'
    captures:
      '1':
        name: 'constant.numeric.integer.hexadecimal.stl'
  }
  {
    # Numeric: Decimal number
    match: '(?:\\s)[+-]?\\d+(?:\\s|;|$)'
    name: 'constant.numeric.integer.decimal.stl'
  }
  {
    # Name of function
    match: '\\b(?i:((?:FUNC|ORG|UDT)\\w*)\\s*["](\\w*)["])'
    captures:
      '1':
        name: 'storage.type.function.stl'
      '2':
        name: 'entity.name.function.stl'
  }
  {
    # NOT WORKING
    begin: '^(BEGIN)'
    end: '^(END_\\S*)'
    name: 'storage.type.function.stl'
    contentName: 'constant.numeric.integer.decimal.stl'
  }
]

repository:
  line_comments:
    # Represents commentary lines (..as example of include)
    name: 'comment.line.double-slash.stl'
    match: '^[/]{2}.*?$'


#6

It’s a cheat sheet, so that’s to be expected, and that’s why I left myself the link to the manual in case I needed it. The thing the manual doesn’t cover is syntax, since Atom’s grammars are in a different format than TextMate’s, so I made the cheat sheet to fill the gap.

I don’t think you can. Atom’s API doesn’t have great support for managing folds. There are packages that enable custom folding schemes, like custom-folds, but they (to my knowledge) all work by creating a selection and then folding it, which is the one thing the API allows us to do.

Is indentation forbidden?


#7

Feedback report. No action required.

Hello.

DanPadric wrote:

“The ‘cheat notes’ contained in your notes were at first overwhelming - too much information. BUT working through them in parallel with the web page linked in your notes resolved this.”

DamnedScholar replied:

“It’s a cheat sheet, so that’s to be expected, and that’s why I left myself the link to the manual in case I needed it. The thing the manual doesn’t cover is syntax, since Atom’s grammars are in a different format than TextMate’s, so I made the cheat sheet to fill the gap.”

My comment was not meant as a critique of your efforts. It was more a sharing of experience for others to take note of. Your notes were very helpful as soon as I got into the flow of them.

DanPadric wrote:

“The ability to fold code… how to ensure this? At the moment indentation is making the code fold. I wish to remove this and place it under syntax control. Just by using the syntax should switch off this built-in ability.”

DamnedScholar replied:

“I don’t think you can. Atom’s API doesn’t have great support for managing folds. There are packages that enable custom folding schemes, like custom-folds, but they (to my knowledge) all work by creating a selection and then folding it, which is the one thing the API allows us to do. Is indentation forbidden?”

It is a bit complicated.

The project you referred to seems interesting. On surface inspection, it seems to offer the ability to use custom strings to indicate fold start and end.

It does not seem as though Atom abides by the rule set in the CSON / JSON file. Folding should follow this configured rule, regardless of the indentation. Even in my custom language, folding is possible by default wherever text is indented.

Have you ever worked with assembler? The variation I am using breaks the rules when it comes to following through with indenting text. See the intro code piece - note the jump label. A jump label follows this rule:

/^[a-zA-Z][a-zA-Z_0-9]{0,11}\:/  # Example: ovr1:

The code belonging to the function is encapsulated(?) by…

/^BEGIN$/ 
/END(\S*)$/  # Example: END_FUNCTION | END_FUNCTION_BLOCK

‘Segments’ are a bit tricky. There can be several of those within a function. Those start with

/^(?i)Network\s*Title = (.*)$/ # capture 1: Comment title

The ‘segment’ ends on a new network title or with the end of the function. The code is indented by a single level, with the exception of the jump label mentioned earlier. No further indentation occurs (see intro example).

Identifying the ‘section’ is difficult when using begin: and end: in the pattern recognition. But I understand there might also be a begin: and while: pair. I’ll have to see if that is available in Atom.

Code has the ( and ) but that should be ignored when it comes to folding and indentation.

Handling the code within BEGIN and END now looks like:

{ # Code between BEGIN and END
  begin: '^BEGIN$'
  end: '^END([a-z_A-Z])*$'
  patterns: [
    { # Network title
      include: '#network_title'
    }
    { # Line commentary starting at line start
      include: '#line_comments'
    }
    { # Jump label
      match: '^(.{0,100})[:]'
      captures:
        1:
          patterns: [
            {
              match: '^\\s*([a-zA-Z][\\w_]{0,11})$'
              captures:
                1:
                  name: 'entity.name.function.stl'
            }
            {
              match: '.*'
              name: 'invalid.illegal.stl'
            }
          ]
    }
  ]
}
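
The `#network_title` include above refers to a repository entry that is not shown in this excerpt. A minimal sketch of what such an entry might look like follows; the scope names are assumptions. Because grammar regular expressions are matched against one line at a time, and the intro example places NETWORK and TITLE on separate lines, the sketch uses one pattern per line instead of a single expression spanning both.

```cson
repository:
  network_title:
    # Hypothetical entry: scope names are illustrative assumptions.
    patterns: [
      {
        # The NETWORK keyword alone on its line
        match: '(?i)^\\s*(NETWORK)\\s*$'
        captures:
          '1':
            name: 'keyword.control.network.stl'
      }
      {
        # TITLE = <free text>; capture 2 is the title text
        match: '(?i)^\\s*(TITLE)\\s*=\\s*(.*)$'
        captures:
          '1':
            name: 'keyword.other.title.stl'
          '2':
            name: 'string.unquoted.title.stl'
      }
    ]
```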

That is all for now. I will end with a picture:

Best regards,
  dP


#8

I have not. I don’t know anything about the rules, so all I can do is ask pointed questions.


#9

I would not expect you to know. But now I know I need to expand a little on the code structure. This is not a big issue. I will expand this comment within the next few hours.

The following should help with understanding the core structure of the coding:
STL elements


#10

Next question please…
Where do I make the configuration to tell Atom
which ASCII characters to use for commenting out some code?

For example…

  1. Python code as follows:

print('Hello World!')

  2. Press Ctrl+/

  3. Code becomes:

# print('Hello World!')

I would like to activate this feature for my custom grammar too.
Please - How do I do it?

Cheers.


#11

That would be a scoped setting that you can set in the language package or config.cson (which means that you can add your own comment start strings if you want).
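
A minimal sketch of such a scoped setting, assuming the grammar's scopeName is 'source.AWL' as in the grammar posted earlier and that AWL line comments start with //. The file name is hypothetical; in a language package, files like this normally live in a settings folder.

```cson
# Hypothetical file: settings/language-awl.cson inside the language package.
# The selector is the grammar's scopeName with a leading dot (assumed: source.AWL).
'.source.AWL':
  editor:
    # String inserted at the start of a line by "Toggle Line Comments" (Ctrl+/)
    commentStart: '// '
```

The same block can be placed at the top level of config.cson to set or override it per user.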

