JavaScript Syntax Highlighting Issue


#1

It looks like syntax highlighting eventually fails on any long line of code that contains quotes.

Example:

var abc = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','and','z'];

Syntax highlighting fails at ‘y’.


#2

Hi!

I investigated further and want to share a piece of code (not written by me) where it fails with, I think, interesting results.

Although the problem seems to lie with long lines of code, the number of “units” to be parsed also seems to play a role:

// This line is longer than the ABC lines below, but the highlighting still works!
		$query = " SELECT * FROM wl_changelog WHERE " . " user_id='{$this->obj_prop['user_id']}' " . " AND rel_id='{$this->obj_prop['rel_id']}' " . " AND type='{$this->obj_prop['type']}' " . " AND operation='{$this->obj_prop['operation']}' "
        . " AND lang='{$this->obj_prop['lang']}' "
        . " AND timestamp >="
        . intval(time() - $tau)
        . " AND timestamp <="
        . time();

// This line is the longest, and the highlighting works
        $x = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"."xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx";

// Comparing these two lines with the ones above, we can see that the line length is shorter, but the number of elements to concatenate is higher, and highlighting fails. Interestingly, Atom seems to treat the rest of the line as the same kind of entity as the last element it could successfully parse
        $x = "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a"."xxx" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b";
        $x = "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a"."xxx". "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b" . "a" . "b";

Maybe that helps someone out there!

Best,
YB


#3

I’m having the same problem, also in JavaScript files. Additionally, the syntax highlighting fails for the rest of the file, not just the rest of the line. I’ve tried it with multiple themes. Version: 0.80.0


#4

Several updates later… still having the same issue.

Anyone have any fixes yet?


#5

Here’s your culprit:

A limit of 100 tokens per line is hardcoded in the Syntax constructor. I guess performance is one of the reasons for this limit (for example, with minified JavaScript files). Maybe it could be exposed as a setting rather than being hardcoded like that.
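
To illustrate what such a cap can look like, here is a minimal sketch in JavaScript. It is not Atom's or first-mate's actual tokenizer: the toy grammar, `scopeFor`, and `tokenizeLine` are all hypothetical names made up for this example. It assumes that once the token budget is used up, the rest of the line is emitted as one token that inherits the scope of the last token parsed, which would match the symptom described in post #2.

```javascript
// Hypothetical sketch of a per-line token cap; not first-mate's real code.
const MAX_TOKENS_PER_LINE = 100; // the limit mentioned in this post

// Toy scope lookup for the toy grammar below.
function scopeFor(text) {
  if (text[0] === "'") return 'string';
  if (/^\s+$/.test(text)) return 'whitespace';
  if (/^[A-Za-z_$]/.test(text)) return 'identifier';
  return 'punctuation';
}

function tokenizeLine(line, maxTokens = MAX_TOKENS_PER_LINE) {
  // Toy grammar: single-quoted strings, identifiers, whitespace runs,
  // and any other single character.
  const pattern = /'[^']*'|[A-Za-z_$][\w$]*|\s+|./g;
  const tokens = [];
  let match;
  while ((match = pattern.exec(line)) !== null) {
    const moreAfterThis = match.index + match[0].length < line.length;
    if (tokens.length >= maxTokens - 1 && moreAfterThis) {
      // Budget exhausted: lump the remainder into one token and reuse
      // the scope of the last token that was parsed properly.
      const lastScope = tokens.length > 0 ? tokens[tokens.length - 1].scope : 'text';
      tokens.push({ value: line.slice(match.index), scope: lastScope });
      break;
    }
    tokens.push({ value: match[0], scope: scopeFor(match[0]) });
  }
  return tokens;
}

// With a small cap, everything after 'x' is painted as a string.
const demo = tokenizeLine("var a = ['x','y','z'];", 9);
console.log(demo[demo.length - 1]);
```

In this sketch the number of tokens matters, not the number of characters, which would explain why one very long string highlights fine while a shorter line with many small strings does not.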

@ProbablyCorey, @kevinsawicki, @nathansobo: Do you see any reason that would prevent it from being a setting?


#6

We were hoping to fix it so that Atom wouldn’t need max tokens per line at all by moving the tokenizing out of process or exploring other ways to remove this limit completely. Unfortunately there hasn’t been much progress on this yet.

The downside to a setting is that certain grammars are bound to have some slow patterns for long lines, so even if you set it to something that works for JavaScript, CSS or HTML might have issues with noticeable lag parsing long lines.

Another option would be that each grammar could set a default that it supports pretty well for its current patterns. This setting can be specified in the grammar’s cson file and it takes precedence over the value in the Syntax constructor: https://github.com/atom/first-mate/blob/master/src/grammar-registry.coffee#L162
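
As a rough sketch of that precedence rule (not the code at the link above), a per-grammar value would win over the global default roughly like this. The key name `maxTokensPerLine`, the function `effectiveMaxTokensPerLine`, and the example grammar objects are assumptions made for illustration.

```javascript
// Hypothetical sketch of "grammar value overrides the global default".
const REGISTRY_DEFAULT_MAX_TOKENS = 100; // global default from the posts above

function effectiveMaxTokensPerLine(grammarDefinition, registryDefault = REGISTRY_DEFAULT_MAX_TOKENS) {
  // Use the grammar's own value when it defines one, else fall back.
  const own = grammarDefinition.maxTokensPerLine;
  return own != null ? own : registryDefault;
}

// A grammar that opts into a higher limit (assumed key name).
const javascriptGrammar = { scopeName: 'source.js', maxTokensPerLine: 1000 };
// A grammar that says nothing and so inherits the default.
const cssGrammar = { scopeName: 'source.css' };

console.log(effectiveMaxTokensPerLine(javascriptGrammar));
console.log(effectiveMaxTokensPerLine(cssGrammar));
```

This keeps a conservative default for grammars with slow patterns while letting a well-behaved grammar raise its own limit.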

I’d really like to fix this once and for all though and just get tokenizing done off the main thread so this won’t ever be an issue for any grammars ever again.


#7

I see, it’s quite a tricky issue. Some grammars rely on a lot of included rules, so the number of tokens generated will grow as well (like a Ruby string with interpolation containing another string with an interpolation, etc.). Taking the tokenization out of the main thread clearly seems the best course of action.


#8

This also happens in PHP, see attached:

That long block causes the following div to be mis-colored.


#9

This is an issue with the grammar engine itself. It will occur with all grammars.


#10

Is this something we can override somewhere, per language or globally? Or is there a “fix” planned?


#11

See above:


#12

Hi gang! My first post… looks like I found the cause of my issue… easy to find… due to nice forum.

So, I’ll just add my issue picture to the pile, Atom 1.0.14, language-javascript 0.94.0. I hope this can be worked-around someday, but know that I LOVE ATOM no matter what minor issues it has. LOVE IT!

If anyone would like the JS file that causes this, just holler.

Thanks for the swell forum, guys/gals.
Wingnut - Age 58 - Michigan USA - junior-grade docs janitor for BabylonJS.


#13

Have you tried updating to the latest version of Atom? We’re on v1.6.2 as of this writing.


#14

Thanks for the reply and link, @leedohm! Apparently my “check for updates” was lying to me.
Now running 1.6.2 but, unfortunately, same symptom.

10+ files in my project, tons of JS, but one tiny section of one file… has a problem. scratch scratch

Culprit file: http://webpages.charter.net/wingthing/moo/wotas/buildgame.js

I just bet… it’s ME causing this, somehow. I somehow placed some disease inside that file. :cold_sweat:


#15

On a deeper look, it appears that you’re running into the problem where Atom’s grammar parser can’t handle more than 100 tokens on a single line. If your individual lines were shorter, things wouldn’t go wonky.
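
As a rough illustration, assuming the limit really is counted per line, the array from the first post could be wrapped so that no single line comes anywhere near 100 tokens:

```javascript
// Same 27 elements as the one-line example, spread over several lines.
var abc = [
  'a', 'b', 'c', 'd', 'e', 'f', 'g',
  'h', 'i', 'j', 'k', 'l', 'm', 'n',
  'o', 'p', 'q', 'r', 's', 't', 'u',
  'v', 'w', 'x', 'y', 'and', 'z'
];
```

This is only a workaround for the highlighter. The code behaves the same either way.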


#16

nod. Yep, that’s why I posted in this thread, adding to the pile. :slight_smile: I was sort-of hoping that the limitation had been extended or worked-around, but I guess not. Maybe not possible.

I can’t really change my coding style just to satisfy a syntax highlighter.

Okay, maybe I can, seeing it is such a small section of the project that is going “wonky” :slight_smile:

Thanks for the replies and info, leedohm! And thanks for being a forum helper/custodian!

update: I did Select Grammar -> C… and that file looks pretty good. Weird. Still SOME issues, though, but no reversed text (phew, that sucked!) @leedohm - if you don’t mind me asking… did you load that JS file into YOUR Atom, and choose the auto-lang or JS lang (grammar), and get the exact same symptom as was shown in my picture? Or, was it different? Or…? (thx)


#17

Actually, I didn’t have a chance to try to replicate it. I’m traveling this week so I’m a bit squeezed for time compared to my normal schedule.