Is it possible to extend prettify.js to support Mathematica?

mathematica.SE is currently in private beta and will open to the public in a few days. Stack Overflow and related sites use prettify.js; however, Mathematica is not supported. It would be great to have custom syntax highlighting for our site, and I am asking the JavaScript and CSS community for help in developing such a script and the accompanying CSS.

Some basic requirements are listed below, chosen so that they capture most of the features of Mathematica's default highlighting scheme (ignoring information that only the internal parser will know). I have named colors only in general terms; hexadecimal color codes can be picked from the screenshots that I provide below. I have also added code samples to accompany the screenshots so people can test against them.

Primary requirements

  • Comments
    Comments are entered as (* comment *). Everything in between should be highlighted in gray.

  • Strings
    Strings are entered as "string" (single quotes are not supported) and should be highlighted in pink.

  • Operators / Short Forms
    In addition to the standard +, -, *, /, ^, ==, etc., Mathematica has several other operators and short forms. The most common are:

     @, @@, @@@, /@, //@, //, ~, /., //., ->, :>, /:, /;, :=, :^=, =., &, |, ||, &&, _, __, ___, ;;, [[, ]], <<, >>, ~~, <> 

    These, along with parentheses, brackets and braces, should be highlighted in black.

  • Pattern Objects and Slots
    Pattern objects begin with a letter and contain either _, __ or ___, such as x_, x__ and x___. They can also have extra letters after the underscores, like x_abc, etc. All of them should be highlighted in green.

    Slots are # and ##, optionally followed by an integer (#1, ##4, etc.), and should also be green.

    Both of these (pattern objects and slots) are usually terminated by one of the operators / brackets / short forms listed under Operators above.

  • Functions / Variables
    Functions and variables are pretty loose terminology here, but they serve the purposes of this post. Anything that does not fall into the 4 categories above can be highlighted in black. Mathematica code often contains backticks, which should be considered part of the function / variable name, for example abcd`defg. Dollar signs $ anywhere in a variable name should be treated exactly like a letter (i.e., nothing special).

For all of the above, if they appear inside a string, they should be treated as part of the string, i.e. "@~#" should be highlighted in pink.

Additional nice features:

  • For the pattern objects described above, if the underscore(s) are followed by ? and then some letters, the part following the underscore(s) should be black. For example, in x__?abc, the x__ part should be green and ?abc should be black.
  • If a function / variable starts with a capital letter, it is highlighted in black. If it starts with a lowercase letter, it is highlighted in blue. The intent is to distinguish built-in functions from user-defined ones. The Mathematica community adheres to this naming convention almost universally, so distinguishing the two by capitalization serves the purpose well.
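
The two nice-to-haves above could be sketched in JavaScript roughly as follows. This is only an illustration: the names PATTERN_OBJECT, splitPatternObject and nameColor are hypothetical, and the colors are simply the ones named in this question.

```javascript
// Hypothetical sketch: a pattern object is a name, one to three underscores,
// an optional head (x_String) and an optional test part (?MatrixQ).
const PATTERN_OBJECT =
  /^([A-Za-z$][A-Za-z0-9$]*_{1,3}[A-Za-z0-9$]*)(\?[A-Za-z$][A-Za-z0-9$]*)?$/;

// Split a pattern object into the part that should be green and the
// trailing ?test part that should be black. Returns null for other text.
function splitPatternObject(text) {
  const match = PATTERN_OBJECT.exec(text);
  if (!match) return null;
  return { green: match[1], black: match[2] || "" };
}

// Capitalized names are treated as built-ins (black), all others as
// user-defined (blue). Names starting with $ fall into the "blue" branch here.
function nameColor(name) {
  return /^[A-Z]/.test(name) ? "black" : "blue";
}

console.log(splitPatternObject("x__?abc"));
console.log(nameColor("Transpose"), nameColor("arrows"));
```

For x__?abc the sketch returns x__ as the green part and ?abc as the black part; a real highlighter would apply the same split while scanning, rather than on isolated words.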

Screenshots and code examples:

1. Simple examples

Here is a small example, followed by a screenshot showing how it looks in Mathematica:

 (*simple pattern objects & operators*)
 f[x_, y__] := x Times @@ y

 (*pattern objects with chars at the end and strings*)
 f[x_String] := x <> "hello@world"

 (*pattern objects with ?xxx at the end*)
 f[x_?MatrixQ] := x + Transpose@x

 << Combinatorica` (*example with backticks and inline comment*)

 (*Slightly more complicated example with a mix of stuff*)
 Developer`PartitionMap[Total, Range@1000, 3][[3 ;; -3]]~Partition~2 // Times @@@ # &

[Screenshot: the code above as highlighted in Mathematica]

2. Real world example

Here is an example from an answer of mine, which also illustrates point 2 of the "Additional nice features" section, i.e. names starting with a lowercase letter are highlighted in blue.

In addition, you may notice that some of the variables are highlighted in orange. I deliberately did not include this as a requirement, since I think it would be much harder to do without a parser that knows Mathematica.

 prob = MapIndexed[#1/#2 &,
    Accumulate[EuclideanDistance[{0, 0}, #] < 1 & /@ arrows // Boole]]~N~4;
 Manipulate[
  Graphics[{White, Rectangle[{-5, -5}, {5, 5}], Red, Disk[{0, 0}, 1],
    Black, Point[arrows[[;; i]]],
    Text[Style[First@prob[[i]], Bold, 18, "Helvetica"], {-4.5, 4.5}]},
   ImageSize -> 200], {i, Range[2, 20000, 1]},
  ControlType -> Manipulator, SaveDefinitions -> True]

[Screenshot: the code above as highlighted in Mathematica]

Is it possible? Too much? Too complicated? Impossible?

Quite frankly, I do not know the answer to any of these. I have just listed some basic features that everyone on mathematica.SE would like to have, plus some additional ones that would be the cherry on top. However, let me know if they are too difficult to implement. We can work out a smaller feature set.

In recognition of this help, you will all have the eternal gratitude of the Mathematica community. In addition, I will award a 500-point bounty to each person who makes a significant contribution (if the work is done in parts by different people). I will rely on your votes and comments on the answers to decide which contributions are significant (possibly more than one bounty for one person, if they do all the work). Implementing the "Additional nice features" earns an automatic +500 irrespective of prior bounties, so you can also build on the work of others even if you did not do the first half. I may also periodically post smaller bounties to attract users who might not have seen this question, so if you happen to earn these, they will be in addition to the "reward existing answer" bounties that will be awarded at the end.

Finally, I am in no hurry, so please do not rush this. A bounty is always an option until this is implemented by SE (or until it has been determined that the existing answers fully satisfy the requirements). Ideally, I hope this gets implemented about 2/3 of the way into our beta, which is in 2 months.

+55
javascript css wolfram-mathematica prettify
Jan 21 2018-12-21T00:
2 answers

Introduction

Since the Mathematica support for google-code-prettify was developed mainly for the new Mathematica.Stackexchange site, please see also the discussion here .

Introduction

I don't have deep knowledge of all this, but some time ago I wrote a cweb plugin for IDEA so that my code could be highlighted there. In an IDE, highlighting is not a one-step process. It is split into several stages, and each stage offers more highlighting possibilities. Let me explain this a bit to give some reasons why some things are (imho) not possible for the code highlighter that we need here.

First, the code is split into tokens, which are the individual parts of the programming language. After this lexing step, you can classify ranges of your code, e.g. as whitespace, literal, string, comment, etc. The lexer consumes the source code by testing regular expressions, storing the token type for the matched text range, and then stepping forward in the code.
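
As a rough illustration of such a lexer loop (not the code of prettify.js itself), here is a minimal JavaScript sketch. The rule list RULES, the token type names and the function tokenize are all hypothetical.

```javascript
// Hypothetical token rules: [tokenType, regex anchored at the start of the input].
// The first rule that matches wins, so the order matters.
const RULES = [
  ["whitespace", /^\s+/],
  ["string", /^"(?:[^"\\]|\\.)*"/],
  ["number", /^\d+/],
  ["word", /^[A-Za-z]+/],
  ["other", /^[\s\S]/], // fallback: consume one character so the loop always advances
];

// Test the rules against the remaining source, store the token type for the
// matched text range, then step forward by the length of the match.
function tokenize(source) {
  const tokens = [];
  let pos = 0;
  while (pos < source.length) {
    const rest = source.slice(pos);
    for (const [type, regex] of RULES) {
      const match = regex.exec(rest);
      if (match) {
        tokens.push({ type, text: match[0], start: pos });
        pos += match[0].length;
        break;
      }
    }
  }
  return tokens;
}

console.log(tokenize('a = "b c" 12').map((t) => t.type).join(" "));
```

The result is only a flat list of classified text ranges; nothing in it knows about nesting or about what a token means in context.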

After this lexical scan, the source code can be parsed using the grammar rules of the programming language, the tokens and the underlying code. For example, if we have a token Plus of type Keyword, then we know that brackets and parameters must follow. If not, the syntax is incorrect. What you can build with this parsing is called an abstract syntax tree (AST), and it looks basically like the output of Mathematica's TreeForm.

With a well-designed language such as Java, you can check the code while it is being typed and make it almost impossible to write syntactically incorrect code.

prettify.js and Mathematica Code

Firstly, prettify.js implements only a lexical scanner, not a parser. I am sure a parser would not be feasible anyway, given the time constraints for displaying a web page. So let me explain which features are possible / not possible with prettify.js. From the question:

Also, you may notice that some of the variables are highlighted in orange. I deliberately did not include this as a requirement, since I think it would be much harder to do without a parser that knows Mathematica.

That's right, because the highlighting of these variables depends on the context. You would have to know that you are inside a Table construct or something similar.

Hacking prettify.js

I think hacking an extension for prettify.js is not that difficult. I am an absolute regex noob, so be prepared for what follows.

We do not need that much for a simple Mathematica lexer. We have whitespace, comments, string literals, braces, lots of operators, ordinary literals like variables, and a giant list of keywords.

Let's start with the keywords in JavaScript regexp form:

 Export["google-code-prettify/keywordsmma.txt",
   StringJoin @@ Riffle[
     Apply[StringJoin,
       Partition[Riffle[Names[RegularExpression["[A-Z].*"]], "|"], 100], {1}],
     "'+ \n '"],
   "TEXT"]

The regular expression for spaces and string literals can be copied from another language. Comments are matched with something like

 /^\(\*[\s\S]*?\*\)/ 

This goes wrong if we have comments inside comments, but at the moment I don't care. We have parentheses, brackets and braces

 /^(?:\[|\]|{|}|\(|\))/ 

We have something like blub_boing that needs to be matched separately.

 /^[a-zA-Z$]+[a-zA-Z0-9$]*_+([a-zA-Z$]+[a-zA-Z0-9$]*)*/ 

We have slots #, ##, #1, ##9 (currently only one digit can follow)

 /^#+[0-9]?/ 

We have variable names and other literals. They have to start with a letter or $, followed by letters, digits and $. Currently, \[Gamma] is not matched as a single literal, but for the moment that is fine.

 /^[a-zA-Z$]+[a-zA-Z0-9$]*/ 

And we have operators (I'm not sure if this list is complete).

 /^(?:\+|\-|\*|\/|,|;|\.|:|@|~|=|\>|\<|&|\||_|`|\^)/ 
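
To see how the regular expressions above behave, they can be tried in order against the start of a piece of code. The sketch below is only an illustration: the style names, the list MMA_RULES and the helper firstToken are hypothetical, and the whitespace and string rules are left out.

```javascript
// The regexes are the ones quoted above; the style names are hypothetical.
// Order matters: "comment" must come before "bracket" (both start with "("),
// and "pattern" before "name" (otherwise x_String would be split at the underscore).
const MMA_RULES = [
  ["comment", /^\(\*[\s\S]*?\*\)/],
  ["bracket", /^(?:\[|\]|{|}|\(|\))/],
  ["pattern", /^[a-zA-Z$]+[a-zA-Z0-9$]*_+([a-zA-Z$]+[a-zA-Z0-9$]*)*/],
  ["slot", /^#+[0-9]?/],
  ["name", /^[a-zA-Z$]+[a-zA-Z0-9$]*/],
  ["operator", /^(?:\+|\-|\*|\/|,|;|\.|:|@|~|=|\>|\<|&|\||_|`|\^)/],
];

// Return the first rule that matches at the start of the source, or null.
function firstToken(source) {
  for (const [style, regex] of MMA_RULES) {
    const match = regex.exec(source);
    if (match) return { style, text: match[0] };
  }
  return null;
}

console.log(firstToken("x_String] := x"));    // a pattern token
console.log(firstToken("#12 &"));             // the slot rule takes only one digit
console.log(firstToken("(* a (* b *) c *)")); // stops at the first "*)"
```

The last call shows the nested-comment limitation: the lazy match ends at the first `*)`, so the token is `(* a (* b *)` and the rest is lexed as ordinary code.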

Update

I cleaned things up a bit, did some debugging, and created a color style that looks nice to me. As far as I can see, the following things work correctly:

  • All system symbols that can be found through Names[RegularExpression["[A-Z].*"]] are matched and highlighted in blue.
  • Brackets and braces are black but bold. This was a suggestion from Szabolcs, and I really like it, as it definitely adds some energy to the appearance of the code.
  • Patterns, as they appear in function definitions, and the slots of pure functions are highlighted in green. This was suggested by Yoda and matches the highlighting in the Mathematica front end. Patterns are only green in combination with a variable, as in blub__Integer , a1_ or b34_Integer32 . For pattern tests, as in num_?NumericQ , only the part before the question mark is green.
  • Comments and strings have the same color. Both can span several lines. Strings may include backslash-escaped quotes. Comments cannot be nested.
  • For coloring, I consistently used the ColorData[1] scheme, so that the colors look nice next to each other.

Currently it looks like this:

[Screenshot: the current highlighting]

Testing and debugging

Szabolcs asked whether this can be tested and how. It is very simple: you need my sources for google-code-prettify ( Where can I put them so that everyone has access? ). Unzip the sources and open the tests/mathematica_test.html file in a web browser. This file loads src/prettify.js , src/lang-mma.js and src/prettify-mma-1.css .

  • in lang-mma.js you will find the regular expressions that the lexer uses when splitting code into tokens.
  • in prettify-mma-1.css you will find the style definitions that I use.

To test your own code, simply open mathematica_test.html in an editor and paste your code between the pre tags. Reload the page and your code should appear.

Debugging: If the highlighter does not work correctly, you can debug it using an IDE or using Google Chrome. In Chrome, select a word where the highlighter starts to go wrong, right-click and choose Inspect Element . What you see then is the underlying HTML of the highlighted code. There you can see each individual token and its token type. It looks like this:

 <span class="tag">[</span> 

You see that the open bracket is of type tag . This matches the regexp definition I made in lang-mma.js . In Chrome, you can even view JS code, set breakpoints and debug it when your page reloads.




Local installation for Google Chrome and Firefox

Tim Stone was kind enough to write a script that injects the highlighter when loading pages under http://stackoverflow.com/questions/ . Once google-code-prettify is enabled for mathematica.stackexchange.com , it should work there too. I adapted this script to use my lexer rules and colors. I have heard that the script does not always work in Firefox, but here is how to install it:

Version

At https://github.com/halirutan/Mathematica-Source-Highlighting/raw/master/mathematica-source-highlighter.user.js you will always find the latest version. Here is the change history:

  • 02/23/2013 Updated the lists of symbols and keywords to Mathematica version 9.0.1.
  • 09/02/2012 Fixed some minor problems with the coloring of Mathematica patterns. Detailed feature overview with the Pattern operator : (see also here).
  • 02/02/2012 Support for many number input formats, such as .123`10.2 or 1.2`100.3*^-12 ; highlighting of In[23] and Out[4] , of ::usage or other messages such as blub::boing , and of patterns such as ProblemTest[prob:(findp_[pfun_, pvars_, {popts___}, ___]), opts___] ; bug fixes. (I tested the parser on 3500 lines of package code from the AddOns directory. It took about 3-4 seconds, which should be more than fast enough for our purposes.)
  • 01/30/2012 Fixed the missing '?' in the list of operators. Named characters such as \[Gamma] are now included, so that such characters are matched as a whole. Added $ variables to the keyword list. Improved pattern matching. Added matching of context constructs such as Developer`PackedArrayQ. Switched the color scheme due to many requests. It now looks like the Mathematica front end: keywords black, variables blue.
  • 09/29/2012 Tim hacked the injection code. Highlighting now works on mathematica.stackexchange too.
  • 01/25/2012 Added recognition of Mathematica numbers. It should now highlight things like {1, 1.0, 1., .12, 16^^1.34f, ...} . In addition, it should recognize a backtick following a number. I switched comments and strings to gray and used a deep red for numbers.
  • 01/23/2012 Original version. The options are described in the Update section.
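
For illustration, the number formats mentioned in the change history could be matched with a single regular expression along the following lines. This is a hypothetical JavaScript sketch, not the expression actually used in lang-mma.js; the names MMA_NUMBER and matchNumber are made up, and signed precision values are not handled.

```javascript
// Hypothetical sketch of a regex for Mathematica number literals.
const MMA_NUMBER = new RegExp(
  "^(?:" +
    "\\d+\\^\\^(?:[0-9a-zA-Z]+\\.?[0-9a-zA-Z]*|\\.[0-9a-zA-Z]+)" + // base^^digits, e.g. 16^^1.34f
    "|\\d+\\.?\\d*" +                                              // 1, 1.0, 1.
    "|\\.\\d+" +                                                   // .12
  ")" +
  "(?:``?(?:\\d+\\.?\\d*|\\.\\d+)?)?" +                            // `precision or ``accuracy
  "(?:\\*\\^[+-]?\\d+)?"                                           // *^exponent
);

// Return the number literal at the start of the source, or null if there is none.
function matchNumber(source) {
  const match = MMA_NUMBER.exec(source);
  return match ? match[0] : null;
}

for (const sample of ["1", "1.0", "1.", ".12", "16^^1.34f", ".123`10.2", "1.2`100.3*^-12"]) {
  console.log(sample, matchNumber(sample) === sample);
}
```

Each of the listed samples is matched in full by this sketch; text that does not start with a number yields null.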
+43
Jan 22 2018-12-12T00:

Not quite what you are asking for, but I created a similar extension for MATLAB (based on the excellent work already done here). The project is hosted on GitHub .

The script should solve some problems common to MATLAB code on Stack Overflow:

  • comments (no need to use tricks like %# .. )
  • the transpose operator (single quote) is correctly recognized as such (it gets confused with quoted strings by the default prettifier)
  • highlighting of popular built-in functions

Remember that syntax highlighting is not perfect; among other things, it fails in nested block comments (I can live with this for now). As always, comments / corrections / problems are welcome.

A separate userscript is included; it allows you to switch the language used, as shown in the screenshots below:

--- before ---

before

--- after ---

after

For those interested, a third userscript is provided, adapted to work on the MATLAB Answers website.




TL;DR

Install the userscript for SO directly from:

https://github.com/amroamroamro/prettify-matlab/raw/master/js/prettify-matlab.user.js

+2
May 4 '12 at 11:38


