Why does Vim's errorformat not accept regular expressions?

Vim's errorformat (for parsing compile/build errors) uses an arcane format borrowed from C to parse error messages.

Setting up an errorformat for NAnt seems almost impossible. I have tried for many hours and can't get it to work. I also see from my searches that many people seem to be having the same problem. A regular expression for this would take minutes to write.

So why does Vim still use this format? A C-style parser may be faster, but that hardly seems relevant for something that happens every few minutes at most. Is there a good reason, or is it just a historical artifact?

5 answers

It's not that Vim uses an arcane format from C. Rather, it borrows ideas from scanf, which is a C function. This means that the string that matches an error message is made up of three parts:

  • whitespace
  • characters
  • conversion specifications

Whitespace is your tabs and spaces. Characters are letters, numbers, and other normal stuff. Conversion specifications are sequences that start with a percent character (%). In scanf, you typically match an input string against %d or %f to convert it to integers or floats. With Vim's error format, you are searching the input string (an error message) for file names, line numbers, and other compiler-specific information.

If you were using scanf to extract an integer from the string "99 bottles of beer", you would use:

 int i;
 scanf("%d bottles of beer", &i); // i would be 99, string read from stdin

With Vim's error format it gets a bit trickier, but it does try to make matching more complex patterns easy: things like multi-line error messages, file names, directory changes, and so on. One of the examples in the help for errorformat is illustrative:

 Error 275
 line 42
 column 3
 ' ' expected after '--'

The appropriate error format string has to look like this:

 :set efm=%EError\ %n,%Cline\ %l,%Ccolumn\ %c,%Z%m

Here %E tells Vim that this is the start of a multi-line error message. %n is the error number. %C is the continuation of a multi-line message, %l is the line number, and %c is the column number. %Z marks the end of the multi-line message, and %m matches the error message that will be shown in the status line. You need to escape spaces with backslashes, which adds a bit of extra weirdness.

While a regular expression might seem easier at first, this mini-language is specifically designed to help with matching compiler errors. It has a lot of shortcuts. You don't have to think about things like matching multiple lines, matching multiple digits, or matching path names (just use %f).

Another thought: how would you map numbers to line numbers, or strings to file names or error messages, if you used only a normal regular expression? By group position? That might work, but it would not be very flexible. Another way would be named capture groups, but then this syntax already looks a lot like a shorthand for that. You can actually use regexp wildcards such as .* in this language, where it is written %.%#.

OK, so it is not perfect. But it is not impossible, and it makes sense in its own way. Get stuck in, read the help, and stop complaining! :-)
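To make the named-capture-group comparison concrete, here is a minimal Python sketch. It assumes a hypothetical single-line message layout, "<file>:<line>:<col>: <message>", chosen only for illustration; the function name parse_error is likewise made up. The named groups play roughly the role that %f, %l, %c and %m play in an errorformat such as %f:%l:%c:\ %m.

```python
import re

# Assumed message layout (hypothetical): "<file>:<line>:<col>: <message>".
# Each named group stands in for one errorformat item:
#   file -> %f, line -> %l, col -> %c, message -> %m
PATTERN = re.compile(
    r"^(?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+): (?P<message>.*)$"
)


def parse_error(text):
    """Return the fields of one error line as a dict, or None if it does not match."""
    match = PATTERN.match(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields, as errorformat does for %l and %c.
    fields["line"] = int(fields["line"])
    fields["col"] = int(fields["col"])
    return fields
```

The regex version has to spell out what a file name and a number look like, which is the detail that %f and %l hide.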


I would recommend writing a post-processing filter for your compiler that uses regular expressions or something else, and outputs messages in a simple format that is easy to write an errorformat for. Why learn a new, baroque, single-purpose language if you don't need to?
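A minimal sketch of such a filter in Python might look like the following. The raw message layout ("Error in <file> at line <N>: <message>") is invented for illustration and is not the actual output of NAnt or any particular compiler; the names RAW, normalize and normalize_all are hypothetical as well.

```python
import re

# Hypothetical raw layout, invented for this example:
#   "Error in src/main.c at line 42: missing semicolon"
RAW = re.compile(
    r"^Error in (?P<file>.+?) at line (?P<line>\d+): (?P<message>.*)$"
)


def normalize(line):
    """Rewrite one raw line as "file:line: message"; pass other lines through unchanged."""
    match = RAW.match(line)
    if match is None:
        return line
    return "{file}:{line}: {message}".format(**match.groupdict())


def normalize_all(lines):
    """Apply normalize() to each line of an iterable, stripping trailing newlines."""
    for raw in lines:
        yield normalize(raw.rstrip("\n"))
```

Feeding sys.stdin to normalize_all and printing each result would turn this into a pipe filter. The normalized lines can then be matched with a short errorformat such as %f:%l:\ %m.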


According to :help quickfix,

it is also possible to specify (nearly) any Vim supported regular expression in format strings.

However, the documentation is confusing, and I did not spend much time checking how well it works and how useful it is. You still need to use scanf-like codes to pull out file names, etc.


They are a pain to work with, but to be clear: you can use regular expressions (mostly).

From the docs:

 Pattern matching

 The scanf()-like "%*[]" notation is supported for backward-compatibility
 with previous versions of Vim. However, it is also possible to specify
 (nearly) any Vim supported regular expression in format strings.
 Since meta characters of the regular expression language can be part of
 ordinary matching strings or file names (and therefore internally have to
 be escaped), meta symbols have to be written with leading '%':
     %\    The single '\' character. Note that this has to be
           escaped ("%\\") in ":set errorformat=" definitions.
     %.    The single '.' character.
     %#    The single '*'(!) character.
     %^    The single '^' character. Note that this is not
           useful, the pattern already matches start of line.
     %$    The single '$' character. Note that this is not
           useful, the pattern already matches end of line.
     %[    The single '[' character for a [] character range.
     %~    The single '~' character.
 When using character classes in expressions (see |/\i| for an overview),
 terms containing the "\+" quantifier can be written in the scanf() "%*"
 notation. Example: "%\\d%\\+" ("\d\+", "any number") is equivalent to "%*\\d".
 Important note: The \(...\) grouping of sub-matches can not be used in format
 specifications because it is reserved for internal conversions.
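To see how the %-prefixed meta symbols map back onto ordinary regex syntax, here is a small Python sketch built only from the table quoted above. It is an illustration, not Vim's actual implementation: the function name efm_meta_to_regex is made up, it expects the fragment as Vim sees it after ":set" unescaping (a single backslash), and it deliberately leaves conversion items such as %f and the "%*" shorthand untouched.

```python
# Meta symbols from the quoted help text: "%" followed by the key stands for
# the regex character given as the value.
META = {"\\": "\\", ".": ".", "#": "*", "^": "^", "$": "$", "[": "[", "~": "~"}


def efm_meta_to_regex(fragment):
    """Replace %-prefixed meta symbols in a fragment with plain Vim regex characters.

    Anything else, including conversion items like %f or %l, is copied as-is.
    """
    out = []
    i = 0
    while i < len(fragment):
        is_meta = (
            fragment[i] == "%"
            and i + 1 < len(fragment)
            and fragment[i + 1] in META
        )
        if is_meta:
            out.append(META[fragment[i + 1]])
            i += 2  # skip the "%" and the symbol it introduced
        else:
            out.append(fragment[i])
            i += 1
    return "".join(out)
```

With this mapping, %.%# comes out as .* and %\d%\+ comes out as \d\+, matching the examples in the help text.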

lol, try looking at the actual Vim source code. It is a nest of C code so old and incomprehensible that you will think you are at an archaeological site.

As to why Vim uses a C-style parser, there are many good reasons, starting with the fact that it is pretty versatile. But the real reason is that at some point in the last 20 years someone wrote it to use a C-style parser, and it works. Nobody changes what works.

If this does not work for you, the Vim community will tell you to write your own. Stupid open source bastards.
