Integrating Compiler Messages
October 23, 2007
Attention to detail is a good thing, but compiler messages have historically not received it. Here’s an example of GCC’s output:
qt/src/xml/query/expr/qcastingplatform.cpp: In member function 'bool CastingPlatform::prepareCasting()':
qt/src/xml/query/expr/qcastas.cpp:117: instantiated from here
qt/src/xml/query/expr/qcastingplatform.cpp:85: error: no matching function for call to 'locateCaster(int)'
qt/src/xml/query/expr/qcastingplatform.cpp:93: note: candidates are: locateCaster(const bool&)
Typically, compiler messages have been produced with crude printf approaches, and the niceties have been left out: localization, translation, consistency in quoting style (for instance), adapting the language to users (e.g., not phrasing things the way compiler engineers prefer), good English, and just generally looking sensible.
Solving that requires quite some work, which probably explains why it is so often left out. Having line numbers, error codes, function names, and whatever else available and flowing through the system requires quite some plumbing and room in the design.
Another thing is that nowadays we really should expect compiler messages within IDEs or other graphical applications to be sanely typeset. If not, we’ve lost ourselves in all this UNIX stuff. Keywords and important phrases should be italic, emphasized, or colorized, depending on the GUI style.
For shuffling compiler messages around it is customary to pass a set of properties: a URI, line number, column number, a descriptive string, and possibly an error code. Apart from falling short of the goals outlined in this text, this approach runs into a problem which I think the GCC example above illustrates: what does one do if the message involves several locations?
Even if a message involves several locations, it is still one message, and should be treated and presented as such. The approach of using a struct with properties falls short here, and chops the message into as many parts as it has locations.
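To make the chopping concrete, here is a minimal sketch in Python. The types `Location`, `FlatMessage`, and `Message` are made up for illustration; they are not taken from GCC, Qt, or Patternist.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Location:
    """One place in a source document (hypothetical type)."""
    uri: str
    line: int
    column: Optional[int] = None


@dataclass
class FlatMessage:
    """The customary struct: exactly one location per message."""
    location: Location
    text: str
    code: Optional[str] = None


@dataclass
class Message:
    """One message that may refer to several locations."""
    text: str
    locations: List[Location] = field(default_factory=list)
    code: Optional[str] = None


# The GCC example, chopped into one struct per location.
chopped = [
    FlatMessage(Location("qt/src/xml/query/expr/qcastas.cpp", 117),
                "instantiated from here"),
    FlatMessage(Location("qt/src/xml/query/expr/qcastingplatform.cpp", 85),
                "no matching function for call to 'locateCaster(int)'"),
    FlatMessage(Location("qt/src/xml/query/expr/qcastingplatform.cpp", 93),
                "candidates are: locateCaster(const bool&)"),
]

# The same information kept together as a single message.
whole = Message(
    text="no matching function for call to 'locateCaster(int)'",
    locations=[m.location for m in chopped],
)
```

Note that even `whole` is only a partial fix: a flat list of locations does not say which part of the text each location belongs to.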
For Patternist I wanted to make an attempt at improving messages. So far it is an improvement at least. For instance, for this message that the command line tool patternist outputs:
the installed QAbstractMessageHandler was passed a QSourceLocation and a message which read:
<p>Operator <span class='XQuery-keyword'>+</span> is not available between atomic values of type <span class='XQuery-type'>xs:integer</span> and <span class='XQuery-type'>xs:string</span>.</p>
It was subsequently converted to the local encoding and formatted with ECMA-48 color codes. (The format is not spec’d yet; it will probably be XHTML with specified class ids.)
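As an illustration of that last step, here is a rough sketch in Python of turning such a message into ECMA-48 (SGR) colored terminal text. This is not how the patternist tool does it; the function name `to_ansi` and the choice of colors per class id are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from class ids to ECMA-48 SGR sequences.
# The colors are arbitrary picks for this sketch.
STYLES = {
    "XQuery-keyword": "\x1b[1;33m",  # bold yellow
    "XQuery-type": "\x1b[36m",       # cyan
}
RESET = "\x1b[0m"


def to_ansi(markup: str) -> str:
    """Render a message paragraph with flat <span class='...'> children."""
    root = ET.fromstring(markup)
    parts = [root.text or ""]
    for child in root:
        style = STYLES.get(child.get("class", ""), "")
        text = "".join(child.itertext())
        # Wrap styled spans in an SGR sequence; pass unknown classes through.
        parts.append(f"{style}{text}{RESET}" if style else text)
        parts.append(child.tail or "")
    return "".join(parts)


message = (
    "<p>Operator <span class='XQuery-keyword'>+</span> is not available "
    "between atomic values of type <span class='XQuery-type'>xs:integer</span> "
    "and <span class='XQuery-type'>xs:string</span>.</p>"
)
print(to_ansi(message))
```

A GUI consumer could take the same markup and map the class ids to fonts or styles instead of escape sequences.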
While using markup for the message is a big improvement (it opens the door to formatting and all), this API still has the problem of dealing with multiple locations.
What is the solution to that?
Perhaps the way to strike a balance between programmatic interpretation (such that, for instance, source document navigation is doable) and a message that reads naturally as one coherent unit is to duplicate the information, each time tailored for a particular consumer?
<p xmlns:l="http://example.com/">In <l:location href="myQuery.xq" line="57" column="3">myQuery.xq at line 57, column 3</l:location>, function <span class="XQuery-keyword">fn:doc()</span> failed with code <span class="XQuery-keyword">XPTY0004</span>: the file <l:location href="myDocument.xml" line="93" column="9">myDocument.xml failed to parse at line 93, column 9</l:location>: unexpected token <span class="XQuery-keyword">&amp;</span>.</p>
This is complicated by the fact that language strings cannot be concatenated, since that prevents translation. But I think the above paragraph is possible to implement. As above, the message reads coherently, but still allows programmatic extraction. A language string and formatted data sit at opposite extremes, and maybe markup is the balance between them.
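To check that such a paragraph really serves both consumers, here is a small sketch in Python. It assumes the message is well-formed XML, so the sample below escapes the ampersand and uses `myQuery.xq` as the first `href`; the `l:location` element and its attributes are the hypothetical ones from the example, not an existing format.

```python
import xml.etree.ElementTree as ET

# Namespace-qualified tag name, using the namespace from the example message.
LOC = "{http://example.com/}location"

message = (
    '<p xmlns:l="http://example.com/">In '
    '<l:location href="myQuery.xq" line="57" column="3">myQuery.xq at line 57, '
    'column 3</l:location>, function <span class="XQuery-keyword">fn:doc()</span> '
    'failed with code <span class="XQuery-keyword">XPTY0004</span>: the file '
    '<l:location href="myDocument.xml" line="93" column="9">myDocument.xml failed '
    'to parse at line 93, column 9</l:location>: unexpected token '
    '<span class="XQuery-keyword">&amp;</span>.</p>'
)

root = ET.fromstring(message)

# Consumer 1: an IDE that wants jump targets.
locations = [
    (el.get("href"), int(el.get("line")), int(el.get("column")))
    for el in root.iter(LOC)
]

# Consumer 2: a terminal or log that wants the sentence as plain text.
plain = "".join(root.itertext())

print(locations)
print(plain)
```

The IDE never has to parse the English sentence, and the terminal never has to know about locations; both read the same single message.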
Would this give good compiler messages and allow slick IDE integration? If not, what would?

October 23, 2007 at 10:38 am
Localization and translation of error messages is a bad thing, which has prolonged my hunt for answers several times. Instead of searching using the error message output, which for Norwegian probably wouldn’t yield too many results, you first have to find the English translation of that error message before you have any hope of any results.
Regular users don’t use compilers, and developers shouldn’t have too much trouble understanding a few English words…
October 23, 2007 at 2:29 pm
As Sjaddow says: please don’t localize compiler error messages!
Several times already I have switched the locale to English and queried Google with the error message, because the German one didn’t help at all.
October 23, 2007 at 3:33 pm
If you are looking for better error messages, look at TeX and Intel’s C++ compiler. They include the source code and a line break (TeX) or a caret (icc) in the error message.
Oh, well, not IDE friendly.
October 23, 2007 at 4:02 pm
> Localization and translation of error-messages is a bad thing
Too true. It’s a royal PITA.
October 23, 2007 at 6:06 pm
Maybe you already know about this feature, maybe not, so I’ll tell you anyway just to be sure…
You can switch the language when executing individual commands, like this:
LANG=C make
or
LANG=no g++ trollspeak.cpp
C means “no translation at all”. So with this feature, errors can be understood by programmers not fluent in English when LANG is left alone (the default), or by Google when LANG is overridden.
hope that helps some of you
October 25, 2007 at 2:32 am
@Simon: The problem isn’t changing the language (usually you change LC_MESSAGES for this purpose), but users filing bugs in N++ languages you don’t understand, and having to tell them to translate or to switch their language to English and try to reproduce the failure, so that you, as the one supposed to judge the bug, can understand it. The problem: it costs time, and the user may not be able or willing to reproduce it, despite having stumbled upon a potentially valid bug.
October 26, 2007 at 4:53 am
@{Simon,required}: It’s not only that users have to translate their messages to be heard when asking questions. What happens if a user who posts, e.g., a Spanish localized error message gets a useful answer that would help you, too? You’ll never know of it, because you don’t search for the Spanish translation of your error message.
One way to prevent this AND have translations available would be to have some kind of unique error key, e.g.:
qt/src/xml/query/expr/qcastingplatform.cpp: ERROR GCC_79546731
qt/src/xml/query/expr/qcastingplatform.cpp: In member function ‘bool CastingPlatform::prepareCasting():
qt/src/xml/query/expr/qcastas.cpp:117: instantiated from here
qt/src/xml/query/expr/qcastingplatform.cpp:85: error: no matching function for call to ‘locateCaster(int)’
qt/src/xml/query/expr/qcastingplatform.cpp:93: note: candidates are: locateCaster(const bool&)
lg
Erik