Mercurial commit messages and log: what encoding is supported, and is Hg encoding-aware at all?

I tried to make a simple commit through my wrapper library for Mercurial, using the plain text Unicode:æøåÆØÅ as the commit message. The message is written to a text file and passed to Mercurial with the appropriate parameter:

 hg commit --logfile FILE 

If I later view the repository with TortoiseHg, the characters display correctly. On the console, they are garbled:

  [C:\Temp]: hg log
 changeset: 0:6a0911410128
 tag: tip
 user: Lasse V. Karlsen 
 date: Wed Dec 01 21:48:54 2010 +0100
 summary: Unicode: æøåÆØÅ

If I redirect the output of hg log to a file and open it, æøåÆØÅ displays correctly.

So my questions are:

  • Can I ask hg to write the log to a file directly or do I need to redirect standard output?
  • Will this cause Python encoding problems on the console, i.e. will some characters crash hg instead of just producing odd output?
  • Is there a known supported commit message encoding that I have to stick to?

Or is it just like this:

  • Mercurial doesn't care: it takes the contents of the file I give it, whatever they are, and stores them as the commit message. When producing a log, it just dumps them back to the console, subject to whatever limitations Python's console output library has in this area?
1 answer

The following may not solve the problem, but may help debug it.

Check out: https://www.mercurial-scm.org/wiki/EncodingStrategy

If I redirect the output of hg log to a file and open it, æøåÆØÅ displays correctly.

So at least Mercurial stores the commit information correctly. It is only the console output that is garbled.
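To illustrate how correct bytes can still look wrong on screen, here is a minimal Python sketch. It assumes cp1252 as the Windows system code page and cp850 as the console code page; neither is stated in the question, they are just a common Western European pairing. The same bytes are decoded once with each code page:

```python
# Assumed code pages (not stated in the question): cp1252 as the Windows
# system ("ANSI") code page and cp850 as the console ("OEM") code page.
SYSTEM_CODE_PAGE = "cp1252"
CONSOLE_CODE_PAGE = "cp850"

message = "Unicode:æøåÆØÅ"

# Bytes as a program would write them using the system code page.
raw = message.encode(SYSTEM_CODE_PAGE)

# What a console interpreting those bytes with a different code page shows.
garbled = raw.decode(CONSOLE_CODE_PAGE)

# Decoding the same bytes with the matching code page recovers the text,
# which is what happens when the redirected file is opened in an editor.
restored = raw.decode(SYSTEM_CODE_PAGE)

print(repr(garbled))
print(restored == message)
```

The ASCII prefix survives either way; only the non-ASCII characters differ, which matches the symptom of a log that looks fine in a file but not on the console.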

Some work is under way along these lines, but it is not directly related to this issue.

[Edit: I missed the fact that you are on Windows]

See the last paragraph on how to deal with character set compatibility issues: https://www.mercurial-scm.org/wiki/CharacterEncodingOnWindows

It says:

  • set the console code page according to your system code page
  • override Mercurial encoding with environment variable
    • Setting HGENCODING will override the detected system character set.
  • override Mercurial encoding using command line option
    • Using the global option --encoding will allow you to set the preferred encoding for each command.
  • use GUI tools to interact with Mercurial
    • This also solves the problem by avoiding the troublesome console entirely.
  • use Linux / UNIX and UTF-8
    • It makes Bill Gates cry.
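
For a wrapper library, the HGENCODING and --encoding options above could be applied roughly as follows. This is only a sketch: the helper names (hg_environment, hg_log_command, run_hg_log) are hypothetical, utf-8 is an assumed choice of encoding, and it assumes hg is on the PATH. Normally you would use one of the two mechanisms, not both.

```python
import os
import subprocess


def hg_environment(encoding="utf-8", base_env=None):
    """Return a copy of the environment with HGENCODING set (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["HGENCODING"] = encoding
    return env


def hg_log_command(encoding=None):
    """Build the argument list for `hg log`, optionally with --encoding."""
    cmd = ["hg"]
    if encoding is not None:
        # Global options such as --encoding may precede the command name.
        cmd += ["--encoding", encoding]
    cmd.append("log")
    return cmd


def run_hg_log(repo_path, encoding="utf-8"):
    """Run `hg log` in repo_path and decode its output with the same encoding."""
    result = subprocess.run(
        hg_log_command(),
        cwd=repo_path,
        env=hg_environment(encoding),
        stdout=subprocess.PIPE,
        check=True,
    )
    # Capturing raw bytes and decoding them here sidesteps the console code page.
    return result.stdout.decode(encoding)
```

Reading the output as bytes and decoding it in the wrapper keeps the console out of the picture, which is in the same spirit as the "use GUI tools" suggestion.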