Writing text with diacritics ("nikud", vocalization marks) using PIL (Python Imaging Library)

Question

Writing plain text on an image using PIL is easy.

    draw = ImageDraw.Draw(img)
    draw.text((10, y), text2, font=font, fill=forecolor)

However, when I try to write Hebrew vocalization marks (called "nikud" or ניקוד), the characters do not overlap as they should. (I would guess that this question is also relevant to Arabic and other similar languages.)

In an environment that supports them, these two words occupy the same space / width (the example below depends on your system, hence the image):

סֶפֶר ספר

However, when drawing text using PIL, I get:

ס ֶ פ ֶ ר

since the library probably does not obey kerning rules (?).

Is it possible to make a Hebrew letter and its nikud mark occupy the same space / width without positioning each character manually?
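To make the problem concrete, here is a small sketch (standard library only) that lists the code points in the word above. It only shows that each nikud mark is a separate nonspacing combining character that a renderer has to place on its base letter; it is not a fix. The helper names `describe` and `strip_marks` are made up for this illustration.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode category, name) for each character."""
    return [
        ("U+%04X" % ord(ch), unicodedata.category(ch), unicodedata.name(ch, "?"))
        for ch in text
    ]


def strip_marks(text):
    """Drop nonspacing marks (category Mn), leaving only the base letters."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# The vocalized word from the question, written with escapes: samekh, segol, pe, segol, resh.
word = "\u05e1\u05b6\u05e4\u05b6\u05e8"

for row in describe(word):
    print(row)

# The vocalized word has five code points; three base letters remain without the marks.
print(len(word), len(strip_marks(word)))
```

A renderer that gives every code point its own advance width will spread the marks out between the letters, which is what the output above looks like.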

image - nikud and the distance between the letters http://tinypic.com/r/jglhc5/5


+7

python fonts unicode python-imaging-library hebrew

Berry tsakala Jun 14 '09 at 17:28


4 answers

Regarding Arabic diacritics: Python + Wand + arabic_reshaper + bidi.algorithm (all Python libraries). The same applies to PIL / Pillow: use arabic_reshaper and bidi.algorithm, then pass the resulting text to draw.text((10, 25), artext, font=font):

    from wand.image import Image as wImage
    from wand.display import display as wdiplay
    from wand.drawing import Drawing
    from wand.color import Color
    import arabic_reshaper
    from bidi.algorithm import get_display

    # Reshape the Arabic letters into their joined forms, then reorder for display.
    reshaped_text = arabic_reshaper.reshape(u'لغةٌ عربيّة')
    artext = get_display(reshaped_text)

    fonts = ['C:\\Users\\PATH\\TO\\FONT\\Thabit-0.02\\DroidNaskh-Bold.ttf',
             'C:\\Users\\PATH\\TO\\FONT\\Thabit-0.02\\Thabit.ttf',
             'C:\\Users\\PATH\\TO\\FONT\\Thabit-0.02\\Thabit-Bold-Oblique.ttf',
             'C:\\Users\\PATH\\TO\\FONT\\Thabit-0.02\\Thabit-Bold.ttf',
             'C:\\Users\\PATH\\TO\\FONT\\Thabit-0.02\\Thabit-Oblique.ttf',
             'C:\\Users\\PATH\\TO\\FONT\\Thabit-0.02\\majalla.ttf',
             'C:\\Users\\PATH\\TO\\FONT\\Thabit-0.02\\majallab.ttf',
             ]

    draw = Drawing()
    img = wImage(width=1200, height=(len(fonts)+2)*60, background=Color('#ffffff'))
    #draw.fill_color(Color('#000000'))
    draw.text_alignment = 'right'
    draw.text_antialias = True
    draw.text_encoding = 'utf-8'
    #draw.text_interline_spacing = 1
    #draw.text_interword_spacing = 15.0
    draw.text_kerning = 0.0

    # Draw the same text once per font, one row per font.
    for i in range(len(fonts)):
        font = fonts[i]
        draw.font = font
        draw.font_size = 40
        # Integer division keeps x an int on Python 3 as well.
        draw.text(img.width // 2, 40+(i*60), artext)
        print(draw.get_font_metrics(img, artext))
        draw(img)

    # One extra row with mixed Arabic and Latin text.
    draw.text(img.width // 2, 40+((i+1)*60), u'ناصر test')
    draw(img)
    img.save(filename='C:\\PATH\\OUTPUT\\arabictest.png')
    wdiplay(img)

+8

Nasser al-wohaibi Sep 08 '14 at 14:59


What system are you working on? This works for me on my Gentoo system; the order of the letters is reversed (I just copied the text from your question), which seems right to me, although I know little about RTL languages.

    Python 2.5.4 (r254:67916, May 31 2009, 16:56:01)
    [GCC 4.3.3] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import Image as I, ImageFont as IF, ImageDraw as ID
    >>> t= u"סֶפֶר ספר"
    >>> t
    u'\u05e1\u05b6\u05e4\u05b6\u05e8 \u05e1\u05e4\u05e8'
    >>> i= I.new("L", (200, 200))
    >>> d= ID.Draw(i)
    >>> f= IF.truetype("/usr/share/fonts/dejavu/DejaVuSans.ttf", 20)
    >>> d.text( (100, 40), t, fill=255, font=f)
    >>> i.save("/tmp/dummy.png", optimize=1)

gives:

sample text done in white on black http://i39.tinypic.com/2j9jxf.png

EDIT: I should say that the choice of the DejaVu Sans font is not accidental; although I don't really like it (and yet I find its glyphs better than Arial's), it is readable, it has extensive Unicode coverage, and it seems to work better with many non-MS applications than Arial Unicode MS.

+2

tzot Jun 19 '09 at 21:26


It seems to me that the matter is quite simple: you can use TrueType fonts.

Here's an example: True Type Fonts for PIL

Here you can find Hebrew TrueType fonts: Hebrew true fonts

Good luck, or as we say in Hebrew, Mazal Tov.

0

Roman kagan Jun 15 '09 at 0:10


Berry tsakala · Accepted Answer · 2014-09-09T18:22:34+0000

Funny: after 5 years, and with a lot of help from @Nasser Al-Wohaibi, I figured out how to do this.

The text needs to be reordered using the BiDi algorithm:

    # -*- coding: utf-8 -*-
    from bidi.algorithm import get_display
    import PIL.Image, PIL.ImageFont, PIL.ImageDraw

    img = PIL.Image.new("L", (400, 200))
    draw = PIL.ImageDraw.Draw(img)
    font = PIL.ImageFont.truetype(r"c:\windows\fonts\arial.ttf", 30)

    t1 = u'סֶפֶר ספר!'
    draw.text((10, 10), 'before BiDi :' + t1, fill=255, font=font)

    t2 = get_display(t1)  # <--- here the magic <---
    draw.text((10, 50), 'after BiDi: ' + t2, fill=220, font=font)

    img.save('bidi-test.png')

@Nasser's answer has additional value that probably applies only to Arabic text (Arabic letters change shape and connect depending on their neighboring letters, whereas in Hebrew all letters are separate), so only the bidi part was relevant for this question.

In the resulting image, the 2nd row shows the correct shape and the correct positioning of the vocalization marks.
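The difference between the two scripts can be seen in the Unicode character database itself. The sketch below (standard library only, Python 3) looks up the positional presentation forms that Unicode defines for a base letter. It is only an illustration of why Arabic needs a reshaping step; it is not a description of how arabic_reshaper works internally, and the helper name `positional_forms` is made up for this example.

```python
import unicodedata

POSITIONS = ("isolated", "initial", "medial", "final")


def positional_forms(letter):
    """Map position name -> presentation-form character for one base letter."""
    target = "%04X" % ord(letter)
    forms = {}
    for cp in range(0x10000):  # scan the Basic Multilingual Plane only
        parts = unicodedata.decomposition(chr(cp)).split()
        # Compatibility decompositions of positional forms look like "<initial> 0628".
        if len(parts) == 2 and parts[1] == target:
            tag = parts[0].strip("<>")
            if tag in POSITIONS:
                forms[tag] = chr(cp)
    return forms


beh = positional_forms("\u0628")     # ARABIC LETTER BEH
samekh = positional_forms("\u05e1")  # HEBREW LETTER SAMEKH

print(sorted(beh))  # position names that have a dedicated form for BEH
print(samekh)       # no positional forms are defined for SAMEKH
```

The Arabic letter has a separate glyph form for each position in a word, so something has to pick the right one before drawing; the Hebrew letter has none, which is consistent with only the bidi step being needed here.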

Thanks to @tzot for help + code snippets

A propos:

Samples of how different fonts handle the Hebrew "nikud". Not all fonts behave the same:
