Developing Arabic fonts
Current revision as of 02:06, 26 January 2017
Arabic is considered a complex script, in contrast to simple scripts such as Latin and Cyrillic. Rendering Arabic text correctly requires more sophisticated font technologies, the so-called 'smart fonts', such as OpenType, Apple's AAT and SIL's Graphite.
Features of Arabic script
Arabic is a cursive script written from right to left. With a few exceptions, each letter has at least four basic forms (initial, medial, final and isolated), each of which may take different shapes depending on the context and the calligraphic style.
In OpenType fonts, lookup tables define the various forms of each character. The initial, medial and final forms are basically handled by simple substitution tables.
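As a rough illustration of what such simple substitution tables contain, the sketch below generates OpenType feature-file text (Adobe feature file syntax) for the 'init', 'medi' and 'fina' features. The glyph naming scheme (uni0628.init and so on) and the helper names are assumptions made for this example; the tutorial itself does not prescribe them.

```python
# Sketch: emit feature-file text for simple (single) substitutions.
# Glyph names such as "uni0628.init" are a hypothetical naming convention.

def single_sub_feature(tag, base_glyphs):
    """Return feature text mapping each base glyph to its '<name>.<tag>' variant."""
    lines = [f"feature {tag} {{"]
    for name in base_glyphs:
        lines.append(f"    sub {name} by {name}.{tag};")
    lines.append(f"}} {tag};")
    return "\n".join(lines)


def arabic_forms_fea(dual_joining, right_joining):
    """Dual-joining letters get init/medi/fina; right-joining letters only fina."""
    parts = [
        "languagesystem DFLT dflt;",
        "languagesystem arab dflt;",
        single_sub_feature("init", dual_joining),
        single_sub_feature("medi", dual_joining),
        single_sub_feature("fina", dual_joining + right_joining),
    ]
    return "\n\n".join(parts)


# Example: beh and lam are dual-joining, alef is right-joining.
fea_text = arabic_forms_fea(["uni0628", "uni0644"], ["uni0627"])
```

Text produced this way could be saved to a file and imported into a font editor that reads feature files, instead of filling in each lookup by hand.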
Another feature is ligatures (though the name is misleading, since these ligatures are mandatory in Arabic, unlike Latin ligatures). Basically, a ligature is a special glyph composed of two or more glyphs. Take lam-alef as an example: it is a glyph composed of lam (in its initial or medial form) and alef (in its final form). However, if we look carefully, we see that lam-alef is no exception; it is just a special form that lam takes only when followed by alef, and likewise a special form of alef. In OpenType, lam-alef and the like can be handled either with ligatures or with the more complex contextual substitution.
Another feature of the Arabic script is the extensive use of diacritic marks, known as Harakat or Tashkil. These small vocalization marks sit above or below base glyphs, and a well-designed font must provide attachment points that define the spatial relation between each diacritic mark and its base glyph. Since Arabic diacritics may be stacked on top of each other over a single base glyph, mark-to-mark relations must be addressed as well.
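The arithmetic behind attachment points is simple, and a small sketch may make it concrete: a mark is shifted so that its own anchor lands on the anchor of whatever it attaches to. All coordinates and the anchor names ('attach', 'top') below are invented for illustration; real values come from the glyph designs.

```python
# Sketch of anchor-based mark attachment; all coordinates are made-up font units.

def attach(target_anchor, mark_anchor):
    """Offset that moves a mark so its anchor lands on the target anchor."""
    tx, ty = target_anchor
    mx, my = mark_anchor
    return (tx - mx, ty - my)


def stack_marks(base_anchor, marks):
    """Place several marks over one base glyph.

    The first mark attaches to the base (mark-to-base); each later mark
    attaches to the previous mark (mark-to-mark). Every mark is a dict with
    'attach' (its own attachment anchor) and 'top' (the anchor it offers to
    the next mark), both in the mark's own coordinate space.
    Returns the offset applied to each mark.
    """
    offsets = []
    target = base_anchor
    for mark in marks:
        dx, dy = attach(target, mark["attach"])
        offsets.append((dx, dy))
        # The next mark attaches to this mark's 'top' anchor after shifting.
        target = (mark["top"][0] + dx, mark["top"][1] + dy)
    return offsets


shadda = {"attach": (100, 0), "top": (100, 250)}
fatha = {"attach": (90, 0), "top": (90, 200)}
offsets = stack_marks((300, 700), [shadda, fatha])
```

In an OpenType font the same relations are stored in the GPOS table as mark-to-base and mark-to-mark lookups, so the layout engine performs this calculation at rendering time.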
Arabic in Unicode
Unicode defines two kinds of Arabic code points. The first kind is the actual characters: the Arabic block, U+0600 - U+06FF, which contains almost all Arabic characters, and the Arabic Supplement block, U+0750 - U+077F, which contains some extra characters. The second kind is Arabic Presentation Forms A and B, U+FB50 - U+FDFF and U+FE70 - U+FEFF, which encode the different forms of Arabic letters and some ligatures. The presentation forms are included for compatibility with legacy encodings and should never be used to encode text directly. They can be used when defining substitution tables, but that is not mandatory: you can add the contextual forms to your font as unencoded glyphs and tell the OpenType engine which glyph to use through the lookup tables. Note that merely including the presentation forms in your font does not mean they will be used; you must still add the lookup tables.
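The block ranges above can be turned into a small check, for instance to spot text that was wrongly encoded with presentation forms. This is only a sketch; the function names are made up for the example.

```python
# Sketch: classify a character by the Arabic blocks listed above.

ARABIC_BLOCKS = [
    (0x0600, 0x06FF, "Arabic"),
    (0x0750, 0x077F, "Arabic Supplement"),
    (0xFB50, 0xFDFF, "Arabic Presentation Forms-A"),
    (0xFE70, 0xFEFF, "Arabic Presentation Forms-B"),
]


def arabic_block(ch):
    """Return the name of the Arabic block containing ch, or None."""
    cp = ord(ch)
    for start, end, name in ARABIC_BLOCKS:
        if start <= cp <= end:
            return name
    return None


def uses_presentation_forms(text):
    """True if any character of text lies in a presentation forms block."""
    return any(
        (arabic_block(ch) or "").startswith("Arabic Presentation")
        for ch in text
    )
```

For example, uses_presentation_forms applied to lam followed by alef (U+0644 U+0627) is false, while the single code point U+FEFB (the isolated lam-alef ligature) makes it true.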
Tools
Throughout this tutorial we will mainly use FontForge as the font editor. FontForge can edit both glyph outlines and OpenType tables, and it is scriptable (in either Python or its legacy scripting language).
First start
Start FontForge with no arguments and it will ask for a font to open; just choose New. Alternatively, pass the -new option on the command line. FontForge does not encode a new font as Unicode by default, so go to Encoding -> Reencode and choose "ISO 10646-1 (Unicode, BMP)"; this makes the Arabic Unicode code points available for editing.
Basically, for a font to support the Arabic language, it should cover U+0621 - U+0652 (the Arabic letters and the basic diacritic marks) and U+0660 - U+0669 (the Arabic digits, which Unicode calls "Arabic-Indic digits"). Refer to the Unicode charts for the details of each glyph.
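A coverage check along these lines can be sketched as follows. The ranges are taken literally from the paragraph above; whether every code point inside them is needed for a given language is left open, and the function name is an assumption for this example.

```python
# Sketch: list the code points of the basic Arabic ranges that a font lacks.

REQUIRED_RANGES = [
    (0x0621, 0x0652),  # Arabic letters and basic diacritic marks
    (0x0660, 0x0669),  # Arabic-Indic digits
]


def missing_code_points(encoded):
    """encoded: iterable of integer code points the font has glyphs for."""
    have = set(encoded)
    return [
        cp
        for start, end in REQUIRED_RANGES
        for cp in range(start, end + 1)
        if cp not in have
    ]


# Example: a font that has the letters and marks but no digits yet.
todo = missing_code_points(range(0x0621, 0x0653))
```

In a FontForge Python session the input could presumably be collected from the open font, for example from the unicode attribute of each glyph returned by font.glyphs(), but that part is not exercised here.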
Contextual forms
Arabic characters are either dual-joining, right-joining, non-joining or transparent. Dual-joining characters have initial, medial, final and isolated forms; right-joining characters have neither an initial nor a medial form, only final and isolated ones; the other characters have only one form.
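To show how the joining types determine the form of each character, here is a minimal sketch of the selection logic. The joining-type table is a tiny hand-written sample rather than the full Unicode data, and other joining classes (such as tatweel or the zero width joiner) are deliberately left out.

```python
# Sketch: pick the contextual form of each character from its joining type.
# D = dual-joining, R = right-joining, U = non-joining, T = transparent.

JOINING = {
    "\u0628": "D",  # beh
    "\u062A": "D",  # teh
    "\u0643": "D",  # kaf
    "\u0644": "D",  # lam
    "\u0627": "R",  # alef
    "\u062F": "R",  # dal
    "\u0621": "U",  # hamza
    "\u064E": "T",  # fatha (a diacritic mark)
}


def contextual_forms(text):
    """Return one form per character, in logical (typing) order.

    Forms are 'isol', 'init', 'medi' or 'fina'; transparent characters
    get None because they do not take part in joining.
    """
    types = [JOINING.get(ch, "U") for ch in text]
    forms = []
    for i, t in enumerate(types):
        if t == "T":
            forms.append(None)
            continue
        # Nearest neighbours, skipping transparent characters.
        prev_t = next((p for p in reversed(types[:i]) if p != "T"), "U")
        next_t = next((n for n in types[i + 1:] if n != "T"), "U")
        joins_prev = t in ("D", "R") and prev_t == "D"
        joins_next = t == "D" and next_t in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_prev:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms
```

For the word kaf, teh, alef, beh this yields the initial, medial, final and isolated forms in that order: the right-joining alef ends the joined run, so the following beh stands alone.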
We need to provide glyphs for each form of each character. For OpenType fonts it is not necessary to encode those glyphs as presentation forms; we can simply add them as unencoded glyphs, since they will be accessed through the GSUB (glyph substitution) table. However, if you want your font to be usable in legacy applications that rely on the Arabic presentation forms for rendering contextual forms, you might consider placing those glyphs at the presentation form code points.
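If the glyphs are also placed at the presentation form code points, it helps to know which base letter and which form each code point stands for. The Unicode character database records this as a compatibility decomposition, which Python's standard unicodedata module exposes. The sketch below reads it; the helper name is made up for this example.

```python
import unicodedata

# Sketch: map a presentation-form code point to its form and base letters,
# using the compatibility decomposition from the Unicode character database.

FORMS = ("isolated", "initial", "medial", "final")


def presentation_form_info(ch):
    """Return (form, base_string) for a presentation form, else None."""
    decomp = unicodedata.decomposition(ch)
    if not decomp.startswith("<"):
        return None
    tag, *bases = decomp.split()
    form = tag.strip("<>")
    if form not in FORMS:
        return None
    return form, "".join(chr(int(b, 16)) for b in bases)
```

For instance, U+FE91 maps to the initial form of beh (U+0628), and U+FEFB maps to the isolated form of the sequence lam, alef.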
Creating and populating tables
Special cases
Lam-alef
Hamza
Lam-Alef
In modern Arabic typography, lam-alef is considered the only mandatory ligature. There are basically two ways to support lam-alef in your font: ligature substitution and contextual substitution.
Ligature substitution
Almost all font designers use ligature substitution because of its simplicity; however, it is not always the best way. Many OpenType implementations do not support the ligature caret table, which gives us no control over where the cursor is placed between ligature components and makes editing text with many ligatures a rather unpleasant experience. Here we will need a simple ligature substitution table, 'liga', FIXME
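The effect of a ligature substitution on a run of glyphs can be simulated in a few lines. This is a sketch of the idea only, not of how any particular layout engine is implemented, and the glyph names (lam.init, lam_alef.isol and so on) are hypothetical.

```python
# Sketch: simulate ligature substitution on a list of glyph names.

def apply_ligatures(glyphs, ligatures):
    """Replace each run of component glyphs with its ligature glyph.

    ligatures maps a tuple of component names to a ligature name;
    longer component sequences are tried first.
    """
    rules = sorted(ligatures.items(), key=lambda kv: -len(kv[0]))
    out, i = [], 0
    while i < len(glyphs):
        for components, lig in rules:
            if tuple(glyphs[i:i + len(components)]) == components:
                out.append(lig)
                i += len(components)
                break
        else:
            # No rule matched at this position: keep the glyph as it is.
            out.append(glyphs[i])
            i += 1
    return out


LAM_ALEF = {
    ("lam.init", "alef.fina"): "lam_alef.isol",
    ("lam.medi", "alef.fina"): "lam_alef.fina",
}

shaped = apply_ligatures(["beh.init", "lam.medi", "alef.fina"], LAM_ALEF)
```

Note that three glyphs go in and two come out: lam and alef collapse into one glyph, which is exactly why the cursor position inside the ligature has to come from the ligature caret table.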
Contextual substitution
In my opinion, contextual substitution is superior to ligature substitution: it decreases the number of unnecessary glyphs in your font, avoids all the problems that arise from using ligatures, and brings your font closer to the spirit of Arabic calligraphy. However, it needs a careful analysis of Arabic calligraphy to properly define where each glyph starts and ends.
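For comparison with the ligature approach, the sketch below simulates the contextual alternative: lam and the following alef are each swapped for a special variant, and both glyphs stay in the run. The variant names (.prealef, alef.fina.postlam) are invented for this example; how the two shapes are actually drawn and where they meet is the design work described above.

```python
# Sketch: simulate contextual substitution for lam followed by alef.
# Glyph names are hypothetical; both glyphs are kept, only their shapes change.

LAM_FORMS = ("lam.init", "lam.medi")


def apply_lam_alef_context(glyphs):
    """Swap lam and a following final alef for their special variants."""
    out = list(glyphs)
    for i in range(len(out) - 1):
        if out[i] in LAM_FORMS and out[i + 1] == "alef.fina":
            out[i] = out[i] + ".prealef"
            out[i + 1] = "alef.fina.postlam"
    return out


shaped = apply_lam_alef_context(["beh.init", "lam.medi", "alef.fina"])
```

The glyph count is unchanged, so the cursor can still be placed between lam and alef without any ligature caret information.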