3. Setup Essentials

Before going any further, let us make sure the essentials for Arabic support are in place. The remainder of this document assumes you have read and followed the instructions in the following sections.

3.1. Configure Kernel

Compiling a kernel is outside the scope of this document, so we will not go into the details. Either check that your distribution's pre-compiled kernel was built with the following options (see your distribution's documentation) or compile your own kernel with them.

# Partition Types
CONFIG_NLS=y
# Native Language Support
CONFIG_NLS_DEFAULT="utf8"
CONFIG_NLS_CODEPAGE_864=y
CONFIG_NLS_ISO8859_6=y
CONFIG_NLS_UTF8=y
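
If you already have a kernel configuration file at hand, a quick way to see whether the options above are present is to grep for them. The following is only a minimal sketch: the function name check_nls_options is made up for this example, the location of the configuration file varies between distributions (the /boot path in the comment is an assumption), and an option built as a module (=m) is counted as enabled as well.

```shell
#!/bin/sh
# Report, for each NLS option listed above, whether it is enabled
# (built in with =y or built as a module with =m) in a kernel config file.
# Usage: check_nls_options /path/to/kernel/config
check_nls_options() {
    config_file=$1
    for opt in CONFIG_NLS CONFIG_NLS_CODEPAGE_864 \
               CONFIG_NLS_ISO8859_6 CONFIG_NLS_UTF8; do
        if grep -Eq "^${opt}=(y|m)\$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
        fi
    done
}

# Example call (assumption: your distribution installs the config here):
# check_nls_options "/boot/config-$(uname -r)"
```

Any option reported as missing means the kernel needs to be rebuilt, or the corresponding module installed, before continuing.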
    

3.2. Set Locales

NOTE: This section is incomplete and may contain errors. Please do NOT use it just yet.

There are several environment variables that need to be set in order for certain applications to handle Arabic the way you expect them to.

$ LC_CTYPE=ar_EG.UTF-8
$ CHARSET=ISO_8859-6
$ OUTPUT_CHARSET=UTF-8
$ LESSCHARSET='UTF-8'
$ LANG=ar_US
$ export LC_CTYPE CHARSET OUTPUT_CHARSET LESSCHARSET LANG
    

Please note that you can replace 'US' with the country code of wherever you are. So, if you are in Egypt, you would use 'ar_EG' (this applies to all of the above).

There are also other locale environment variables, such as LANGUAGE and LC_ALL. LC_ALL overrides all other LC_* variables, so you can simply set it to a UTF-8 locale.

$ export LC_ALL=ar_US.UTF-8 [1]
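
Because LC_ALL takes precedence over the individual LC_* variables, which in turn take precedence over LANG, it is easy to lose track of which setting is actually in effect. The sketch below spells out that lookup order for the character type category. The function name effective_ctype is hypothetical, and the final fallback to "C" is an assumption, since the default locale is up to the system.

```shell
#!/bin/sh
# Print the locale name that governs character handling (LC_CTYPE),
# using the usual lookup order: LC_ALL first, then LC_CTYPE, then LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        echo "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        echo "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        echo "$LANG"
    else
        echo "C"    # assumed system default when nothing is set
    fi
}

# Example call:
# effective_ctype
```

On most systems the 'locale' command (without arguments) reports the same information for every category at once.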
    

3.3. Install Libraries

There are three main elements in providing Arabic support:

  • Bidirectional support: the ability to render text in both directions, left-to-right (LTR) and right-to-left (RTL), when intermixing an RTL language (e.g. Arabic) with an LTR language (e.g. English).

  • Shaping/Joining support: the ability to shape Arabic script accordingly (up to four possible shapes per letter depending on position in word).

  • UTF-8 support: the ability to support the 8-bit Unicode Transformation Format.
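
To make the UTF-8 point a little more concrete: each Arabic letter occupies two bytes in UTF-8, so byte counts and character counts differ, which is exactly what trips up applications without UTF-8 support. The following is a small illustration only. The helper name utf8_byte_count is made up for this example, and the sample word is written as octal escapes so that the snippet does not depend on how this file itself is encoded.

```shell
#!/bin/sh
# Print the number of bytes a string occupies.
utf8_byte_count() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# The four-letter Arabic word "salam" (seen, lam, alef, meem),
# given as the octal values of its UTF-8 bytes.
word=$(printf '\330\263\331\204\330\247\331\205')

utf8_byte_count "$word"    # four letters, eight bytes
```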

3.3.1. Install FriBiDi

http://fribidi.sourceforge.net/

FriBiDi is perhaps the most popular and most important library to have in your arsenal. It currently supports reordering in compliance with Unicode TR#9 (the Unicode Bidirectional Algorithm). Applications such as mlterm and Pango use either the library or parts of it.

Unfortunately, as of the writing of this document, FriBiDi does not yet include shaping as part of the library.

3.4. Patch 'less-378'

http://old.arabeyes.org/project.php?proj=patches

A patch was submitted to the author of less for incorporation into the main source code. In the meantime, applying this patch fixes the screen width that a line of Arabic text takes up on your terminal.

Download the patch from here: http://old.arabeyes.org/download/download/external/less/less_composing.patch.tgz.

And the source for 'less' from here: http://www.greenwoodsoftware.com/less/less-378.tar.gz

$ tar zxvf less-378.tar.gz
$ cd less-378
$ tar zxvf less_composing.patch.tgz
$ patch -b -p0 < less_composing.patch
$ ./configure
$ make && make install
    

Now you should have a fully functional 'less' which displays Arabic text with the proper screen width!

3.5. Use Arabic Filenames

Although it is possible to have Arabic filenames, doing so is generally not advised, because most applications currently do not know how to deal with them.

3.5.1. Read Arabic Filenames

In order for your file manager to read Arabic filenames properly, you need to specify the character set to be used. This is done through the following environment setting, which you can either export (using bash) or add to your ~/.profile or ~/.bash_profile file.

$ export G_BROKEN_FILENAMES=1 [2]
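
If you choose to put the setting in your profile, it is worth making sure the line is not added twice when the step is repeated. The following is a minimal sketch of one way to do that. The helper name add_line_once is hypothetical, and ~/.profile is only one of the files mentioned above.

```shell
#!/bin/sh
# Append LINE to FILE unless an identical line is already present.
# Usage: add_line_once FILE LINE
add_line_once() {
    file=$1
    line=$2
    touch "$file"    # make sure the file exists before searching it
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example call:
# add_line_once "$HOME/.profile" 'export G_BROKEN_FILENAMES=1'
```

The setting takes effect at the next login, or immediately in the current shell if you also export it by hand.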
      

3.5.2. Write Arabic Filenames

There are (at least) two possible ways to name files using Arabic characters. The first is through an application's GUI file manager (e.g. dired within emacs). The second is through the more common command line, together with a UTF-8 enabled shell. The command-line method must, of course, be used in conjunction with an Arabic-enabled X terminal or terminal emulator (such as mlterm or PuTTY). The two shells that have been tested and used extensively are bash and tcsh.

bash version 3.0 and later requires no special setup or instructions, as it works flawlessly with Arabic UTF-8 filenames. Simply compile, install and use.

tcsh is somewhat pickier: its Arabic (and UTF-8) support depends on its compile options and environment. To find out which options tcsh was compiled with, inspect its special 'version' variable from within the shell.

$ tcsh
$ set | grep version
      

If 'wide' is listed as part of the options, then UTF-8 support is available and Arabic filenames are possible, provided a UTF-8 locale (e.g. ar_IQ.UTF-8) is in use. If 'dspm' is listed, then the following special variable setting is required.

$ set dspmbyte=utf8
      

If neither option is listed, tcsh is unusable for this function and bash should be seriously considered instead.
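
The decision described above can also be scripted. The sketch below takes the text of tcsh's 'version' variable and reports which of the three cases applies. The function name tcsh_utf8_support is made up for this example, and it assumes the compile options appear as a comma-separated list after the word "options", which is how tcsh normally prints them.

```shell
#!/bin/sh
# Classify a tcsh version string according to the rules above.
# Prints "wide", "dspm" or "none".
tcsh_utf8_support() {
    # Keep only the comma-separated list that follows the word "options",
    # and wrap it in commas so that whole option names can be matched.
    opts=",${1##*options },"
    case $opts in
        *,wide,*) echo wide ;;    # UTF-8 works with a UTF-8 locale
        *,dspm,*) echo dspm ;;    # needs: set dspmbyte=utf8
        *)        echo none ;;    # consider bash instead
    esac
}

# Example call (assumption: tcsh is installed and on your PATH):
# tcsh_utf8_support "$(tcsh -c 'echo $version')"
```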

3.5.3. Mount Windows Partition

In order for you to be able to read Arabic filenames from a Windows partition, you need to tell the mount command what character set to use. This is done as follows (assuming your Windows partition resides on /dev/hda3):

 
# mount -t auto /dev/hda3 /mnt/win/ -o iocharset=utf8 [3]
      

You can also make this permanent by adding a line such as the following to your /etc/fstab file.

/dev/hda3  /mnt/win vfat  defaults,iocharset=utf8  0 0
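
After editing /etc/fstab, you may want to confirm that the option really ended up on the right entry. The following is a minimal sketch that looks up a mount point in an fstab-style file and checks its option list. The function name fstab_has_option is hypothetical, and the mount point is the one used in the example above.

```shell
#!/bin/sh
# Succeed (exit status 0) if the fstab-style FILE has an entry for
# MOUNTPOINT whose option list (fourth field) contains OPTION.
# Usage: fstab_has_option FILE MOUNTPOINT OPTION
fstab_has_option() {
    awk -v mp="$2" -v opt="$3" '
        $1 !~ /^#/ && $2 == mp {
            n = split($4, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == opt) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example call:
# fstab_has_option /etc/fstab /mnt/win iocharset=utf8 && echo "option is set"
```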
      


[1] This is not recommended as it may have some unpredictable effects.

[2] With this variable set, GLib assumes that filenames are in the locale encoding rather than in UTF-8.

[3] Please note that using the 'iocharset' option is reported to cause inconsistencies. Use at your own risk!