First, I would like to thank doctoral student David Gelbart for the good advice he gave me about phonological studies. Currently, I am still searching for freely available details of Noam Chomsky's research.
- The main concept is an encoding format for the whole Arabic word, one that reflects the morphological structure of Arabic. We are focusing on this before we discuss implementing a system to convert existing text data into this semantic format.
- We need to implement a new format for letter representation. The simple way is to use a 5-bit representation for every letter, so the letters can be ordered alphabetically or in another convenient fashion. Unfortunately, the space needed for three letters is 2^15 = 32768 (the space calculated by adding the al-harakat short vowels to the 28 consonants). This is too large considering that there are only about 5,000 to 6,000 triliteral roots in Arabic.
- The best solution is to use Arabic phonemes: certain patterns and groupings of letters are possible in Arabic, while others are not. The letters jeem and qaaf, for example, cannot be grouped together. Such considerations also mean that we must deal with letter substitutions.
- Arabic grammar focuses on the existence of certain letters together, but not the reasons this happens (i.e., why, where, how, etc.).
The branch of linguistics that deals with rules and examines these reasons is called phonology.
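The 5-bit scheme and the grouping restriction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the letter codes (22, 3, 2) are hypothetical placeholders, not an agreed code table; the names `pack_root`, `unpack_root` and `has_forbidden_pair` are invented for this sketch; and the jeem/qaaf restriction is assumed here to apply anywhere within the same root, which the discussion does not specify.

```python
from itertools import combinations

BITS_PER_LETTER = 5                        # 5 bits give 32 codes per letter
CODE_SPACE = 2 ** (3 * BITS_PER_LETTER)    # 2^15 = 32768 three-letter codes
ROOT_COUNT_UPPER = 6000                    # upper figure quoted in the text

def pack_root(codes):
    """Pack three 5-bit letter codes into one 15-bit integer."""
    if len(codes) != 3:
        raise ValueError("expected exactly three letter codes")
    value = 0
    for code in codes:
        if not 0 <= code < (1 << BITS_PER_LETTER):
            raise ValueError("letter code must fit in 5 bits")
        value = (value << BITS_PER_LETTER) | code
    return value

def unpack_root(value):
    """Recover the three 5-bit letter codes from a packed value."""
    mask = (1 << BITS_PER_LETTER) - 1
    return [(value >> shift) & mask for shift in (10, 5, 0)]

# Only restriction named in the text; the "anywhere in the root" scope
# is an assumption made for this sketch.
FORBIDDEN_PAIRS = {frozenset({"jeem", "qaaf"})}

def has_forbidden_pair(letters):
    """True if any two letters of the root form a forbidden pair."""
    return any(frozenset(pair) in FORBIDDEN_PAIRS
               for pair in combinations(letters, 2))

packed = pack_root([22, 3, 2])             # hypothetical codes for a root
used_fraction = ROOT_COUNT_UPPER / CODE_SPACE
print(packed, unpack_root(packed), f"{used_fraction:.1%} of the space used")
```

The printed fraction shows how little of the 15-bit space real roots would occupy, which is the motivation for encoding restrictions such as the jeem/qaaf one directly in the format instead of treating every letter triple as valid.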
- Using the principles of phonology, I think we can apply these rules and make standards.
The idea is to use phonological analysis to represent the letter as a set of vocal characteristics, which leads to all the possible phonemes in a language. (Tajweed touches on this idea: a letter either has such characteristics or it does not.) Once we understand these phonological rules, we can make algorithms and design the format to include them. Hopefully, we can also discover unknown, nonstandard rules.
- If we incorporate all of this, we should reach the final format, which reflects the morphological and phonological structure of Arabic in a digital relational architecture.
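As a rough sketch of what "a letter as a set of vocal characteristics" could look like in bits, the following Python example stores each letter as a mask of binary features. The feature set (`VOICED`, `NASAL`, `CONTINUANT`, `EMPHATIC`), the bit positions and the handful of letters are assumptions chosen for illustration; they are not the Tarmeez layout and not a complete phonological analysis of Arabic.

```python
from enum import IntFlag

class Feature(IntFlag):
    """Hypothetical binary features; a set bit means '+', a clear bit '-'."""
    VOICED = 1
    NASAL = 2
    CONTINUANT = 4
    EMPHATIC = 8

# Illustrative feature assignments for a few letters only.
LETTERS = {
    "ba": Feature.VOICED,
    "ta": Feature(0),
    "meem": Feature.VOICED | Feature.NASAL,
    "seen": Feature.CONTINUANT,
    "saad": Feature.CONTINUANT | Feature.EMPHATIC,
}

def has_feature(letter, feature):
    """True if the letter carries the given feature."""
    return bool(LETTERS[letter] & feature)

def differing_features(a, b):
    """Features on which two letters disagree (bitwise XOR of the masks)."""
    return LETTERS[a] ^ LETTERS[b]

print(differing_features("seen", "saad"))
print(has_feature("meem", Feature.NASAL))
```

With this kind of layout, a phonological rule can be written as a test on feature bits instead of a list of individual letters, and two letters that differ in a single feature (here seen and saad) are one bit apart.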
- In 1968, Noam Chomsky and Morris Halle published The Sound Pattern of English (SPE), the basis for Generative Phonology. In this view,
phonological representations (surface forms) are structures whose phonetic part is a sequence of phonemes which are made up of distinctive features. ... The features describe aspects of articulation and perception, are from a universally fixed set, and have the binary values + or -. Ordered phonological rules govern how this phonological representation (also called underlying representation) is transformed into the actual pronunciation (also called surface form.)
- So Chomsky and Halle described phonemes as being made up of binary characteristics, which, if I understand correctly, is what you want to do also. Right?
- Have you tried looking for phonological studies of Arabic?
- It is not that easy to find phonology articles online.
I hope you will find something. Maybe you'll need to use a university library, in the end. The sci.lang Usenet group (http://groups.google.com/group/sci.lang/) might be of interest to you also.
- What we need is a guide to these studies that does not require deep knowledge of the International Phonetic Alphabet (IPA) or of voice synthesis research.
- Now I am about to rebuild the Tarmeez bit structure based on Mansour M. Alghamdi's work, Analysis, Synthesis and Perception of Voicing in Arabic.
- It would be great to have any possible guidance, or even another point of view.