Arabic transliteration

Several approaches and methods exist for the romanization of Arabic. They vary in how they address the inherent problems of rendering written and spoken Arabic in the Latin script. Examples of such problems are the symbols for Arabic phonemes that do not exist in English or other European languages; the means of representing the Arabic definite article, which is always spelled the same way in written Arabic but has numerous pronunciations in the spoken language depending on context; and the representation of short vowels (usually i u or e o, accounting for variations such as Muslim / Moslem or Mohammed / Muhammad / Mohamed).


Romanization is often termed "transliteration", but this is not technically correct. Transliteration is the direct representation of foreign letters using Latin symbols, while most systems for romanizing Arabic are actually transcription systems, which represent the sound of the language. As an example, the rendering munāẓarat al-ḥurūf al-ʻarabīyah of the Arabic مناظرة الحروف العربية is a transcription, indicating the pronunciation; an example transliteration would be mnaẓrḧ alḥrwf alʻrbyḧ.

Romanization standards and systems

This list is sorted chronologically. Bold face indicates column headlines as they appear in the table below.

  • IPA: International Phonetic Alphabet (1886)
  • BS 4280 (1968): Developed by the [2]
  • SATTS: One-to-one mapping to Latin Morse equivalents.
  • IGN System 1973 or Variant B of the Amended Beirut System, which conforms to French orthography and is preferred to Variant A in French-speaking countries such as those of the Maghreb and Lebanon [4]
  • DIN 31635 (1982): Developed by the Deutsches Institut für Normung (German Institute for Standardization).
  • ISO 233 (1984).
  • ArabTeX (since 1992): Its "native" input is 7-bit ASCII, and it "has been modelled closely after the transliteration standards ISO/R 233 and DIN 31635".
  • ISO 233-2 (1993). Simplified transliteration.
  • Hans Wehr transliteration (1994): A modification to DIN 31635.
  • Bikdash Transliteration [10].
  • SAS: Spanish Arabists School [11]
  • US Intelligence Community (2003). A simplified adaptation of ALA-LC romanization, created specifically to standardize report writing.
  • Arabic chat alphabet: Not a system; listed here merely for completeness. In some situations, such as online communication, users need a way to enter Arabic text only with the keys immediately available on a keyboard. As an ad hoc solution, such letters can be replaced with Arabic numerals of similar appearance.

A (non-normative) table comparing romanizations using DIN 31635, ISO 233, ISO/R 233, UN, ALA-LC, and Encyclopædia of Islam systems is available here: [12].

Comparison table

Letter Unicode Name IPA UNGEGN ALA-LC Wehr^1 DIN ISO SAS -2 BATR ArabTeX chat^2
ء^3 0621 hamzah ʔ ʼ [note 4] ʾ ˈˌ ʾ ' e ' 2
ا 0627 alif ā ʾ ā aa aa / A a a/e/é
ب 0628 bāʼ b b
ت 062A tāʼ t t
ث 062B thāʼ θ th ç c _t s/th
ج 062C jīm d͡ʒ~ɡ~ʒ j ǧ ŷ j j ^g j/g/dj
ح 062D ḥāʼ ħ H .h 7
خ 062E khāʼ x kh j x K _h kh/7'/5
د 062F dāl d d
ذ 0630 dhāl ð dh đ z' _d z/dh/th
ر 0631 rāʼ r r
ز 0632 zayn/zāy z z
س 0633 sīn s s
ش 0634 shīn ʃ sh š x ^s sh/ch
ص 0635 ṣād sˤ ş S .s s/9
ض 0636 ḍād dˤ D .d d/9'
ط 0637 ṭāʼ tˤ ţ T .t t/6
ظ 0638 ẓāʼ ðˤ~zˤ đ̣ Z .z z/dh/6'
ع 0639 ʻayn ʕ ʻ [note 4] ʿ ř E ` 3
غ 063A ghayn ɣ gh ġ g ğ g .g gh/3'
ف^5 0641 fāʼ f f
ق^5 0642 qāf q q 2/g/q/8
ك 0643 kāf k k
ل 0644 lām l l
م 0645 mīm m m
ن 0646 nūn n n
ه 0647 hāʼ h h
و 0648 wāw w, uː w w; ū w; o w; uu w w; o; ou/u/oo
ي^6 064A yāʼ j, iː y y; ī y; e y; ii y y; i/ee; ei/ai
آ 0622 alif maddah ʔaː ā ā, ʼā ʾā ʾâ ā 'aa eaa 'A 2a/aa
ة 0629 tāʼ marbūṭah a, at h; t —; t h; t —; t ŧ t' T a/e(h); et/at
ى^6 0649 alif maqṣūrah y á ā à aaa _A a
ال alif lām (var.) al- ʾal al- al-; ál- Al- al- el
  • ^1 Hans Wehr transliteration does not capitalize the first letter at the beginning of sentences nor in proper names.
  • ^2 The chat table is only a demonstration and is based on the spoken varieties which vary considerably from Literary Arabic on which the IPA table and the rest of the transliterations are based.
  • ^3 Review hamzah for its various forms.
  • ^4 The original standard symbols in these schemes for transliterating hamzah and ʻayn are the modifier letter apostrophe ⟨ʼ⟩ and the modifier letter turned comma ⟨ʻ⟩, respectively. However, it is common practice to use the right single quotation mark ⟨’⟩ and the left single quotation mark ⟨‘⟩ instead. In these romanizations, the glottal stop (hamzah) is not written word-initially.
  • ^5 Fāʼ and qāf are traditionally written in Northwestern Africa as ڢ‎ and ڧـ ـڧـ ـٯ‎, respectively; the latter's dot is added only initially or medially.
  • ^6 In Egypt, Sudan, and sometimes in other regions, the standard form for final yāʼ is only ى (without dots) in handwriting and print, for both final /-iː/ and final /-aː/. For the latter pronunciation, ى is called ألف لينة alif layyinah [ˈʔælef læjˈjenæ], 'flexible alif'.

Romanization issues

Any romanization system has to make a number of decisions which are dependent on its intended field of application.


One basic problem is that written Arabic is normally unvocalized; i.e., many of the vowels are not written out, and must be supplied by a reader familiar with the language. Hence unvocalized Arabic writing does not give a reader unfamiliar with the language sufficient information for accurate pronunciation. As a result, a pure transliteration, e.g., rendering قطر as qṭr, is meaningless to an untrained reader. For this reason, transcriptions are generally used that add vowels, e.g. qaṭar.
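The effect of omitting vowels can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the CONSONANTS table is a small hand-picked subset using DIN 31635-style symbols, and skeleton is a hypothetical helper name, not part of any standard or library.

```python
# Illustrative subset of a consonant table (DIN 31635-style symbols).
# Assumption: only these six letters are covered; a real table is much larger.
CONSONANTS = {
    "ق": "q",
    "ط": "ṭ",
    "ر": "r",
    "ك": "k",
    "ت": "t",
    "ب": "b",
}


def skeleton(word: str) -> str:
    """Transliterate letter by letter; unknown characters pass through unchanged."""
    return "".join(CONSONANTS.get(ch, ch) for ch in word)


print(skeleton("قطر"))  # qṭr
print(skeleton("كتب"))  # ktb
```

The skeleton ktb is compatible with several different words (for instance kataba, 'he wrote', and kutub, 'books'), which is why the reader has to supply the vowels.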

Transliteration vs. transcription

Most uses of romanization call for transcription rather than transliteration: instead of transliterating each written letter, they try to reproduce the sound of the words according to the orthography rules of the target language, e.g. Qaṭar. This applies equally to scientific and popular applications. A pure transliteration, for example, would need to omit vowels (e.g. qṭr), making the result difficult to interpret except for a subset of trained readers fluent in Arabic. Even if vowels are added, a transliteration system would still need to distinguish between multiple ways of spelling the same sound in the Arabic script, e.g. alif ا vs. alif maqṣūrah ى for the sound /aː/ ā, and the six different ways (ء إ أ آ ؤ ئ) of writing the glottal stop (hamza, usually transcribed ʼ). This sort of detail is needlessly confusing, except in a very few situations (e.g., typesetting text in the Arabic script).
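The many-to-one relation between written forms and a transcribed glottal stop can be made concrete with a short Python sketch. It assumes that a transcription writes every form as the single symbol ʼ (U+02BC), which is a simplification: alif maddah, for example, also implies a following long vowel. The names HAMZA_FORMS, COLLAPSE, and describe are hypothetical.

```python
import unicodedata

# The six written forms of the glottal stop, given as escapes so the
# code points are unambiguous: hamzah alone, then its five carrier forms.
HAMZA_FORMS = "\u0621\u0625\u0623\u0622\u0624\u0626"

# Assumption for this sketch: a transcription writes all of them as U+02BC.
TRANSCRIBED = "\u02bc"
COLLAPSE = str.maketrans({form: TRANSCRIBED for form in HAMZA_FORMS})


def describe(forms: str = HAMZA_FORMS) -> list:
    """Return (code point, Unicode name) pairs for each written form."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in forms]


for code_point, name in describe():
    print(code_point, name)

# Six distinct letters collapse to one symbol, so the mapping cannot be
# inverted without extra information.
before = len(set(HAMZA_FORMS))
after = len(set(HAMZA_FORMS.translate(COLLAPSE)))
print(before, "->", after)  # 6 -> 1
```

A transliteration that must be reversible would instead need six distinguishable outputs, one per written form.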

Most issues related to the romanization of Arabic are about transliterating vs. transcribing; others, about what should be romanized:

  • Some transliterations ignore assimilation of the definite article al- before the "sun letters", and may be easily misread by non-Arabic speakers. For instance, "the light" النور an-nūr would be more literally transliterated along the lines of alnūr. In the transcription an-nūr, a hyphen is added and the unpronounced /l/ removed for the convenience of the uninformed non-Arabic speaker, who would otherwise pronounce an /l/, perhaps not understanding that /n/ in nūr is geminated. Alternatively, if the shaddah is not transliterated (since it is strictly not a letter), a strictly literal transliteration would be alnūr, which presents similar problems for the uninformed non-Arabic speaker.
  • A transliteration should render the "closed tāʼ " (tāʼ marbūṭah, ة) faithfully. Many transcriptions render the sound /a/ as a or ah and t when it denotes /at/.
    • ISO 233 has a unique symbol, ẗ.
  • "Restricted alif" (alif maqṣūrah, ى) should be transliterated with an acute accent, á, differentiating it from regular alif ا, but it is transcribed in many schemes like alif, ā, when it stands for /aː/.
  • Nunation: as elsewhere, transliteration renders what is seen and transcription what is heard. In the Arabic script, nunation is written with diacritics rather than letters, or is omitted altogether.
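The treatment of the definite article before sun letters can be sketched as follows. This is a minimal Python illustration, assuming a DIN 31635-style notation in which every consonant is a single lowercase symbol; add_article and SUN_CONSONANTS are hypothetical names, and capitalized or digraph-based input is not handled.

```python
import unicodedata

# Romanized "sun letters", one symbol per consonant (assumed notation).
# NFC normalization keeps letters such as š as single code points.
SUN_CONSONANTS = set(unicodedata.normalize("NFC", "tṯdḏrzsšṣḍṭẓln"))


def add_article(noun: str, assimilate: bool = True) -> str:
    """Prefix the definite article to a romanized noun.

    With assimilate=True the l of the article is replaced by the noun's
    first consonant when that consonant is a sun letter (transcription).
    With assimilate=False the article is always written al-.
    """
    noun = unicodedata.normalize("NFC", noun)
    first = noun[:1]
    if assimilate and first in SUN_CONSONANTS:
        return f"a{first}-{noun}"
    return f"al-{noun}"


print(add_article("nūr"))                    # an-nūr
print(add_article("nūr", assimilate=False))  # al-nūr
print(add_article("qamar"))                  # al-qamar
```

The assimilate flag mirrors the choice a romanization system has to make: write what is heard (an-nūr) or keep the article constant as it is spelled.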

A transcription may reflect the language as spoken, for example rendering names as pronounced by the people of Baghdad (Baghdad Arabic), or the official standard (Literary Arabic) as spoken by a preacher in the mosque or a TV news reader. A transcription is free to add phonological (such as vowels) or morphological (such as word boundaries) information. Transcriptions will also vary depending on the writing conventions of the target language; compare English Omar Khayyam with German Omar Chajjam, both for عمر خيام /ʕumar xajjaːm/, [ˈʕomɑr xæjˈjæːm] (unvocalized ʿmr ḫyām, vocalized ʻUmar Khayyām).

A transliteration is ideally fully reversible: a machine should be able to transliterate it back into Arabic. A transliteration can be considered as flawed for any one of the following reasons:

  • A "loose" transliteration is ambiguous, rendering several Arabic phonemes with an identical transliteration. Likewise, digraphs for a single phoneme (such as dh gh kh sh th rather than ḏ ġ ḫ š ṯ) may be confused with two adjacent consonants. The ALA-LC romanization system resolves this problem by using the prime symbol ʹ to separate two consonants when they do not form a digraph;[2] for example, أَكْرَمَتْها akramatʹhā ('she honored her'), in which the t and h are two distinct consonantal sounds.
  • Symbols representing phonemes may be considered too similar (e.g., ` and ' or ʿ and ʾ for ع ʻayn and hamzah);
  • ASCII transliterations using capital letters to disambiguate phonemes are easy to type, but may be considered unaesthetic.
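The digraph ambiguity in the first point, and the way a separator removes it, can be sketched in Python. The sketch assumes the five digraphs named above and uses U+02B9 (modifier letter prime) as the separator; split_units is a hypothetical helper, not an implementation of ALA-LC.

```python
DIGRAPHS = {"dh", "gh", "kh", "sh", "th"}
PRIME = "\u02b9"  # MODIFIER LETTER PRIME, used here as the separator


def split_units(text: str) -> list:
    """Split romanized text into units, reading digraphs greedily.

    A prime between two letters is consumed and prevents them from
    being read together as a digraph.
    """
    units = []
    i = 0
    while i < len(text):
        if text[i] == PRIME:
            i += 1  # separator: skip it, emit nothing
            continue
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            units.append(pair)
            i += 2
        else:
            units.append(text[i])
            i += 1
    return units


# Without the separator, t + h is read as the single unit "th".
print(split_units("akramathā"))
# With the separator, t and h remain two distinct consonants.
print(split_units("akramat" + PRIME + "hā"))
```

A machine converting the text back into Arabic script would map "th" to one letter in the first case and "t", "h" to two letters in the second, which is the distinction the prime is meant to preserve.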

A fully accurate transcription may not be necessary for native Arabic speakers, as they would be able to pronounce names and sentences correctly anyway, but it can be very useful for those who are familiar with the Roman alphabet but not fully familiar with spoken Arabic. An accurate transliteration serves as a valuable stepping stone for learning, pronouncing correctly, and distinguishing phonemes. It is a useful tool for anyone who is familiar with the sounds of Arabic but not fully conversant in the language.

One criticism is that a fully accurate system requires special training that most readers lack, so that names will not be pronounced correctly by non-native speakers anyway, especially in the absence of a universal romanization system. The precision is lost if special characters are not reproduced or if the reader is not familiar with Arabic pronunciation.


Examples in Literary Arabic:

Arabic خليفة كان له قصر إلى المملكة المغربية
Arabic with diacritics (normally omitted) خَلِيفَة كَانَ لَهُ قَصْر إِلَى الْمَمْلَكَة الْمَغْرِبِيَّة
IPA /xaliːfa kaːna lahu qasˤr/ /ʔila l mamlaka al maɣribijja/
DIN 31635 Ḫalīfah kāna lahu qaṣr ʾIlā l-mamlakah al-Maġribiyyah
Hans Wehr ḵalīfa kān lahu qaṣr ilā l-mamlaka al-maḡribīya
ALA-LC Khalīfah kāna lahu qaṣr Ilá al-mamlakah al-Maghribīyah
UNGEGN Khalyfah kana lahu qaşr Ily al-mamlakah al-maghribiyyah
BATR Kaliifat' kaana lahu qaSr ilaaa almamlakat' almagribiyyat'
ArabTeX _halyfaT kAna lahu qa.sr il_A almamlakaT alma.gribiyyaT
English Khalifa had a palace To the kingdom of Morocco
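
The relation between the first two rows of the table (Arabic with and without diacritics) can be sketched in Python. The sketch assumes that the marks to drop are exactly the Unicode range U+064B to U+0652, which runs from fathatan to sukun and includes shaddah; strip_diacritics is a hypothetical helper name.

```python
import re

# Arabic vowel and related marks: fathatan (U+064B) through sukun (U+0652).
# The range also covers shaddah (U+0651).
DIACRITICS = re.compile("[\u064b-\u0652]")


def strip_diacritics(text: str) -> str:
    """Remove short-vowel marks, leaving the normally written form."""
    return DIACRITICS.sub("", text)


# The word for "palace" from the table, vocalized: qaf, fathah, sad, sukun, ra.
vocalized = "\u0642\u064e\u0635\u0652\u0631"
plain = "\u0642\u0635\u0631"
print(strip_diacritics(vocalized) == plain)  # True
```

Going in the other direction is not mechanical: the marks carry the vowel information that a transcription such as qaṣr has to supply.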

A Google Chrome extension exists to romanize Arabic webpages.[3]


