Khmer language
Pronunciation: [pʰiːəsaː kʰmaːe]
Spoken in: Cambodia, Vietnam, Thailand, the People's Republic of China, USA, France, Australia
Total speakers: 15.7 to 21.6 million (2004)
  • Native speakers: 14.7 to 20.6 million
    • Cambodia: 12.1 million
    • Vietnam: 1,055,174[1]
    • Thailand: 1.2 million
    • USA: 190,000
    • France: ca. 50,000
    • Australia: 22,000
    • Canada: 16,500
  • 2nd language speakers: 1 million in Cambodia
Language family: Austro-Asiatic
  Eastern Mon-Khmer
Writing system: Khmer script (abugida)
Official status
Official language of: Cambodia
Regulated by: no official regulation
Language codes
ISO 639-1: km
ISO 639-2: khm
ISO 639-3: either:
khm — Central Khmer
kxm — Northern Khmer

Khmer (ភាសាខ្មែរ), or Cambodian, is the language of the Khmer people and the official language of Cambodia. One of the more prominent Austroasiatic languages, it has been considerably influenced by Sanskrit and Pali, especially in the royal and religious registers, through the vehicles of Hinduism and Buddhism. As a result of geographic proximity, Khmer has affected, and been affected by, Thai, Lao, Vietnamese and Cham, which together form a sprachbund in peninsular Southeast Asia.[2]

Khmer differs from neighboring languages such as Thai, Lao and Vietnamese in that it is not a tonal language. Its main dialects, which are mutually intelligible, are:

  • Battambang (considered the standard)
  • Phnom Penh
  • Northern Khmer, also known as Khmer Surin, spoken by ethnic Khmer native to Northeast Thailand
  • Cardamom Khmer, an archaic form spoken by a small population in the Cardamom Mountains of western Cambodia.[3]



History

Linguistic study of the Khmer language divides its history into four periods.[4] Pre-Angkorian Khmer, the language from its divergence from Proto-Mon-Khmer until the ninth century, is known only from words and phrases in Sanskrit texts of the era. Old Khmer (or Angkorian Khmer) is the language as it was spoken in the Khmer Empire from the 9th century until the weakening of the empire sometime in the 13th century. Old Khmer is attested by many primary sources and has been studied in depth by a few scholars, most notably Saveros Pou, Phillip Jenner and Heinz-Jürgen Pinnow. Following the end of the Khmer Empire, the language lost the standardizing influence of being the language of government and accordingly underwent a turbulent period of change in morphology, phonology and lexicon. The language of this transition period, from about the 14th to the 18th centuries, is referred to as Middle Khmer and saw borrowing from Thai, Lao and, to a lesser extent, Vietnamese. The changes during this period were so profound that the rules of Modern Khmer cannot be applied to correctly understand Old Khmer. The language became recognizable as the Modern Khmer spoken today in the 19th century.[4]

Classification

Khmer is classified as a member of the Eastern branch of the Mon-Khmer language family, itself a subdivision of the larger Austro-Asiatic language group, which has representatives across a large swath of land from Northeast India down through Southeast Asia to the Malay Peninsula and its islands. As such, its closest relatives are the languages of the Pearic, Bahnaric, and Katuic families spoken by the hill tribes of the region.[5] The Vietic languages have also been classified as belonging to this family.


Phonology

As described by Huffman, modern standard Khmer has the following consonant and vowel phonemes.[6] The phonological system described here is the inventory of sounds of the spoken language, not how they are written in the Khmer alphabet.


                     Labial  Coronal  Palatal  Velar  Glottal
Aspirated plosive    pʰ      tʰ       cʰ       kʰ
Unaspirated plosive  p       t        c        k      ʔ
Implosive            ɓ       ɗ
Nasal                m       n        ɲ        ŋ
Liquid                       r l
Fricative                    s                        h
Approximant          ʋ                j

The consonants /f/, /ʃ/, /z/ and /ɡ/ may occasionally occur in recent loanwords, for example from French. They do not appear in the chart above because they are not native Khmer consonants and do not occur in native Khmer words. These non-native sounds are heard only in the speech of those familiar with the originating language and have no dedicated symbol in the Khmer script, although combinations of letters that would otherwise be unpronounceable are used to represent them when necessary. In the speech of those who are not bilingual, these sounds are approximated with native phonemes:

Foreign sound (IPA)   Khmer representation   Khmer approximation (IPA)
/ɡ/                   ហ្គ                     /k/
/ʃ/                   ហ្ស                     /s/
/f/                   ហ្វ                     /h/ or /pʰ/
/z/                   ហ្ស                     /s/
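
The substitutions in the table above can be sketched as a simple lookup. This is only an illustration: the names `APPROXIMATIONS` and `approximate` are hypothetical, the data are copied from the table, and passing through unlisted sounds unchanged is an assumption of the sketch.

```python
# Illustrative lookup built from the table above; all names are hypothetical.
APPROXIMATIONS = {
    "ɡ": ["k"],
    "ʃ": ["s"],
    "f": ["h", "pʰ"],
    "z": ["s"],
}


def approximate(phoneme: str) -> list[str]:
    """Return the native phoneme(s) that may replace a foreign sound.

    Sounds not listed in the table are returned unchanged (an assumption).
    """
    return APPROXIMATIONS.get(phoneme, [phoneme])


print(approximate("f"))  # ['h', 'pʰ']
print(approximate("m"))  # ['m']
```

A list is returned because the table gives two possible approximations for /f/.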

Vowel nuclei

Long vowels ɛː ɨː əː ɔː
Short vowels i e ɨ ə ɐ a u o
Long diphthongs ei ɐe ɨə əɨ ɐə ao ou ɔə
Short diphthongs eə̆ uə̆ oə̆

The precise number and the phonetic value of vowel nuclei vary from dialect to dialect.[4] Short and long vowels of equal quality are distinguished solely by duration.

Syllable structure

Khmer words are predominantly of one or two syllables. There are 85 possible clusters of two consonants at the beginning of syllables and two three-consonant clusters with phonetic alterations as shown below:

Second consonants: p ɓ t ɗ c k ʔ m n ɲ ŋ j l r s h ʋ

First consonant   Clusters
p                 pʰt-, pʰc-, pʰk-, pʰn-, pʰɲ-, pʰŋ-, pʰj-, pʰl-, pr-, ps-
t                 tʰp-, tʰk-, tʰm-, tʰn-, tʰŋ-, tʰj-, tʰl-, tr-, tʰʋ-
c                 cʰp-, cʰk-, cʰm-, cʰn-, cʰŋ-, cʰl-, cr-, cʰʋ-
k                 kʰp-, kʰt-, kʰc-, kʰm-, kʰn-, kʰɲ-, kʰj-, kʰl-, kr-, ks-, kʰʋ-
s                 sp-, st-, sk-, sm-, sn-, sl-, sr-
ʔ                 ʔʋ-
m                 mt-, mc-, mʰn-, mʰɲ-, ml-, mr-, ms-, mh-
l                 lp-, lk-, lm-, lh-

Syllables begin with one of these consonants or consonant clusters, followed by one of the vowel nuclei. When the vowel nucleus is short, a final consonant is required. The consonants /p/, /t/, /c/, /k/, /ʔ/, /m/, /n/, /ɲ/, /ŋ/, /l/, /h/, /j/ and /ʋ/ can occur in the syllable coda, where /h/ and /ʋ/ are realized as [ç] and [w] respectively. The most common word structure in Khmer is a full syllable as described above, preceded by an unstressed “minor” syllable of the shape CV-, CrV-, CVN- or CrVN- (where C is a consonant, V a vowel and N any nasal in the Khmer inventory). Words can also be made up of two full syllables. The vowel of a minor syllable is usually reduced to [ə] in conversation; in careful or formal speech, and on television and radio, it is always clearly articulated.
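
The coda rules in this paragraph can be expressed as a small check. The sketch below makes several assumptions: the syllable is already split into onset, nucleus and coda; only the "Short vowels" row of the vowel table is treated as short (short diphthongs are left out); and the legality of the onset cluster is not verified. The function name `is_well_formed` is hypothetical.

```python
# Codas permitted by the description above.
ALLOWED_CODAS = {"p", "t", "c", "k", "ʔ", "m", "n", "ɲ", "ŋ", "l", "h", "j", "ʋ"}
# Short vowels as listed in the vowel table (short diphthongs omitted here).
SHORT_VOWELS = {"i", "e", "ɨ", "ə", "ɐ", "a", "u", "o"}


def is_well_formed(onset: str, nucleus: str, coda: str = "") -> bool:
    """Check the coda rules stated in the text for one full syllable.

    The onset is accepted as given; cluster legality is not checked.
    """
    if not onset:
        return False  # syllables begin with a consonant or cluster
    if coda and coda not in ALLOWED_CODAS:
        return False  # e.g. /r/ is not among the listed codas
    if nucleus in SHORT_VOWELS and not coda:
        return False  # a short nucleus requires a final consonant
    return True


print(is_well_formed("kʰɲ", "o", "m"))  # True
print(is_well_formed("p", "a"))         # False
```

Keeping the two sets as plain data makes it easy to adjust the inventory for a different dialect or analysis.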

Words with three or more syllables exist, particularly those pertaining to science, the arts, and religion. These words are loanwords, usually derived from Pali, Sanskrit, or more recently, French.


Grammar

Main article: Khmer grammar

Khmer is generally a subject-verb-object (SVO) language with prepositions.[7] Although it is primarily an isolating language, lexical derivation by means of prefixes and infixes is common.[8] Adjectives, demonstratives and numerals follow the noun they modify:

ស្រីឡើនោះ /srəːj lʔɐː nuç/ (girl pretty that) = that pretty girl

Nouns have no grammatical gender or singular/plural distinction. Plurality can be marked by postnominal particles, by numerals, or by doubling the adjective, which can also serve to intensify the adjective:

ឆ្កែធំ /cʰkae tʰom/ (dog large) = large dog

ឆ្កែធំធំ /cʰkae tʰom tʰom/ (dog large large) = large dogs or a very large dog

ឆ្កែពីរ /cʰkae piː/ (dog two) = two dogs

Classifying particles for use between numerals and nouns exist but are not obligatory as they are in, for example, Thai. As is typical of most East Asian languages,[9] the verb does not inflect at all; tense and aspect can be shown by particles and adverbs or understood from context. Verbs are negated by putting /min/, /pum/ or /ʔɐt/ before them and /teː/ at the end of the sentence or clause.

ខ្ញុំជឿ /kʰɲom cɨə/ - I believe

ខ្ញុំមិនជឿទេ /kʰɲom min cɨə teː/ - I don't believe
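
The negation pattern shown in these two examples can be sketched in code. The sketch assumes a clause is given as a list of transcribed words with the position of the verb already known; the names `NEGATORS` and `negate` are hypothetical.

```python
# Preverbal negators named in the text.
NEGATORS = ("min", "pum", "ʔɐt")


def negate(words: list[str], verb_index: int, negator: str = "min") -> list[str]:
    """Insert a negator before the verb and append clause-final /teː/."""
    if negator not in NEGATORS:
        raise ValueError(f"unknown negator: {negator}")
    if not 0 <= verb_index < len(words):
        raise IndexError("verb_index is outside the clause")
    return words[:verb_index] + [negator] + words[verb_index:] + ["teː"]


print(" ".join(negate(["kʰɲom", "cɨə"], verb_index=1)))
# kʰɲom min cɨə teː
```

The output reproduces the transcription of "I don't believe" given above.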

Social registers

Khmer employs a system of registers in which the speaker must always be conscious of the social status of the person spoken to. The different registers, which include those used for common speech, polite speech, speaking to or about royals and speaking to or about monks, employ alternate verbs, names of body parts and pronouns. To foreigners the registers can appear to be separate languages; in fact, isolated villagers are often unsure how to speak with royals, and royals raised entirely within the court do not feel comfortable speaking the common register. Another result is that the pronominal system is complex and full of honorific variations.

As an example, the word for "to eat" used between intimates or in reference to animals is /siː/. Used in polite reference to commoners, it is /ɲam/. When used of those of higher social status, it is /pisa/ or /tɔtuəl tiən/. For monks the word is /cʰan/ and for royals, /saoj/.[2]


Numbers

Main article: Khmer numerals

The numbers[8] are:

0 សូន្យ (son) /soːu̯n/
1 មួយ (muŏy) /muːə̯j/
2 ពីរ (pi) /piː/
3 បី (bei) /ɓəj/
4 បួន (buŏn) /ɓuːə̯n/
5 ប្រាំ (prăm) /pram/
6 ប្រាំមួយ (prăm muŏy) /pram muːə̯j/
7 ប្រាំពីរ (prăm pi) /pram piː/ (also /pram pɨl/)
8 ប្រាំបី (prăm bei) /pram ɓəj/
9 ប្រាំបួន (prăm buŏn) /pram ɓuːə̯n/
10 ១០ ដប់ (dâp) /ɗɑp/
100 ១០០ មួយរយ (muŏy rôy) /muːə̯j rɔj/
1,000 ១០០០ មួយពាន់ (muŏy péan) /muːə̯j piːə̯n/
10,000 ១០០០០ មួយម៉ើន (muŏy mein) /muːə̯j məjn/
100,000 ១០០០០០ មួយសែន (muŏy sên) /muːə̯j saːe̯n/
1,000,000 ១០០០០០០ មួយលាន (muŏy léan) /muːə̯j liːə̯n/
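
The table shows two patterns that are easy to sketch in code: numbers are written with Khmer digits (as in ១០ for 10), and the words for 6 to 9 are "five" followed by the words for 1 to 4. The sketch below assumes the Unicode Khmer digits U+17E0 to U+17E9 and uses the romanized forms from the table; the names `to_khmer_digits` and `word_under_ten` are hypothetical.

```python
# Map ASCII digits 0-9 to the Unicode Khmer digits U+17E0..U+17E9.
KHMER_DIGITS = {ord(str(i)): chr(0x17E0 + i) for i in range(10)}

# Romanized forms copied from the table above.
BASE_WORDS = {0: "son", 1: "muŏy", 2: "pi", 3: "bei", 4: "buŏn", 5: "prăm"}


def to_khmer_digits(n: int) -> str:
    """Write a non-negative integer with Khmer digits."""
    if n < 0:
        raise ValueError("only non-negative integers are handled")
    return str(n).translate(KHMER_DIGITS)


def word_under_ten(n: int) -> str:
    """Romanized word for 0-9; 6-9 are 'five' followed by 1-4."""
    if not 0 <= n <= 9:
        raise ValueError("only 0-9 are handled")
    if n <= 5:
        return BASE_WORDS[n]
    return f"{BASE_WORDS[5]} {BASE_WORDS[n - 5]}"


print(to_khmer_digits(1000))  # ១០០០
print(word_under_ten(7))      # prăm pi
```

Larger number words (tens, hundreds and so on) are outside the scope of this sketch.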


Dialects

Dialects are sometimes quite marked. Notable variations are found in speakers from Phnom Penh (the capital city), the rural Battambang area, the areas of Northeast Thailand adjacent to Cambodia such as Surin province, the Cardamom Mountains, and southern Vietnam.[4] The dialects form a continuum running roughly north to south. The speech of Phnom Penh, considered the standard, is mutually intelligible with the others, but a Khmer Krom speaker from Vietnam, for instance, may have great difficulty communicating with a Khmer native to Sisaket Province in Thailand.

Northern Khmer, the dialect spoken in Thailand, is referred to in Khmer as Khmer Surin. Although it began to diverge from standard Khmer only within the last 200 years, it is considered by some linguists to be a separate language. This is due to its distinct accent, influenced by the surrounding tonal language, Thai; to lexical differences; and to phonemic differences in both vowels and the distribution of consonants. Final "r", which has become silent in other dialects of Khmer, is pronounced in Northern Khmer.

Western Khmer, also called Cardamom Khmer, is spoken by a small, isolated population in the Cardamom mountain range, which extends from Cambodia into Thailand. Although little studied, it is unique in that it maintains a definite system of vocal register that has all but disappeared in other dialects of modern Khmer.[4]

A notable characteristic of Phnom Penh casual speech is the merging or complete elision of syllables, considered by speakers from other regions to be a "relaxed" pronunciation. For instance, "Phnom Penh" will sometimes be shortened to "m'Penh". Another characteristic of Phnom Penh speech is observed in words with an "r" either as an initial consonant or as the second member of a consonant cluster (as in the English word "bread"). The "r", trilled or flapped in other dialects, is either pronounced as a uvular trill (similar to French) or not pronounced at all. This alters the quality of any preceding consonant, causing a harder, more emphasized pronunciation. Another unique result is that the syllable is spoken with a low-rising or "dipping" tone much like the "hỏi" tone in Northern Vietnamese. For example, some people pronounce /trəj/ (meaning "fish") as /təj/: the "r" is dropped and the vowel begins by dipping much lower in tone than in standard speech and then rises, effectively doubling its length. Another example is the word /riən/ ("study, learn"), which is pronounced /ʀiən/, with the uvular "r" and the same intonation described above.[10]

Writing system

Main article: Khmer script

Khmer is written with the Khmer script, an abugida developed from the Pallava script of India before the 7th century.[11] The Khmer script is similar in appearance and usage to both Thai and Lao, which were based on the Khmer system.[11] Khmer numerals, which were inherited from Indian numerals, are used more widely than Hindu-Arabic numerals. The Khmer script is also used within Cambodia to transcribe hill tribe languages that have no writing system of their own.[6]

References and notes

  1. ^ Ethnologue.com estimate of the number of Khmer speakers in Vietnam (1999)
  2. ^ a b David A. Smyth, Judith Margaret Jacob (1993). Cambodian Linguistics, Literature and History: Collected Articles. Routledge (UK). ISBN 0728602180. 
  3. ^ Nancy Joan Smith-Hefner (1999). Khmer American: Identity and Moral Education in a Diasporic Community. University of California. ISBN 0520213491. 
  4. ^ a b c d e Paul Sidwell. Mon-Khmer Studies. Australian National University. Accessed February 23, 2007.
  5. ^ Shorto, Harry L. edited by Sidwell, Paul, Cooper, Doug and Bauer, Christian (2006). A Mon-Khmer comparative dictionary. Canberra: Australian National University. Pacific Linguistics. ISBN 0-85883-570-3
  6. ^ a b Huffman, Franklin. 1970. Cambodian System of Writing and Beginning Reader. Yale University Press. ISBN 0-300-01314-0
  7. ^ Huffman, Franklin. 1967. An outline of Cambodian Grammar. PhD thesis, Cornell University.
  8. ^ a b David Smyth (1995). Colloquial Cambodian: A Complete Language Course. Routledge (UK). ISBN 0415100062. 
  9. ^ East and Southeast Asian Languages: A First Look at Oxford University Press Online
  10. ^ William Allen A. Smalley (1994). Linguistic Diversity and National Unity: Language Ecology in Thailand. University of Chicago. ISBN 0226762882. 
  11. ^ a b Khmer Alphabet at Omniglot.com

Further reading

  • Ferlus, Michel. 1992. "Essai de phonétique historique du khmer (Du milieu du premier millénaire de notre ère à l'époque actuelle)", Mon-Khmer Studies XXI: 57-89.
  • Headley, Robert et al. 1977. Cambodian-English Dictionary. Washington, Catholic University Press. ISBN 0813205093
  • Huffman, F. E., Promchan, C., & Lambert, C.-R. T. (1970). Modern spoken Cambodian. New Haven: Yale University Press. ISBN 0300013159
  • Huffman, F. E., Lambert, C.-R. T., & Im Proum. (1970). Cambodian system of writing and beginning reader with drills and glossary. Yale linguistic series. New Haven: Yale University Press. ISBN 0300011997
  • Jacob, Judith. 1974. A Concise Cambodian-English Dictionary. London, Oxford University Press. ISBN 0197135749
  • Jacob, J. M. (1996). The traditional literature of Cambodia: a preliminary guide. London oriental series, v. 40. New York: Oxford University Press. ISBN 0197136125
  • Jacob, J. M., & Smyth, D. (1993). Cambodian linguistics, literature and history: collected articles. London: School of Oriental and African Studies, University of London. ISBN 0728602180
  • Keesee, A. P. K. (1996). An English-spoken Khmer dictionary: with romanized writing system, usage, and idioms, and notes on Khmer speech and grammar. London: Kegan Paul International. ISBN 0710305141
  • Meechan, M. (1992). Register in Khmer: the laryngeal specification of pharyngeal expansion. Ottawa: National Library of Canada = Bibliothèque nationale du Canada. ISBN 0315750162
  • Sak-Humphry, C. (2002). Communicating in Khmer: an interactive intermediate level Khmer course. Manoa, Hawai'i: Center for Southeast Asian Studies, School of Hawaiian, Asian and Pacific Studies, University of Hawai'i at Manoa. OCLC: 56840636
  • Smyth, D. (1995). Colloquial Cambodian: a complete language course. London: Routledge. ISBN 0415100062
  • Stewart, F., & May, S. (2004). In the shadow of Angkor: contemporary writing from Cambodia. Honolulu: University of Hawai'i Press. ISBN 0824828496
  • Tonkin, D. (1991). The Cambodian alphabet: how to write the Khmer language. Bangkok: Trasvin Publications. ISBN 9748867021


