Chinese characters or Han characters (汉字/漢字) are used in the written forms of the Chinese language, and to varying degrees in the Japanese and Korean languages (though the latter only in South Korea). Use of Chinese characters has disappeared from the Vietnamese language — in which they were used until the 20th century — and from North Korea, where they have been completely replaced by Hangul.
Chinese characters are called hànzì in Mandarin Chinese, kanji in Japanese, hanja or hanmun in Korean, and hán tự (also used in the chữ nôm script) in Vietnamese. However, the last is considered an extremely sinicized form, and Chinese characters are normally called chữ nho (字儒). (Note that the morphemes are reversed, as is common in Vietnamese borrowings from Chinese.)
In Chinese, a word or phrase (词/詞 cí), a unit of meaning, is composed of one or more characters (字 zì), as in hànzì (汉字/漢字), which has two characters. In all varieties of spoken Chinese, each character is read as a single syllable.
Japanese, Korean, and Vietnamese are not linguistically related to Chinese, and in order to make Chinese characters work in those languages with radically different grammar, many adaptations had to be made. In many cases in these languages, characters different from those used in Chinese are used for words or ideas of the same meaning. Also, many similar characters with identical meanings are written with slight differences. One example is black, which is written as 黒 (kuro) in Japanese, but as 黑 (hēi) in Chinese. In the twentieth century, thousands of simplified characters were created or adopted in mainland China, creating a distinction between, for example, 汉 in modern standard Chinese, and 漢 in traditional characters still used outside of mainland China, for example, Taiwan and Hong Kong.
For these reasons, particularly in China and Japan, where Chinese characters are used most often, it is frequently necessary to distinguish between Chinese Chinese characters and Japanese Chinese characters (though in English the distinction can often be made well enough by using the respective words hanzi and kanji).
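In computer text, the variant forms mentioned above are stored as separate characters. The following is a minimal Python sketch, assuming the text is handled as Unicode; the helper name code_point is hypothetical and chosen only for illustration.

```python
# Pairs of variant forms named in the text: Japanese vs. Chinese "black",
# and simplified vs. traditional "Han".
VARIANT_PAIRS = [("黒", "黑"), ("汉", "漢")]


def code_point(ch: str) -> str:
    """Return the Unicode code point of a single character, e.g. 'U+6F22'."""
    return f"U+{ord(ch):04X}"


for first, second in VARIANT_PAIRS:
    # Each form has its own code point, so software treats the two as different
    # characters unless it is explicitly told that they correspond.
    print(first, code_point(first), second, code_point(second))
```

Because the forms differ at the level of the code point, a plain text search for one form will not find the other.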
The earliest Chinese characters are the so-called "oracle script" (甲骨文 jiǎgǔwén) of the Shang Dynasty, followed by the bronzeware script (金文 jīnwén) of the Zhou Dynasty. These scripts no longer serve as anything but a source for scholars.
The first script that is still in (restricted) use today is the "seal script" or 篆書[篆书] zhuànshū. It is the result of the efforts of the first emperor of China, Qin Shi Huang, in the standardization of the Chinese script. The Seal Script, as the name suggests, is now only used in artistic seals. Few people are still able to read the seal script, although the art of carving a traditional seal in the seal script remains alive in China today.
Scripts that are still used regularly for print are the "clerk script" or 隸書[隶书] lìshū, the "Wei Monumental" or 魏碑 wèibēi, the "Regular Script" or 楷書[楷书] kǎishū, the "Song Style" or 宋體[宋体] sòngtǐ (only in printing), and the "running script" or 行書[行书] xíngshū. Modern Chinese handwriting is usually modeled on the Running Script.
Finally, there is the "draft script" (also called "grass script"), or 草書[草书] cǎoshū. The draft script is an idealized calligraphic style, where characters are suggested rather than realized. Despite being cursive to the point where individual strokes are no longer differentiable, the draft script is highly revered for the beauty and freedom that it embodies. Many simplified Chinese characters are based on this style.
Each character has a fundamental component, or radical (部首 Chinese: bù shǒu, Japanese: bushu, literally "initial portion"), and this design principle is used in Chinese dictionaries to logically order characters in sets.
Full characters are ordered according to their radical, of which there are roughly 200 types. Within each radical, characters are then subcategorised by their total number of strokes.
This principle of categorisation is exploited by everybody who must learn to write Chinese characters: the vast number of Chinese characters can be much more easily memorized if they are mentally broken down into their constituent radicals.
Chinese scholars classify Han characters in several groups. The first type, and the type most often associated with Chinese writing, are pictograms, which are pictorial representations of the morpheme represented. There are also ideograms that attempt to depict abstract concepts graphically, such as "up" (上) and "down" (下). However, these pictograms and ideograms make up only a small proportion of Chinese logograms.
Excerpt from a 1436 primer on Chinese characters
Most Chinese characters, however, are radical-radical compounds, in which each element (radical) of the character hints at the meaning, and radical-phonetic compounds, in which one component (the radical) indicates the kind of concept the character describes, and the other hints at the pronunciation. This last type accounts for the majority of Chinese logograms. Note that despite being called "compounds", these logograms are single entities in themselves; they are written so that they take up the same amount of space as any other logogram.
Note that due to the long period of language evolution, such component "hints" within characters are often useless and sometimes quite misleading in modern usage. This is particularly true in non-Chinese languages.
Classification has its own problems, as the origins of characters are often obscure. For example, the character for "East" (東; Chinese: dōng, Japanese: higashi), which combines the "tree" radical (木) and the "sun" radical (日), is usually considered a radical-radical compound. Though it appears to represent a sun rising through trees, and this is both an evocative image and a useful mnemonic, the origin and classification of the character are disputed among scholars. While some agree with the radical-radical classification, others see it as a unique character in and of itself — some claim it as being derived from an early pictograph of bundled sticks.
As another example, the character for "mother" (媽, Chinese: mā) consists of one component meaning "female" (女) and another meaning "horse" (馬 mǎ). The first component denotes a female entity, whereas the second suggests the pronunciation by referring to the word for "horse." The reason that "horse" was chosen to represent mother may be that horses were historically often used to represent "steadfastness". The majority of Chinese characters, like this example, have one component that suggests the meaning and another that suggests the pronunciation. In many cases, even the component intended to suggest pronunciation has an abstract semantic relation to the idea expressed by the character. This is possible because the phonetic system of Chinese allows many words to share the same pronunciation (homophony), and because the phonetic similarity considered in forming a character generally ignores tone and the manner of articulation of the initial consonant (but not its place of articulation).
Chinese characters all take up the same amount of space. One of the easiest ways for beginners to ensure this is to use a grid as guidance. In addition to strictness about the amount of space a character takes up, Chinese characters are written according to very precise rules. The three most important concern the strokes employed, the placement of the strokes, and the order in which they are written (see Stroke order). Most characters can be written with just one stroke order, though some also have variant stroke orders, which may result in different stroke counts. On a larger scale, Chinese text is traditionally written from top to bottom and then right to left, but it is more common today to see the same orientation as Western languages: going from left to right and then top to bottom. Most punctuation marks were adopted from Western languages, but there are a few exceptions: for example, names of books are marked with a wavy line drawn alongside them in vertical text, or enclosed in special double angle brackets in horizontal text.
Common errors in writing Chinese characters include incorrect stroke direction, incorrect stroke order, incorrect stroke length relative to other strokes, and incorrect placement of strokes relative to other strokes. Each mistake is highly visible to the literate eye because of the imperfections of the human hand and the weight given to the different parts of a stroke. Mistakes are frowned upon, as they are taken as marks of illiteracy or incompetence. In a culture that values scholarship as its highest virtue, such attributions are highly undesirable. Because of this strictness regarding not only the form of a character but also how that form is produced, the Chinese script is considered by many to be the most difficult writing system to learn properly.
Due to the long history of China, the many stylistic variations that have developed, and the many attempts by past rulers to standardize writing, some characters have multiple forms. These forms can be considered separate characters, but they are often merely derivatives of one another, sharing the same root composition. They are often not considered simplifications, as their stroke count is sometimes the same and often reduced by only a slight amount. The most famous example today is probably the character for sword (劍), where the radical on the right is knife (刂, the side form of 刀). The same character can be written with different forms of the radical, including 刃 or 刀 itself.
The usage of traditional characters versus simplified characters varies greatly, and can depend on both local customs and the medium. Often, simplified characters would be used in everyday writing or quick scribblings, while traditional characters would be used in printed works. However, the PRC's adoption of simplified characters has almost completely displaced their traditional counterparts there, except in Hong Kong and Macau. There is no fixed rule for when to use either system; the choice often depends on what the target audience understands and on the upbringing of the writer. In addition, there is a special system of characters used for writing numerals in financial contexts; these characters are deliberately chosen to be complicated, to prevent forgeries.
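As an illustration of the financial numerals, the following minimal Python sketch maps each decimal digit to its financial form. It assumes the traditional forms of these characters, and the function name to_financial_digits is hypothetical. It converts digit by digit only; real financial amounts also use place-value characters such as 拾, 佰 and 仟, which this sketch does not attempt.

```python
# Ordinary numerals and their financial counterparts (traditional forms),
# indexed by digit value 0-9.
ORDINARY_DIGITS = "〇一二三四五六七八九"
FINANCIAL_DIGITS = "零壹貳參肆伍陸柒捌玖"


def to_financial_digits(number: int) -> str:
    """Spell out a non-negative integer digit by digit in financial numerals."""
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    return "".join(FINANCIAL_DIGITS[int(digit)] for digit in str(number))


print(to_financial_digits(123))
```

The ordinary forms 一, 二 and 三 can be altered by adding a stroke or two, whereas 壹, 貳 and 參 cannot easily be turned into one another.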
The design and use of a dictionary of Chinese characters presents interesting problems. Dozens of indexing schemes have been created for the Chinese characters. The great majority of these schemes — beloved by their inventors but nobody else — have appeared in only a single dictionary; only one such system has achieved truly widespread use. This is the system of radicals.
Chinese character dictionaries often allow users to locate entries in several different ways. Many Chinese, Japanese, and Korean dictionaries of Chinese characters list characters in radical order: characters are grouped together by radical, and radicals containing fewer strokes come before radicals containing more strokes. Under each radical, characters are listed by their total number of strokes. In Japanese and Korean dictionaries, it is usually possible to search for characters by sound, using Kana and Hangul. Most dictionaries also allow searches by total number of strokes, and individual dictionaries often allow other search methods as well.
For instance, to look up the character 松 (pine tree) in a typical dictionary, the user first determines which part of the character is the radical, then counts the number of strokes in the radical (in this case four), and turns to the radical index (usually located on the inside front or back cover of the dictionary). Under the number 4, the user locates the radical 木, then turns to the page number listed, which is the start of the listing of all the characters containing this radical. This page will have a sub-index giving stroke numbers and page numbers. The right half of the character also contains four strokes, so the user locates the number 4, and turns to the page number given. From there, the user must scan the entries to locate the character he or she is seeking. Some dictionaries have a sub-index which lists every character containing each radical, so that if the user knows the number of strokes in the non-radical portion of the character, he or she can locate the correct page number directly.
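The lookup procedure described above can be sketched in Python as a two-level index: first by radical, then by the number of strokes in the rest of the character. This is only an illustration under stated assumptions. The six-character table is hand-entered sample data, not a real dictionary file, and the names SAMPLE, build_index, lookup and radical_order are hypothetical.

```python
from collections import defaultdict

# Hand-entered sample data (an assumption for illustration only):
# character -> (radical, strokes in the radical, strokes in the remainder)
SAMPLE = {
    "好": ("女", 3, 3),
    "媽": ("女", 3, 10),
    "村": ("木", 4, 3),
    "松": ("木", 4, 4),
    "林": ("木", 4, 4),
    "河": ("水", 4, 5),
}


def build_index(entries):
    """Group characters by radical, then by stroke count of the non-radical part."""
    index = defaultdict(lambda: defaultdict(list))
    for char, (radical, _radical_strokes, residual) in entries.items():
        index[radical][residual].append(char)
    return index


def lookup(index, radical, residual_strokes):
    """Return the candidates a reader would scan on the final page."""
    return index.get(radical, {}).get(residual_strokes, [])


def radical_order(entries):
    """Sort by radical stroke count, then radical, then remaining strokes.

    Ties between radicals with equal stroke counts are broken by code point
    here; printed dictionaries follow their own fixed radical sequence.
    """
    return sorted(
        entries,
        key=lambda c: (entries[c][1], entries[c][0], entries[c][2], c),
    )


index = build_index(SAMPLE)
print(lookup(index, "木", 4))   # candidates for 松: radical 木, four more strokes
print(radical_order(SAMPLE))
```

As in the printed dictionary, the index narrows the search to a short list, and the reader still has to scan that list to find the character sought.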
In Korean, character dictionaries are usually called Okpyeon (옥편; 玉篇), which literally means "Jewel Book", rather like the Latin word thesaurus ("treasure"). 玉篇 is also the name of a sixth-century Chinese dictionary from the Liang Dynasty.
Another popular dictionary system is the four corner method.
Most Chinese-English dictionaries and Chinese dictionaries sold to English speakers use the radical lookup method combined with an alphabetical listing of characters based on their pinyin romanization. To use one of these dictionaries, the reader finds the radical and stroke number of the character, as before, and locates the character in the radical index. The character's entry there gives its pronunciation in pinyin; the reader then turns to the main dictionary section and looks up the pinyin spelling alphabetically, just as in an English dictionary.
Derivatives of Han characters
Besides Korean and Japanese, a number of Asian languages used to be written with Han characters, or with characters modified from Han characters. They include:
- Vietnamese language (Chữ nôm)
- Khitan language  (http://ja.wikipedia.org/wiki/%E5%A5%91%E4%B8%B9%E6%96%87%E5%AD%97)
- Tangut language  (http://zh.wikipedia.org/wiki/%E8%A5%BF%E5%A4%8F%E6%96%87)  (http://www.cflac.org.cn/chinaartnews/2003-10/08/content_1024511.htm)  (http://www.huaxia.com/ssjn/smxx/00197002.html)
- Jurchen language  (http://ja.wikipedia.org/wiki/%E5%A5%B3%E7%9C%9F%E6%96%87%E5%AD%97)
- Zhuang language
- Miao language  (http://www.epochtimes.com/b5/4/4/17/n512718.htm)
- Yi language  (http://188.8.131.52:8080/d-library/newsite/resource/zhms/minzuwenzi/yiwen.htm)
- Mongolian language (Phags-pa)  (http://zh.wikipedia.org/wiki/%E5%85%AB%E6%80%9D%E5%B7%B4%E5%AD%97)  (http://www.omniglot.com/writing/phagspa.htm)  (http://www.cflac.org.cn/chinaartnews/2003-10/08/content_1024511.htm)
Number of Chinese characters
The question of how many characters there are is still the subject of debate. In the 18th century, European scholars claimed the total to be about 80,000. This number, however, is thought to be exaggerated, as the character count varies with the dictionary and its comprehensiveness. For example, the Kangxi Dictionary lists about 40,000 characters, while the modern Zhonghua Zihai lists in excess of 80,000. One reason for the overwhelming number of characters is the existence of rarely occurring variant and obscure characters (many of which are unused, even in Classical Chinese). Note, however, that no two characters are ever identical in all contexts.
The large number of Chinese characters is due to the logographic nature of the script: for every morpheme there must be a symbol, and sometimes variant characters have developed for the same morpheme. It has also been claimed that the sheer number of characters has served as a way to separate scholars from ordinary people, and perhaps even to keep certain texts from being read by any but the most scholarly.
It is usually said that about 3,000 characters are needed for basic literacy in Chinese (for example, to read a Chinese newspaper), and a well-educated person will know well in excess of 4,000 to 5,000 characters. Note that it is not necessary to know a character for every known word of Chinese, as the majority of modern Chinese words are compounds made of two or more morphemes, and are thus written not with a single unique character, but with multiple, usually common, characters.
In Japan there are 1945 "daily use kanji" (常用漢字 jōyō kanji) designated by the Ministry of Education. These are taught during primary and secondary school. Publications which include characters which fall outside this list should print furigana or rubi over the characters as a phonetic guide.
There are also 2232 government-designated "name kanji" (人名用漢字 jinmeiyō kanji) used in personal and geographical names, with plans to increase this list by 578 kanji in the near future. This would be the largest increase since World War II. The plan has not been without controversy, however. For example, the Chinese characters for "cancer," "hemorrhoids," "corpse" and "excrement," as well as parts of compound words (words created from two or more Chinese characters) meaning "curse," "prostitute," and "rape," are among the proposed additions to the list. This is because no measures were taken to determine the appropriateness of the kanji proposed, with the committee deciding that parents could make such decisions themselves. However, the government will seek input from the public before approving the list. For further information, see the Names section of the main Kanji article. (There is also some speculation that the "odd" kanji are being added to the names list in an attempt to make a de facto expansion of the jōyō kanji list, rather than with any serious expectation that anyone will use them in names. The idea of reducing the number of kanji in use has been a politically contentious issue, with many conservatives believing that kanji are culturally Japanese and that people should use them frequently.)
A well-educated Japanese person may know upwards of 3500 kanji. The Kanji kentei (日本漢字能力検定試験 Nihon kanji nōryoku kentei shiken or Test of Japanese Kanji Aptitude) tests the ability to read and write kanji. The highest level of the Kanji kentei tests the ability to read and write 6000 kanji, though in practice few people attain this level as Japanese generally uses fewer Chinese characters than Chinese does, and literacy in Japanese requires knowledge of fewer Chinese characters than literacy in Chinese.
In South Korea, middle and high school students learn 1,800 to 2,000 basic characters (Hanja), but most people use Hangul exclusively in their day-to-day lives. Chinese characters are still used to some extent, particularly in newspapers, weddings, place names and calligraphy.
Although the practice is now nearly extinct, Vietnamese was once written with varying scripts based on Chinese characters, and from the 19th century the use of Chinese characters became limited to ceremonial purposes. As in Japan and Korea, Chinese was used by the ruling classes, and the characters were eventually adapted to write Vietnamese. To express native Vietnamese words, whose pronunciations differed from the Chinese, the Vietnamese developed the chữ nôm script, which added diacritical marks to distinguish native (Vietnamese) words from Chinese.
Often a character that is not commonly used (a "rare" or "variant" character) will appear in a Chinese, Japanese, or Korean personal or place name (see Chinese name, Japanese name, and Korean name respectively). This has caused problems, as many computer encoding systems include only the 5,000 or so most common characters and exclude the less often used ones. This is especially a problem for personal names, which often contain rare or classical characters.
People who have run into this problem include Taiwanese politicians Wang Chien-shien (王建煊) and Yu Shyi-kun (游錫堃) and Taiwanese singer David Tao (陶喆). Newspapers have dealt with this problem in varying ways, including trying to create a character from two characters, including a picture, or, especially as is the case with Yu Shyi-kun, simply omitting the rare character with the hope that the reader will be able to infer who it refers to. Japanese newspapers may render such names and words in katakana instead of kanji, and it is common practice for people to write names for which they are unsure of the correct kanji in katakana instead.
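Whether a given character is covered by a particular legacy encoding can be checked programmatically. The following is a minimal Python sketch using the standard big5 and gb2312 codecs; the helper names can_encode and unencodable are hypothetical, and the results reflect Python's codec tables rather than any particular newspaper's typesetting system.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def unencodable(text: str, encoding: str) -> list:
    """List the characters of text that the encoding cannot represent."""
    return [ch for ch in text if not can_encode(ch, encoding)]


# Traditional 漢 is covered by Big5 but not by GB 2312; simplified 汉 is the reverse.
for word in ("漢字", "汉字"):
    for encoding in ("big5", "gb2312"):
        print(word, encoding, unencodable(word, encoding))
```

A publishing system could run a check of this kind on a name before typesetting it, and fall back to an image or a phonetic rendering when the list is not empty.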
- The Chinese Outpost (http://www.chinese-outpost.com): Language learning site centered on an "Introduction to Mandarin Chinese" tutorial that aims to demystify the Chinese language in everyday language, not academese, with units focused on pronunciation, characters, and grammar.
- A Typographic Outcry (http://www.landlubber.com/dec01/outcry.html): a curious perspective
- Chinese characters and culture (http://www.zhongwen.com)
- Chinese Character Dictionary (http://www.mandarintools.com/chardict.html): Look up simplified and traditional characters by English definition, pinyin, Cantonese, and radical/stroke.
- Generator for Chinese typographical filler text (http://www.lorem-ipsum.info/_chinese)
- If English was written like Chinese (http://www.zompist.com/yingzi/yingzi.htm)
- Chinese Symbols (http://www.chinese-school.netfirms.com/Chinese-symbols-customized.html) Introduction to Chinese Symbols
- Chinese (http://www.pinyin.info/readings/texts/visible/index.html): a selection about Chinese characters from Visible Speech: The Diverse Oneness of Writing Systems, by John DeFrancis