Unicode collation algorithm

The Unicode collation algorithm provides a standard way to put names, words or strings of text in sequence according to the needs of a particular situation.


When used with the default Unicode collation element table (DUCET), this collation method is similar to the European ordering rules for strings in most European languages. In particular, for strings in the Latin alphabet, the ordering matches the normal sorting order in English and similar languages, because the comparison first looks only at the letters stripped of any modifications or diacritical marks; those differences are considered only when the stripped letters are otherwise equal.
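The idea of comparing base letters first and diacritics afterwards can be illustrated with a small sketch. This is not the Unicode collation algorithm itself and does not use DUCET: the function name sketch_sort_key and the three-part key are assumptions made for illustration, built only on Python's standard unicodedata module.

```python
import unicodedata


def sketch_sort_key(text):
    """Rough multi-level key: base letters, then accents, then case.

    Illustrative approximation only; the real algorithm uses the
    weights in the collation element table, not this heuristic.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # Separate base characters from combining marks (accents and the like).
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    marks = "".join(ch for ch in decomposed if unicodedata.combining(ch))
    primary = base.casefold()                       # letters only, case ignored
    secondary = marks                               # accents break primary ties
    tertiary = tuple(ch.isupper() for ch in base)   # lowercase before uppercase
    return (primary, secondary, tertiary)


words = ["zebra", "\u00e9cole", "ecole", "apple"]
print(sorted(words))                       # code point order puts "école" after "zebra"
print(sorted(words, key=sketch_sort_key))  # "école" now sorts next to "ecole"
```

Sorting by raw code points places every accented word after "z", whereas the layered key keeps accented and unaccented spellings together, which is the behaviour the paragraph above describes.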


Note: the algorithm is intricate and this description may contain errors. Unicode Technical Standard #10 (http://www.unicode.org/unicode/reports/tr10/) is the authoritative source.


In addition to specifying a default sorting, UTS #10 also specifies how tailorings are used to get any desired sorting behaviour for a locale.
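As a rough illustration of what a tailoring changes, the sketch below reorders a few letters for a Swedish-style alphabet, in which å, ä and ö are separate letters that follow z. The names SWEDISH_ALPHABET, RANK and tailored_key are hypothetical, and the approach (a simple per-letter rank table) is a stand-in for illustration, not the tailoring syntax or mechanism defined in UTS #10.

```python
import unicodedata

# Hypothetical tailoring table: a Swedish-style alphabet where å, ä, ö follow z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyz\u00e5\u00e4\u00f6"
RANK = {letter: index for index, letter in enumerate(SWEDISH_ALPHABET)}


def tailored_key(text):
    """Rank each letter by its position in the tailored alphabet."""
    normalized = unicodedata.normalize("NFC", text).casefold()
    # Characters outside the table sort after it, ordered by code point.
    return [(RANK.get(ch, len(RANK)), ch) for ch in normalized]


words = ["\u00f6l", "zon", "\u00e5l", "ask"]
print(sorted(words, key=tailored_key))  # å and ö sort after z instead of near a and o
```

Under the default ordering these accented letters would sort alongside a and o; the tailored table moves them to the end of the alphabet, which is the kind of locale-specific adjustment a tailoring expresses.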


An important open source implementation of UCA is included with the IBM International Components for Unicode (ICU), which also supports tailoring. The effects of tailoring, and a large number of language-specific tailorings, can be seen in the online ICU Locale Explorer.


External references

  • Unicode Collation Algorithm (http://www.unicode.org/unicode/reports/tr10/): Unicode Technical Standard #10
  • Mimer SQL Unicode Collation Charts (http://developer.mimer.com/collations/charts/index.tml)
  • IBM International Components for Unicode (http://oss.software.ibm.com/icu/)
  • IBM ICU Locale Explorer (http://oss.software.ibm.com/cgi-bin/icu/lx)

The Wikipedia article included on this page is licensed under the GFDL.