IDNA

An internationalized domain name (IDN) is an Internet domain name that (potentially) contains non-ASCII characters. Such domain names could contain letters with diacritics, as required by many European languages, or characters from non-Latin scripts such as Arabic or Chinese. However, the standard for domain names does not allow such characters, and much work has gone into finding a way around this, either by changing the standard, or by agreeing on a way to convert internationalized domain names into standard ASCII domain names while preserving the stability of the domain name system.


IDN has, by the standards of the Internet, a long history; it was originally proposed in 1998. After much debate and many competing proposals, a system called Internationalizing Domain Names in Applications (IDNA) was adopted as the chosen standard, and as of 2005 it is in the process of being rolled out.


In IDNA, the term internationalized domain name means specifically any domain name consisting only of labels to which the IDNA ToASCII algorithm can be successfully applied. ToASCII is based on the Punycode ASCII encoding of normalized (Nameprep) Unicode strings.


Internationalizing Domain Names in Applications

Internationalizing Domain Names in Applications (IDNA) is a mechanism defined in 2003 for handling internationalized domain names containing non-ASCII characters. Such domain names could not be handled by the existing DNS and name resolver infrastructure. Rather than redesigning the existing DNS infrastructure, it was decided that non-ASCII domain names should be converted to a suitable ASCII-based form by web browsers and other user applications; IDNA specifies how this conversion is to be done.


IDNA was designed for maximum backward compatibility with the existing DNS, which was designed for names that use only a subset of the ASCII character set.


An IDNA-enabled application is able to convert between the restricted-ASCII and non-ASCII representations of a domain, using the ASCII form in cases where it is needed (such as for DNS lookup), but being able to present the more readable non-ASCII form to users. Applications that do not support IDNA will not be able to handle domain names with non-ASCII characters, but will still be able to access such domains if given the (usually rather cryptic) ASCII equivalent.


ICANN issued guidelines for the use of IDNA in June 2003, and it was already possible to register .jp domains using this system in July 2003. Several other top-level domain registries started accepting registrations in March 2004.


Mozilla 1.4, Netscape 7.1 and Opera 7.11 were among the first applications to support IDNA.


ToASCII and ToUnicode

The conversions between ASCII and non-ASCII forms of a domain name are accomplished by algorithms called ToASCII and ToUnicode. These algorithms are not applied to the domain name as a whole, but rather to individual labels. For example, if the domain name is www.example.com, then the labels are www, example and com, and ToASCII or ToUnicode would be applied to each of these three separately.


The details of these two algorithms are complex, and are specified in the RFCs listed in the history section of this article (RFC 3454, RFC 3490, RFC 3491 and RFC 3492). The following gives an overview of their behaviour.


ToASCII leaves unchanged any ASCII label, but will fail if the label is unsuitable for DNS. If given a label containing at least one non-ASCII character, ToASCII will apply the Nameprep algorithm (which converts the label to lowercase and performs other normalization) and will then translate the result to ASCII using Punycode before prepending the 4-character string "xn--". This 4-character string is called the ACE prefix, where ACE means ASCII Compatible Encoding, and is used to distinguish Punycode-encoded labels from ordinary ASCII labels. Note that the ToASCII algorithm can fail in a number of ways; for example, the final string could exceed the 63-character limit for the DNS. A label on which ToASCII fails cannot be used in an internationalized domain name.
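
The steps just described can be sketched in Python. This is a simplified illustration, not the algorithm as specified in RFC 3490: the function name to_ascii_label is hypothetical, NFKC normalization plus lowercasing stands in for the full Nameprep profile, and the only failure modelled is the 63-character limit.

```python
import unicodedata

ACE_PREFIX = "xn--"
MAX_LABEL_LENGTH = 63  # DNS limit on the length of a single label


def to_ascii_label(label: str) -> str:
    """Simplified ToASCII for one label (hypothetical helper).

    Raises ValueError when the result does not fit in a DNS label.
    """
    result = label
    if not result.isascii():
        # Stand-in for Nameprep: NFKC normalization plus lowercasing.
        # The real Nameprep profile also maps and prohibits code points.
        result = unicodedata.normalize("NFKC", result).lower()
    if not result.isascii():
        # Punycode-encode the prepared label, then prepend the ACE prefix.
        result = ACE_PREFIX + result.encode("punycode").decode("ascii")
    if not 1 <= len(result) <= MAX_LABEL_LENGTH:
        raise ValueError(f"label length {len(result)} is outside 1..63")
    return result


print(to_ascii_label("www"))     # ASCII labels pass through unchanged
print(to_ascii_label("Bücher"))  # xn--bcher-kva
```

The second ASCII check matters because normalization can turn some non-ASCII input (for example fullwidth Latin letters) into plain ASCII, in which case no ACE prefix should be added.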


ToUnicode reverses the action of ToASCII, stripping off the ACE prefix and applying the Punycode decode algorithm. It does not reverse the Nameprep processing, since that is merely a normalization and is by nature irreversible. Unlike ToASCII, ToUnicode always succeeds, because it simply returns the original string if decoding would fail. In particular, this means that ToUnicode has no effect on a string that does not begin with the ACE prefix.
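
A minimal sketch of this "never fails" behaviour follows, again in Python. The name to_unicode_label is hypothetical, and the sketch omits the round-trip verification that the full specification performs; it only shows the prefix check, the Punycode decode, and the fallback to the original string.

```python
ACE_PREFIX = "xn--"


def to_unicode_label(label: str) -> str:
    """Simplified ToUnicode for one label (hypothetical helper)."""
    # Labels without the ACE prefix are returned untouched.
    if not label.lower().startswith(ACE_PREFIX):
        return label
    try:
        # Strip the prefix and apply the Punycode decoder.
        return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    except UnicodeError:
        # Any decoding problem yields the original string, never an error.
        return label


print(to_unicode_label("xn--bcher-kva"))  # bücher
print(to_unicode_label("example"))        # example
```

Note that the decoded result is the lowercase bücher, not the original Bücher: the Nameprep step is not undone.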


Example of IDNA encoding

As an example of how IDNA works, suppose the domain to be encoded is Bücher.ch. This has two labels, Bücher and ch. The second label is pure ASCII, and so is left unchanged. The first label is processed by Nameprep to give bücher, and then by Punycode to give bcher-kva, and then has xn-- prepended to give xn--bcher-kva. The final domain suitable for use with the DNS is therefore xn--bcher-kva.ch.
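
The same example can be reproduced with Python's standard library, which ships an "idna" codec and an encodings.idna module implementing the 2003 version of IDNA. The helper name encode_steps is an illustrative assumption; the library calls are the standard ones.

```python
from encodings import idna


def encode_steps(label: str) -> tuple[str, str, str]:
    """Return the intermediate results for one non-ASCII label."""
    prepared = idna.nameprep(label)                          # Nameprep step
    punycoded = prepared.encode("punycode").decode("ascii")  # Punycode step
    return prepared, punycoded, "xn--" + punycoded           # ACE prefix step


print(encode_steps("Bücher"))      # ('bücher', 'bcher-kva', 'xn--bcher-kva')

# The codec applies the per-label processing to a whole domain name.
print("Bücher.ch".encode("idna"))  # b'xn--bcher-kva.ch'
```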


Spoofing concerns

Because IDN allows websites to use full Unicode names, it also makes it much easier to create a spoofed web site that looks exactly like another, including domain name and security certificate, but is in fact controlled by someone attempting to steal private information. Such spoofing exposes users to phishing attacks.


These attacks are not due to technical deficiencies in either the Unicode or IDNA specifications, but to the fact that different characters in different languages can look the same, depending on the font used. For example, Unicode character U+0430, Cyrillic small letter a ("а"), can look identical to Unicode character U+0061, Latin small letter a ("a"), which is the lowercase "a" used in English. Technically, characters that look alike in this way are known as homoglyphs, and names that look alike because of them are known as homographs.
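
The difference between the two letters is easy to make visible in code, even when a font renders them identically. The short Python sketch below uses the standard unicodedata module; the helper name describe is just an illustrative choice.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


cyrillic_a = "\u0430"
latin_a = "\u0061"

print(describe(cyrillic_a))   # U+0430 CYRILLIC SMALL LETTER A
print(describe(latin_a))      # U+0061 LATIN SMALL LETTER A
print(cyrillic_a == latin_a)  # False
```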


Although a computer may display visually identical or very similar glyphs for two different characters, the difference remains significant to the computer (though not to the user) when locating web sites or validating certificates. Thus, the user's assumption of a one-to-one correspondence between the visual appearance of a name and the named entity breaks down.


Someone could therefore register a domain name that appears identical to an existing domain but leads somewhere else. For example, the spoofed domain "pаypal.com" contains a Cyrillic a, not a Latin a. This kind of attack is known as a homograph spoofing attack. In many ways, it is not new: even within the old character set of A-Z, 0-9 and hyphen, G00GLE.COM is easily confused with GOOGLE.COM. What is new is that expanding the character repertoire from a few dozen characters in a single alphabet to many thousands of characters in many scripts greatly increases the scope for homograph attacks.
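
One possible way to flag a label such as the spoofed one above is to check whether its letters come from more than one script. The Python sketch below is a rough heuristic under a stated assumption: it takes the first word of each character's Unicode name as an approximation of its script, which is not how Unicode formally defines scripts. The function names are hypothetical.

```python
import unicodedata


def scripts_in_label(label: str) -> set[str]:
    """Approximate the scripts used by the letters of a label.

    Assumption: the first word of the Unicode character name
    ("LATIN", "CYRILLIC", ...) is a good enough proxy for the script.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():  # digits and hyphens carry no script information
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts


def looks_mixed_script(label: str) -> bool:
    """Return True when the letters of a label span several scripts."""
    return len(scripts_in_label(label)) > 1


print(looks_mixed_script("paypal"))       # False
print(looks_mixed_script("p\u0430ypal"))  # True
```

A check of this kind cannot catch a label written entirely in one look-alike script, nor ASCII confusions such as G00GLE versus GOOGLE.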


On February 7, 2005, Slashdot reported that this exploit had been disclosed at the hacker conference ShmooCon, with an example available at http://www.shmoo.com/idn/. On browsers supporting IDNA, the URL "https://www.pаypal.com/" appears to lead to paypal.com but instead leads to a spoofed PayPal web site that says "Meeow." Mozilla Firefox, which supports IDNA, shows the page as being at paypal.com with a verified security certificate, and displays no warnings of any sort.


It is possible to work around this problem in Firefox, Mozilla and other Gecko-based browsers by turning off IDN support entirely. To do this, type "about:config" into the address bar, bringing up the list of browser settings. Then find the "network.enableIDN" setting, and change the value to "false". The browser will then report IDN URLs as nonexistent. Note that in some versions (particularly Firefox 1.0), this workaround is effective only for the first session: closing and restarting the browser leaves the user vulnerable again, even though the option remains disabled. This can be corrected by clearing the browser's cache.


On February 17, 2005, Mozilla developers announced that they would ship the next versions of their software with IDN support still enabled, but displaying the Punycode form of IDN URLs instead, thus thwarting such attacks while still allowing people to access websites on an IDN domain. This is a change from the earlier plan to disable IDN entirely for the time being. [1] (https://bugzilla.mozilla.org/show_bug.cgi?id=279099#c135)


Since then, both Mozilla and Opera have announced that they will use per-domain whitelists to selectively switch on IDN display for domains run by registries that take appropriate anti-spoofing precautions. (See the article on homograph spoofing attacks for more details.)
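
The two display policies described above (always show Punycode, or show Unicode only for whitelisted registries) might be sketched as follows in Python. This is not the browsers' actual code: the function name display_form and the whitelist contents are placeholders chosen for illustration, and only the standard "idna" codec is assumed.

```python
# Placeholder whitelist of top-level domains whose registries are trusted
# to take anti-spoofing precautions; the entries are purely illustrative.
IDN_DISPLAY_WHITELIST = {"jp", "de"}


def display_form(ascii_domain: str) -> str:
    """Choose what to show in the address bar for an ASCII (ACE) domain."""
    tld = ascii_domain.rsplit(".", 1)[-1].lower()
    if tld not in IDN_DISPLAY_WHITELIST:
        # Untrusted registry: keep the raw Punycode form visible.
        return ascii_domain
    try:
        # Trusted registry: decode each label to its Unicode form.
        return ascii_domain.encode("ascii").decode("idna")
    except UnicodeError:
        # Undecodable names are shown as they were received.
        return ascii_domain


print(display_form("www.xn--pypal-4ve.com"))  # www.xn--pypal-4ve.com
print(display_form("xn--bcher-kva.de"))       # bücher.de
```

With the spoofed name shown as xn--pypal-4ve, the user can see that the address is not paypal.com, while names under whitelisted registries stay readable.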


History of IDN

  • 07/98: Asia Pacific Networking Group (now known as APSTAR) iDNS Working Group formed - chaired by James Seng
  • 1999: Early Research in IDN at National University of Singapore, Center for Internet Research
  • 02/99: iDNS Testbed launched with participation from CNNIC, JPNIC, KRNIC, TWNIC, THNIC, HKNIC and SGNIC
  • 11/99: IETF IDN Birds of a Feather (BoF) session in Washington
  • 01/00: IETF IDN Working Group formed, chaired by James Seng and Marc Blanchet
  • 03/01: ICANN Board IDN Working Group formed
  • 11/01: ICANN IDN Committee formed
  • 03/03: Publication of RFC 3454, RFC 3490, RFC 3491 and RFC 3492
  • 06/03: Publication of ICANN IDN Guidelines for registries (http://www.icann.org/general/idn-guidelines-20jun03.htm)
  • 05/04: Publication of RFC 3743, Joint Engineering Team (JET) Guidelines for Internationalized Domain Names (IDN) Registration and Administration for Chinese, Japanese, and Korean


DNS registries known to have adopted IDNA

  • .jp: July 2003
  • .kr: 2003
  • .pl: September 11, 2003
  • .se: 2003
  • .dk: January 2, 2004
  • .museum: January 20, 2004
  • .no: February 9, 2004
  • .de, .at, .ch: March 1, 2004
  • .lv: 2004
  • .info: March 19, 2004
  • .org: January 18, 2005
  • .br: May 9, 2005

The Wikipedia article included on this page is licensed under the GFDL.