List of XML and HTML character entity references

In SGML, HTML and XML documents, the logical constructs known as character data and attribute values consist of sequences of characters. Each character can appear directly (representing itself) or be represented by a series of characters called a character reference. There are two types of character reference: the numeric character reference and the character entity reference. This article lists the character entity references that are valid in HTML and XML documents.

Because of technical limitations, some web browsers may not display some special characters in this article.


Character reference overview

A numeric character reference refers to a character by its Universal Character Set/Unicode code point, and uses the format

&#nnnn;

or

&#xhhhh;

where nnnn is the code point in decimal form, and hhhh is the code point in hexadecimal form. The x must be lowercase in XML documents. The nnnn or hhhh may be any number of digits and may include leading zeros. The hhhh may mix uppercase and lowercase, though uppercase is the usual style.
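The format rules above can be sketched in a few lines of Python. This is only an illustration: the function name decode_ncrs and its xml_rules flag are made up for this example, and no validation beyond the syntax described above is attempted.

```python
import re

# "&#nnnn;" (decimal) or "&#xhhhh;" (hexadecimal): any number of digits,
# leading zeros allowed, hex digits in either case.
_NCR = re.compile(r"&#(?:([xX])([0-9A-Fa-f]+)|([0-9]+));")

def decode_ncrs(text, xml_rules=True):
    """Replace numeric character references with the characters they name.

    Hypothetical helper for illustration. With xml_rules=True an uppercase
    "X" is left untouched, because XML requires the x to be lowercase.
    """
    def repl(m):
        x, hex_digits, dec_digits = m.groups()
        if x is not None:
            if xml_rules and x == "X":
                return m.group(0)          # not a valid reference in XML
            code_point = int(hex_digits, 16)
        else:
            code_point = int(dec_digits, 10)
        if code_point > 0x10FFFF:          # beyond the Unicode code space
            return m.group(0)
        return chr(code_point)
    return _NCR.sub(repl, text)

print(decode_ncrs("caf&#233; caf&#xE9; caf&#x00e9; caf&#0233;"))
```

All four spellings in the last line refer to the same code point, U+00E9.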


In contrast, a character entity reference refers to a character by the name of an entity which has the desired character as its replacement text. The entity must either be predefined (built into the markup language) or explicitly declared in a Document Type Definition (DTD). The format is the same as for any entity reference:

&name;

where name is the name of the entity. The semicolon is required.
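As a rough illustration of how a processor resolves the &name; format, the sketch below looks names up in Python's html.entities.name2codepoint table, which maps HTML 4 entity names to code points. The function name expand_entity_refs and the simplified name pattern are assumptions made for this example; real XML names allow more characters.

```python
import re
from html.entities import name2codepoint  # HTML 4 entity name -> code point

# Simplified entity-name pattern; the trailing semicolon is required.
_ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def expand_entity_refs(text, declared=name2codepoint):
    """Replace each declared &name; with its character (hypothetical helper).

    References to undeclared names are left as they are, and the result
    is not rescanned, so "&amp;lt;" becomes "&lt;" rather than "<".
    """
    def repl(m):
        code_point = declared.get(m.group(1))
        return m.group(0) if code_point is None else chr(code_point)
    return _ENTITY_REF.sub(repl, text)

print(expand_entity_refs("caf&eacute; &amp; cr&egrave;me"))
```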


Predefined entities in XML

Despite this article's title, the XML specification does not use the term "character entity" or "character entity reference". The XML specification defines five "predefined entities" representing special characters, and requires that all XML processors honor them. The entities can also be explicitly declared in a DTD, but if this is done, the replacement text must be the same as the built-in definitions. XML also allows other named entities of any size to be defined on a per-document basis.


The table below lists the five XML predefined entities. The "Name" column gives the entity's name. The "Character" column shows the character, if it is renderable. To render the character, the format &name; is used; for example, &amp; renders as &. The "Unicode code point" column cites the character via standard UCS/Unicode "U+" notation, which shows the character's code point in hexadecimal. The decimal equivalent of the code point is then shown in parentheses. The "Standard" column indicates the first version of XML that includes the entity. The "Description" column cites the character via its canonical UCS/Unicode name, in English.

Name Character Unicode code point Standard Description
quot " U+0022 (34) XML 1.0 (double) quotation mark
amp & U+0026 (38) XML 1.0 ampersand
apos ' U+0027 (39) XML 1.0 apostrophe (= apostrophe-quote); only XHTML, see below
lt < U+003C (60) XML 1.0 less-than sign
gt > U+003E (62) XML 1.0 greater-than sign
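A short Python sketch of the five predefined entities in use, relying only on the standard library's xml.etree.ElementTree parser and xml.sax.saxutils.escape. The variable names are illustrative, and the extra mapping passed to escape is an assumption of this example rather than something the XML specification prescribes.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Any conforming XML parser resolves the five predefined entities
# without a DTD being present.
parsed_text = ET.fromstring("<r>&quot;&amp;&apos;&lt;&gt;</r>").text

# escape() replaces &, < and > by default ...
escaped_basic = escape("a < b & c > d")

# ... and the quote characters only when asked to via the second argument.
escaped_quotes = escape("say \"hi\" & 'bye'", {'"': "&quot;", "'": "&apos;"})

print(parsed_text)
print(escaped_basic)
print(escaped_quotes)
```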


Character entities in HTML

The HTML 4 DTD explicitly declares 252 character entities. HTML processors must honor the HTML DTD's declarations, even if the DTD is not mentioned in the HTML document. HTML does not allow other named entities to be defined.


HTML document authors who have been exposed to XML and XHTML often overlook the fact that the apos entity is not defined in HTML. The closest named alternative is rsquo, although it designates a different character (U+2019, the right single quotation mark). Authors who need the apostrophe itself can use a numeric character reference instead (&#39; or &#x27;), which always designates the character by the numeric value of its Unicode code point, independently of the actual document encoding.
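The point about apos can be checked against Python's standard library, assuming (as its documentation states) that html.entities.name2codepoint mirrors the HTML 4 entity set. The variable names below are illustrative only.

```python
import html
from html.entities import name2codepoint  # HTML 4 entity names

# The HTML 4 table has 252 names, and "apos" is not one of them.
entity_count = len(name2codepoint)
apos_defined = "apos" in name2codepoint

# html.escape() sidesteps the problem by emitting a numeric reference
# for the apostrophe instead of &apos;.
escaped = html.escape("it's")

# Both numeric forms decode to the apostrophe; &rsquo; is a different character.
# (html.unescape follows HTML5 rules, which are more permissive than HTML 4.)
from_decimal = html.unescape("it&#39;s")
from_hex = html.unescape("it&#x27;s")
from_rsquo = html.unescape("it&rsquo;s")

print(entity_count, apos_defined, escaped)
print(from_decimal == from_hex, from_rsquo == from_decimal)
```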


In the table below, the HTML built-in character entities are listed. The columns are as in the XML entity table above, except that the "Standard" column indicates the first version of HTML that includes the entity. The version is one of the major releases of the HTML specification: 2.0, 3.2, or 4.0. HTML 4.01 did not introduce any new entities.

Name Character Unicode code point Standard Description
quot " U+0022 (34) HTML 2.0 (double) quotation mark
amp & U+0026 (38) HTML 2.0 ampersand
apos ' U+0027 (39) XHTML 1.0 apostrophe (= apostrophe-quote); see below
lt < U+003C (60) HTML 2.0 less-than sign
gt > U+003E (62) HTML 2.0 greater-than sign
nbsp   U+00A0 (160) HTML 3.2 non-breaking space
iexcl ¡ U+00A1 (161) HTML 3.2 inverted exclamation mark
cent ¢ U+00A2 (162) HTML 3.2 cent sign
pound £ U+00A3 (163) HTML 3.2 pound sign
curren ¤ U+00A4 (164) HTML 3.2 currency sign
yen ¥ U+00A5 (165) HTML 3.2 yen sign
brvbar ¦ U+00A6 (166) HTML 3.2 broken bar
sect § U+00A7 (167) HTML 3.2 section sign
uml ¨ U+00A8 (168) HTML 3.2 diaeresis
copy © U+00A9 (169) HTML 3.2 copyright sign
ordf ª U+00AA (170) HTML 3.2 feminine ordinal indicator
laquo « U+00AB (171) HTML 3.2 left-pointing double angle quotation mark
not ¬ U+00AC (172) HTML 3.2 not sign
shy ­ U+00AD (173) HTML 3.2 soft hyphen
reg ® U+00AE (174) HTML 3.2 registered sign
macr ¯ U+00AF (175) HTML 3.2 macron
deg ° U+00B0 (176) HTML 3.2 degree sign
plusmn ± U+00B1 (177) HTML 3.2 plus-minus sign
sup2 ² U+00B2 (178) HTML 3.2 superscript two
sup3 ³ U+00B3 (179) HTML 3.2 superscript three
acute ´ U+00B4 (180) HTML 3.2 acute accent
micro µ U+00B5 (181) HTML 3.2 micro sign
para ¶ U+00B6 (182) HTML 3.2 pilcrow sign
middot · U+00B7 (183) HTML 3.2 middle dot
cedil ¸ U+00B8 (184) HTML 3.2 cedilla
sup1 ¹ U+00B9 (185) HTML 3.2 superscript one
ordm º U+00BA (186) HTML 3.2 masculine ordinal indicator
raquo » U+00BB (187) HTML 3.2 right-pointing double angle quotation mark
frac14 ¼ U+00BC (188) HTML 3.2 vulgar fraction one quarter
frac12 ½ U+00BD (189) HTML 3.2 vulgar fraction one half
frac34 ¾ U+00BE (190) HTML 3.2 vulgar fraction three quarters
iquest ¿ U+00BF (191) HTML 3.2 inverted question mark
Agrave À U+00C0 (192) HTML 2.0 Latin capital letter a with grave
Aacute Á U+00C1 (193) HTML 2.0 Latin capital letter a with acute
Acirc  U+00C2 (194) HTML 2.0 Latin capital letter a with circumflex
Atilde à U+00C3 (195) HTML 2.0 Latin capital letter a with tilde
Auml Ä U+00C4 (196) HTML 2.0 Latin capital letter a with diaeresis
Aring Å U+00C5 (197) HTML 2.0 Latin capital letter a with ring above
AElig Æ U+00C6 (198) HTML 2.0 Latin capital letter ae
Ccedil Ç U+00C7 (199) HTML 2.0 Latin capital letter c with cedilla
Egrave È U+00C8 (200) HTML 2.0 Latin capital letter e with grave
Eacute É U+00C9 (201) HTML 2.0 Latin capital letter e with acute
Ecirc Ê U+00CA (202) HTML 2.0 Latin capital letter e with circumflex
Euml Ë U+00CB (203) HTML 2.0 Latin capital letter e with diaeresis
Igrave Ì U+00CC (204) HTML 2.0 Latin capital letter i with grave
Iacute Í U+00CD (205) HTML 2.0 Latin capital letter i with acute
Icirc Î U+00CE (206) HTML 2.0 Latin capital letter i with circumflex
Iuml Ï U+00CF (207) HTML 2.0 Latin capital letter i with diaeresis
ETH Ð U+00D0 (208) HTML 2.0 Latin capital letter eth
Ntilde Ñ U+00D1 (209) HTML 2.0 Latin capital letter n with tilde
Ograve Ò U+00D2 (210) HTML 2.0 Latin capital letter o with grave
Oacute Ó U+00D3 (211) HTML 2.0 Latin capital letter o with acute
Ocirc Ô U+00D4 (212) HTML 2.0 Latin capital letter o with circumflex
Otilde Õ U+00D5 (213) HTML 2.0 Latin capital letter o with tilde
Ouml Ö U+00D6 (214) HTML 2.0 Latin capital letter o with diaeresis
times × U+00D7 (215) HTML 3.2 multiplication sign
Oslash Ø U+00D8 (216) HTML 2.0 Latin capital letter o with stroke
Ugrave Ù U+00D9 (217) HTML 2.0 Latin capital letter u with grave
Uacute Ú U+00DA (218) HTML 2.0 Latin capital letter u with acute
Ucirc Û U+00DB (219) HTML 2.0 Latin capital letter u with circumflex
Uuml Ü U+00DC (220) HTML 2.0 Latin capital letter u with diaeresis
Yacute Ý U+00DD (221) HTML 2.0 Latin capital letter y with acute
THORN Þ U+00DE (222) HTML 2.0 Latin capital letter thorn
szlig ß U+00DF (223) HTML 2.0 Latin small letter sharp s (German Eszett)
agrave à U+00E0 (224) HTML 2.0 Latin small letter a with grave
aacute á U+00E1 (225) HTML 2.0 Latin small letter a with acute
acirc â U+00E2 (226) HTML 2.0 Latin small letter a with circumflex
atilde ã U+00E3 (227) HTML 2.0 Latin small letter a with tilde
auml ä U+00E4 (228) HTML 2.0 Latin small letter a with diaeresis
aring å U+00E5 (229) HTML 2.0 Latin small letter a with ring above
aelig æ U+00E6 (230) HTML 2.0 Latin small letter ae
ccedil ç U+00E7 (231) HTML 2.0 Latin small letter c with cedilla
egrave è U+00E8 (232) HTML 2.0 Latin small letter e with grave
eacute é U+00E9 (233) HTML 2.0 Latin small letter e with acute
ecirc ê U+00EA (234) HTML 2.0 Latin small letter e with circumflex
euml ë U+00EB (235) HTML 2.0 Latin small letter e with diaeresis
igrave ì U+00EC (236) HTML 2.0 Latin small letter i with grave
iacute í U+00ED (237) HTML 2.0 Latin small letter i with acute
icirc î U+00EE (238) HTML 2.0 Latin small letter i with circumflex
iuml ï U+00EF (239) HTML 2.0 Latin small letter i with diaeresis
eth ð U+00F0 (240) HTML 2.0 Latin small letter eth
ntilde ñ U+00F1 (241) HTML 2.0 Latin small letter n with tilde
ograve ò U+00F2 (242) HTML 2.0 Latin small letter o with grave
oacute ó U+00F3 (243) HTML 2.0 Latin small letter o with acute
ocirc ô U+00F4 (244) HTML 2.0 Latin small letter o with circumflex
otilde õ U+00F5 (245) HTML 2.0 Latin small letter o with tilde
ouml ö U+00F6 (246) HTML 2.0 Latin small letter o with diaeresis
divide ÷ U+00F7 (247) HTML 3.2 division sign
oslash ø U+00F8 (248) HTML 2.0 Latin small letter o with stroke
ugrave ù U+00F9 (249) HTML 2.0 Latin small letter u with grave
uacute ú U+00FA (250) HTML 2.0 Latin small letter u with acute
ucirc û U+00FB (251) HTML 2.0 Latin small letter u with circumflex
uuml ü U+00FC (252) HTML 2.0 Latin small letter u with diaeresis
yacute ý U+00FD (253) HTML 2.0 Latin small letter y with acute
thorn þ U+00FE (254) HTML 2.0 Latin small letter thorn
yuml ÿ U+00FF (255) HTML 2.0 Latin small letter y with diaeresis
OElig Œ U+0152 (338) HTML 4.0 Latin capital ligature oe
oelig œ U+0153 (339) HTML 4.0 Latin small ligature oe
Scaron Š U+0160 (352) HTML 4.0 Latin capital letter s with caron
scaron š U+0161 (353) HTML 4.0 Latin small letter s with caron
Yuml Ÿ U+0178 (376) HTML 4.0 Latin capital letter y with diaeresis
fnof ƒ U+0192 (402) HTML 4.0 Latin small letter f with hook
circ ˆ U+02C6 (710) HTML 4.0 modifier letter circumflex accent
tilde ˜ U+02DC (732) HTML 4.0 small tilde
Alpha Α U+0391 (913) HTML 4.0 Greek capital letter alpha
Beta Β U+0392 (914) HTML 4.0 Greek capital letter beta
Gamma Γ U+0393 (915) HTML 4.0 Greek capital letter gamma
Delta Δ U+0394 (916) HTML 4.0 Greek capital letter delta
Epsilon Ε U+0395 (917) HTML 4.0 Greek capital letter epsilon
Zeta Ζ U+0396 (918) HTML 4.0 Greek capital letter zeta
Eta Η U+0397 (919) HTML 4.0 Greek capital letter eta
Theta Θ U+0398 (920) HTML 4.0 Greek capital letter theta
Iota Ι U+0399 (921) HTML 4.0 Greek capital letter iota
Kappa Κ U+039A (922) HTML 4.0 Greek capital letter kappa
Lambda Λ U+039B (923) HTML 4.0 Greek capital letter lambda
Mu Μ U+039C (924) HTML 4.0 Greek capital letter mu
Nu Ν U+039D (925) HTML 4.0 Greek capital letter nu
Xi Ξ U+039E (926) HTML 4.0 Greek capital letter xi
Omicron Ο U+039F (927) HTML 4.0 Greek capital letter omicron
Pi Π U+03A0 (928) HTML 4.0 Greek capital letter pi
Rho Ρ U+03A1 (929) HTML 4.0 Greek capital letter rho
Sigma Σ U+03A3 (931) HTML 4.0 Greek capital letter sigma
Tau Τ U+03A4 (932) HTML 4.0 Greek capital letter tau
Upsilon Υ U+03A5 (933) HTML 4.0 Greek capital letter upsilon
Phi Φ U+03A6 (934) HTML 4.0 Greek capital letter phi
Chi Χ U+03A7 (935) HTML 4.0 Greek capital letter chi
Psi Ψ U+03A8 (936) HTML 4.0 Greek capital letter psi
Omega Ω U+03A9 (937) HTML 4.0 Greek capital letter omega
alpha α U+03B1 (945) HTML 4.0 Greek small letter alpha
beta β U+03B2 (946) HTML 4.0 Greek small letter beta
gamma γ U+03B3 (947) HTML 4.0 Greek small letter gamma
delta δ U+03B4 (948) HTML 4.0 Greek small letter delta
epsilon ε U+03B5 (949) HTML 4.0 Greek small letter epsilon
zeta ζ U+03B6 (950) HTML 4.0 Greek small letter zeta
eta η U+03B7 (951) HTML 4.0 Greek small letter eta
theta θ U+03B8 (952) HTML 4.0 Greek small letter theta
iota ι U+03B9 (953) HTML 4.0 Greek small letter iota
kappa κ U+03BA (954) HTML 4.0 Greek small letter kappa
lambda λ U+03BB (955) HTML 4.0 Greek small letter lambda
mu μ U+03BC (956) HTML 4.0 Greek small letter mu
nu ν U+03BD (957) HTML 4.0 Greek small letter nu
xi ξ U+03BE (958) HTML 4.0 Greek small letter xi
omicron ο U+03BF (959) HTML 4.0 Greek small letter omicron
pi π U+03C0 (960) HTML 4.0 Greek small letter pi
rho ρ U+03C1 (961) HTML 4.0 Greek small letter rho
sigmaf ς U+03C2 (962) HTML 4.0 Greek small letter final sigma
sigma σ U+03C3 (963) HTML 4.0 Greek small letter sigma
tau τ U+03C4 (964) HTML 4.0 Greek small letter tau
upsilon υ U+03C5 (965) HTML 4.0 Greek small letter upsilon
phi φ U+03C6 (966) HTML 4.0 Greek small letter phi
chi χ U+03C7 (967) HTML 4.0 Greek small letter chi
psi ψ U+03C8 (968) HTML 4.0 Greek small letter psi
omega ω U+03C9 (969) HTML 4.0 Greek small letter omega
thetasym ϑ U+03D1 (977) HTML 4.0 Greek theta symbol
upsih ϒ U+03D2 (978) HTML 4.0 Greek upsilon with hook symbol
piv ϖ U+03D6 (982) HTML 4.0 Greek pi symbol
ensp U+2002 (8194) HTML 4.0 en space [1]
emsp U+2003 (8195) HTML 4.0 em space [2]
thinsp U+2009 (8201) HTML 4.0 thin space [3]
zwnj U+200C (8204) HTML 4.0 zero width non-joiner
zwj U+200D (8205) HTML 4.0 zero width joiner
lrm U+200E (8206) HTML 4.0 left-to-right mark
rlm U+200F (8207) HTML 4.0 right-to-left mark
ndash – U+2013 (8211) HTML 4.0 en dash
mdash — U+2014 (8212) HTML 4.0 em dash
lsquo ‘ U+2018 (8216) HTML 4.0 left single quotation mark
rsquo ’ U+2019 (8217) HTML 4.0 right single quotation mark
sbquo ‚ U+201A (8218) HTML 4.0 single low-9 quotation mark
ldquo “ U+201C (8220) HTML 4.0 left double quotation mark
rdquo ” U+201D (8221) HTML 4.0 right double quotation mark
bdquo „ U+201E (8222) HTML 4.0 double low-9 quotation mark
dagger † U+2020 (8224) HTML 4.0 dagger
Dagger ‡ U+2021 (8225) HTML 4.0 double dagger
bull • U+2022 (8226) HTML 4.0 bullet
hellip … U+2026 (8230) HTML 4.0 horizontal ellipsis
permil ‰ U+2030 (8240) HTML 4.0 per mille sign
prime ′ U+2032 (8242) HTML 4.0 prime
Prime ″ U+2033 (8243) HTML 4.0 double prime
lsaquo ‹ U+2039 (8249) HTML 4.0 single left-pointing angle quotation mark
rsaquo › U+203A (8250) HTML 4.0 single right-pointing angle quotation mark
oline ‾ U+203E (8254) HTML 4.0 overline
frasl ⁄ U+2044 (8260) HTML 4.0 fraction slash
euro € U+20AC (8364) HTML 4.0 euro sign
image ℑ U+2111 (8465) HTML 4.0 black-letter capital i
weierp ℘ U+2118 (8472) HTML 4.0 script capital p (Weierstrass p)
real ℜ U+211C (8476) HTML 4.0 black-letter capital r
trade ™ U+2122 (8482) HTML 4.0 trademark sign
alefsym ℵ U+2135 (8501) HTML 4.0 alef symbol
larr ← U+2190 (8592) HTML 4.0 leftwards arrow
uarr ↑ U+2191 (8593) HTML 4.0 upwards arrow
rarr → U+2192 (8594) HTML 4.0 rightwards arrow
darr ↓ U+2193 (8595) HTML 4.0 downwards arrow
harr ↔ U+2194 (8596) HTML 4.0 left right arrow
crarr ↵ U+21B5 (8629) HTML 4.0 downwards arrow with corner leftwards
lArr ⇐ U+21D0 (8656) HTML 4.0 leftwards double arrow
uArr ⇑ U+21D1 (8657) HTML 4.0 upwards double arrow
rArr ⇒ U+21D2 (8658) HTML 4.0 rightwards double arrow
dArr ⇓ U+21D3 (8659) HTML 4.0 downwards double arrow
hArr ⇔ U+21D4 (8660) HTML 4.0 left right double arrow
forall ∀ U+2200 (8704) HTML 4.0 for all
part ∂ U+2202 (8706) HTML 4.0 partial differential
exist ∃ U+2203 (8707) HTML 4.0 there exists
empty ∅ U+2205 (8709) HTML 4.0 empty set
nabla ∇ U+2207 (8711) HTML 4.0 nabla
isin ∈ U+2208 (8712) HTML 4.0 element of
notin ∉ U+2209 (8713) HTML 4.0 not an element of
ni ∋ U+220B (8715) HTML 4.0 contains as member
prod ∏ U+220F (8719) HTML 4.0 n-ary product
sum ∑ U+2211 (8721) HTML 4.0 n-ary summation
minus − U+2212 (8722) HTML 4.0 minus sign
lowast ∗ U+2217 (8727) HTML 4.0 asterisk operator
radic √ U+221A (8730) HTML 4.0 square root
prop ∝ U+221D (8733) HTML 4.0 proportional to
infin ∞ U+221E (8734) HTML 4.0 infinity
ang ∠ U+2220 (8736) HTML 4.0 angle
and ∧ U+2227 (8743) HTML 4.0 logical and
or ∨ U+2228 (8744) HTML 4.0 logical or
cap ∩ U+2229 (8745) HTML 4.0 intersection
cup ∪ U+222A (8746) HTML 4.0 union
int ∫ U+222B (8747) HTML 4.0 integral
there4 ∴ U+2234 (8756) HTML 4.0 therefore
sim ∼ U+223C (8764) HTML 4.0 tilde operator
cong ≅ U+2245 (8773) HTML 4.0 congruent to
asymp ≈ U+2248 (8776) HTML 4.0 almost equal to
ne ≠ U+2260 (8800) HTML 4.0 not equal to
equiv ≡ U+2261 (8801) HTML 4.0 identical to (equivalent to)
le ≤ U+2264 (8804) HTML 4.0 less-than or equal to
ge ≥ U+2265 (8805) HTML 4.0 greater-than or equal to
sub ⊂ U+2282 (8834) HTML 4.0 subset of
sup ⊃ U+2283 (8835) HTML 4.0 superset of
nsub ⊄ U+2284 (8836) HTML 4.0 not a subset of
sube ⊆ U+2286 (8838) HTML 4.0 subset of or equal to
supe ⊇ U+2287 (8839) HTML 4.0 superset of or equal to
oplus ⊕ U+2295 (8853) HTML 4.0 circled plus
otimes ⊗ U+2297 (8855) HTML 4.0 circled times
perp ⊥ U+22A5 (8869) HTML 4.0 up tack (perpendicular sign in math)
sdot ⋅ U+22C5 (8901) HTML 4.0 dot operator
lceil ⌈ U+2308 (8968) HTML 4.0 left ceiling
rceil ⌉ U+2309 (8969) HTML 4.0 right ceiling
lfloor ⌊ U+230A (8970) HTML 4.0 left floor
rfloor ⌋ U+230B (8971) HTML 4.0 right floor
lang 〈 U+2329 (9001) HTML 4.0 left-pointing angle bracket
rang 〉 U+232A (9002) HTML 4.0 right-pointing angle bracket
loz ◊ U+25CA (9674) HTML 4.0 lozenge
spades ♠ U+2660 (9824) HTML 4.0 black spade suit
clubs ♣ U+2663 (9827) HTML 4.0 black club suit
hearts ♥ U+2665 (9829) HTML 4.0 black heart suit
diams ♦ U+2666 (9830) HTML 4.0 black diamond suit
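Rows in the style of the table above can be regenerated from Python's standard library, which may help when checking an individual entry. This is a sketch: the function name table_row is hypothetical, and the "Standard" column is omitted because html.entities does not record which HTML version introduced each entity.

```python
import unicodedata
from html.entities import name2codepoint  # HTML 4 entity name -> code point

def table_row(name):
    """Format one entity as: name, character, U+hex (decimal), Unicode name.

    Hypothetical helper for illustration; raises KeyError for names that
    are not HTML 4 entities (for example "apos").
    """
    code_point = name2codepoint[name]
    character = chr(code_point)
    description = unicodedata.name(character).lower()
    return f"{name} {character} U+{code_point:04X} ({code_point}) {description}"

for entity_name in ("eacute", "euro", "diams"):
    print(table_row(entity_name))
```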

^  A blue background has been used in order to display each space's width.


Entities representing special characters in XHTML

The XHTML DTDs explicitly declare 252 entities whose expansion is a single character, which can therefore be informally referred to as "character entities". These have the same names and represent the same characters as the 252 character entities in HTML. Also, by virtue of being XML, XHTML documents may reference the predefined &apos; entity, which is not one of the 252 character entities in HTML. Additional entities of any size may be defined on a per-document basis. However, the usability of entity references in XHTML is affected by how the document is being processed:

  • If the document is read by a conforming HTML processor, then only the 252 HTML character entities can safely be used. The use of &apos; or custom entity references may not be supported and may produce unpredictable results.
  • If the document is read by an XML parser that does not or cannot read external entities, then only the five built-in XML character entities can safely be used, although other entities may be used if they are declared in the internal DTD subset.
  • If the document is read by an XML parser that does read external entities, then the five built-in XML character entities can safely be used. The other 248 HTML character entities can be used as long as the XHTML DTD is accessible to the parser at the time the document is read. Other entities may also be used if they are declared in the internal DTD subset.

Because of the special &apos; case mentioned above, only &quot;, &amp;, &lt;, and &gt; will work in all processing situations.
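The second processing situation above (an XML parser that does not read external entities) can be observed with Python's xml.etree.ElementTree, which by default does not fetch an external DTD. The helper name try_parse and the sample documents are made up for this sketch; other parsers may report the failure differently.

```python
import xml.etree.ElementTree as ET

def try_parse(document):
    """Return the root element's text, or an error string if parsing fails."""
    try:
        return ET.fromstring(document).text
    except ET.ParseError as exc:
        return f"error: {exc}"

# The built-in entities work without any DTD.
builtin = try_parse("<p>&quot;&amp;&lt;&gt;</p>")

# An HTML entity such as nbsp is undefined when no DTD has been read.
undeclared = try_parse("<p>a&nbsp;b</p>")

# Declaring the entity in the internal DTD subset makes it usable.
declared = try_parse('<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>')

print(repr(builtin))
print(undeclared)
print(repr(declared))
```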


The Wikipedia article included on this page is licensed under the GFDL.