Handling entity references by XML parsers

 
Dan Drillich
Ranch Hand
Posts: 1183
Good Day,

My beloved book, O'Reilly's "XML in a Nutshell", says (on page 18) that XML defines five entity references:

&lt; - the less-than sign
&amp; - the ampersand
&gt; - the greater-than sign
&quot; - the straight, double quotation mark
&apos; - the apostrophe, or single quote

It says that entity references such as &amp; and &lt; are considered markup, and that when an application parses an XML document, the parser replaces this markup with the actual characters the entity references refer to. It also says that, in addition to these five predefined entity references, you can define others in the document type definition (DTD).
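As a quick check of what the book describes, here is a minimal sketch. It assumes Python's standard xml.etree.ElementTree as the parser (my choice for illustration, not something the book uses), and the entity name "book" is made up for the example.

```python
import xml.etree.ElementTree as ET

# The five predefined entity references need no declaration;
# the parser replaces each one with the character it stands for.
predefined = ET.fromstring("<a>&lt; &amp; &gt; &quot; &apos;</a>").text
print(predefined)  # < & > " '

# Any other entity has to be declared first, for example in the
# internal DTD subset. "book" is a hypothetical entity name.
document = (
    '<!DOCTYPE a [<!ENTITY book "XML in a Nutshell">]>'
    "<a>&book;</a>"
)
custom = ET.fromstring(document).text
print(custom)  # XML in a Nutshell
```

In both cases the application sees only the replacement text, never the reference itself.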

So my question is: does this mean that parsers leave all other entity references in the XML document intact?

Regards,
Dan
 
Paul Clapham
Sheriff
Posts: 21416
No. If a parser encounters a reference to an undeclared entity, it will throw an exception.
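A minimal sketch of that behaviour, assuming Python's standard xml.etree.ElementTree as the parser; other parsers report the error in their own way, and the helper name try_parse is made up for this example.

```python
import xml.etree.ElementTree as ET


def try_parse(xml_text):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return ET.fromstring(xml_text).text, None
    except ET.ParseError as err:
        # The parser refuses the document instead of leaving the
        # unknown reference in place.
        return None, str(err)


# &amp; is predefined, so this parses and the reference is replaced.
ok_text, ok_error = try_parse("<p>fish &amp; chips</p>")
print(ok_text)  # fish & chips

# &eacute; is never declared here, so parsing fails.
bad_text, bad_error = try_parse("<p>caf&eacute;</p>")
print(bad_error)
```

The second document yields no text at all: the undeclared reference is treated as an error, not passed through.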
 
Dan Drillich
Ranch Hand
Posts: 1183
Thank you Paul.

Right, but what about all the "standard" HTML escape sequences, such as é (&eacute;), ö (&ouml;), ò (&ograve;), ñ (&ntilde;), etc.?

Regards,
Dan
 
Dan Drillich
Ranch Hand
Posts: 1183
Paul,

I guess you are absolutely right! I put one of these entities in an otherwise well-formed XML file and tried to open it with Firefox and IE. Neither would open it. Firefox even said:

XML Parsing Error: undefined entity


Regards,
Dan
 
Paul Clapham
Sheriff
Posts: 21416
Yup. HTML is not an XML dialect. (XHTML is, though: you will notice that an XHTML document has a DTD reference at the top, and that DTD is what declares entities such as &eacute;.)
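If you want an HTML-style name such as &eacute; in your own XML, two options are sketched here, again assuming Python's standard library as the parser. Either declare the entity yourself in the internal DTD subset (the XHTML DTD makes the same kind of declaration for you), or use a numeric character reference, which needs no declaration. The element name "note" and the helper to_numeric_reference are made up for the example.

```python
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Option 1: declare the entity in the internal DTD subset.
doc_with_dtd = (
    '<!DOCTYPE note [<!ENTITY eacute "&#233;">]>'
    "<note>caf&eacute;</note>"
)
text_from_dtd = ET.fromstring(doc_with_dtd).text


# Option 2: swap the HTML name for a numeric character reference.
def to_numeric_reference(name):
    """Turn an HTML entity name like 'eacute' into '&#233;'."""
    return "&#{};".format(name2codepoint[name])


doc_numeric = "<note>caf{}</note>".format(to_numeric_reference("eacute"))
text_from_numeric = ET.fromstring(doc_numeric).text

print(text_from_dtd)      # café
print(text_from_numeric)  # café
```

The numeric form is the more portable of the two, since it does not depend on any DTD being present or read.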
 