Why is left angle-bracket not allowed in an XML attribute value?

 
Ranch Hand
Posts: 1056
Why does XML not allow the "<" character in an attribute value? Since all attribute values are in quotes, the "<" character can't create any parsing ambiguity.
And if the "<" character is not allowed, why is the ">" character allowed? This is a well-formed XML file:

but this is not:
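One way to check both cases is to hand them to the JDK's built-in parser. This is only a minimal sketch: the sample documents (an `item` element with a `note` attribute) and the helper name `isWellFormed` are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class WellFormedCheck {
    // Hypothetical helper: true if the JDK parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser reports well-formedness violations as SAXException.
            return false;
        } catch (Exception e) {
            // Configuration or I/O problems are not what we are testing here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A literal ">" inside an attribute value is accepted.
        System.out.println(isWellFormed("<item note=\"a>b\"/>"));
        // A literal "<" inside an attribute value is rejected.
        System.out.println(isWellFormed("<item note=\"a<b\"/>"));
    }
}
```

The default error handler may also print the parser's complaint about the second document to standard error.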
 
Ranch Hand
Posts: 1183
I think the reasons are consistency and simplicity. As the specification says:

The ampersand character (&) and the left angle bracket (<) may appear in their literal form only when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they must be escaped ...
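In practice the escaping looks something like the sketch below. The helper name `escapeAttribute` is hypothetical. It also escapes the double quote, which the passage above does not mention but which is needed if the value will be wrapped in double quotes. Note that `>` is passed through untouched.

```java
class AttrEscape {
    // Hypothetical helper: escape a string for use inside a double-quoted attribute value.
    static String escapeAttribute(String value) {
        StringBuilder out = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;   // must always be escaped
                case '<': out.append("&lt;"); break;    // must always be escaped
                case '"': out.append("&quot;"); break;  // would otherwise end the value
                default:  out.append(c);                // includes '>', which is legal as-is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: <tag attr="a&lt;b &amp; c>d"/>
        System.out.println("<tag attr=\"" + escapeAttribute("a<b & c>d") + "\"/>");
    }
}
```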

 
Ranch Hand
Posts: 104
As Dan says, I think simplicity is a big factor. I have occasionally worked with large XML documents that we parsed ourselves, treating them as a String, for example when we just wanted to find the content of the first element after a certain tag. In that situation you don't necessarily keep track of every character; you just look for a < to delimit the next element. So even though it's easy to see, looking at the XML, that a < within quotes is unambiguous, I suspect that many parsers do not take account of those quotes when dealing with the < character. Some people prefer to escape > as well, although there's no reason to do so apart from consistency.

Just looking for the < character to find a new tag may seem like very bad practice, but sometimes we found it necessary for performance reasons, or to identify specific information before we parsed the XML properly, such as the company or the type of XML document we were dealing with.
Hope this helps,
Kathy
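The kind of string scan described above might look roughly like this. It is a sketch only: the class, the method name, and the sample document are invented for illustration. It relies on the next `<` being the start of markup, so it would be fooled by comments, CDATA sections, or a closing tag coming first.

```java
class NaiveScan {
    // Hypothetical helper: name of the first element that starts after markerTag,
    // found by plain string searching rather than real parsing. Returns null if none.
    static String firstElementNameAfter(String xml, String markerTag) {
        int from = xml.indexOf(markerTag);
        if (from < 0) {
            return null;
        }
        // No literal '<' is allowed in attribute values or character data,
        // so the next '<' is assumed to open the next element.
        int lt = xml.indexOf('<', from + markerTag.length());
        if (lt < 0) {
            return null;
        }
        int end = lt + 1;
        while (end < xml.length()
                && (Character.isLetterOrDigit(xml.charAt(end))
                    || "_-.:".indexOf(xml.charAt(end)) >= 0)) {
            end++;
        }
        return xml.substring(lt + 1, end);
    }

    public static void main(String[] args) {
        String xml = "<order><header/><company id=\"7\">Acme</company></order>";
        // prints: company
        System.out.println(firstElementNameAfter(xml, "<header/>"));
    }
}
```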
 
Ron Newman
Ranch Hand
Posts: 1056
OK, but if that's the case, why is > allowed in an attribute? Seems to me that a naive parser would look at
<tag attr="a>b">
and see just
<tag attr="a>
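To make that concrete, here is a small sketch of the two approaches. Both method names are hypothetical. The naive version stops at the first `>`, while the second one tracks whether it is inside a quoted attribute value.

```java
class TagEnd {
    // Naive: the start tag ends at the first '>' after 'start'.
    static String naiveStartTag(String xml, int start) {
        int gt = xml.indexOf('>', start);
        return gt < 0 ? null : xml.substring(start, gt + 1);
    }

    // Quote-aware: ignore '>' while inside a single- or double-quoted value.
    static String quoteAwareStartTag(String xml, int start) {
        char quote = 0; // 0 means "not inside a quoted value"
        for (int i = start; i < xml.length(); i++) {
            char c = xml.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0; // closing quote
                }
            } else if (c == '"' || c == '\'') {
                quote = c;     // opening quote
            } else if (c == '>') {
                return xml.substring(start, i + 1);
            }
        }
        return null; // no unquoted '>' found
    }

    public static void main(String[] args) {
        String xml = "<tag attr=\"a>b\">";
        System.out.println(naiveStartTag(xml, 0));      // prints: <tag attr="a>
        System.out.println(quoteAwareStartTag(xml, 0)); // prints: <tag attr="a>b">
    }
}
```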
 