
entity was referenced but not declared

 
alexandre saviano
Greenhorn
Posts: 2
Greetings to all !

I am new to this forum and would be very grateful if anyone could help me solve a problem with HTML files that I have to translate. I have been searching many forums, for many hours over many days, but I still haven't found a solution.
This is about entities which are "referenced but not declared". I know the question has been asked many times and I understand what it is about, for example replacing &eacute; with &#233;, but I am sure there is another way around it. Since I have to translate more than a hundred files, all containing French entities (&eacute;, &egrave;, &agrave;...), I cannot afford to search and replace every entity in every file; it would take me days to do that...

The files are encoded in UTF-8, and here are the lines I have been trying to add:

<!ENTITY eacute "&#233;">
<!ENTITY egrave "&#232;">

But when I add these, I get the following error: "The content of elements must consist of well-formed character data or markup"

And if I add a DOCTYPE before those lines, I get an error saying that DOCTYPE is not allowed in this document...

I would like to add or create a list of all those entities so I can validate my XML files without any errors. Please help me out.

Thank you very much
 
Paul Clapham
Sheriff
Posts: 21322
Could you show us a small example of one of these documents? Right now your post isn't entirely clear to me -- these documents are HTML documents and not XML documents, am I right?
 
alexandre saviano
Greenhorn
Posts: 2
Hello,

Thank you for your reply.
Yes, the documents are HTML documents. I have tried to attach one, but apparently there's no way to attach a file with a .htm extension...
Another way is to declare the entities in an external DTD (or an internal one, which is what I have been trying to do...), but I still cannot get it to work...
Can a DTD (the list of entities to declare) be a .txt file?




 
Paul Clapham
Sheriff
Posts: 21322
But it looks like you're trying to use XML software to do the translation? Actually I don't see where you described what you were doing at all.

Anyway, what I would suggest is to use an HTML parser, one which can read an HTML document into a DOM structure (preferably an org.w3c.dom.Document). Then serialize that DOM into XML.
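The DOM-then-serialize idea can be sketched with the JDK alone, under one big assumption: the input is already well-formed XHTML. Real-world HTML usually is not, and then a separate HTML parser is needed for the first step. The sketch also shows where entity declarations are legal in XML: inside the internal subset of a DOCTYPE that comes before the root element, not inside the document content. The class name DomRoundTrip and the SAMPLE string are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not taken from the thread.
final class DomRoundTrip {

    // Assumed sample input: well-formed XHTML whose DOCTYPE declares the
    // entities it uses in an internal subset, ahead of the root element.
    static final String SAMPLE =
            "<!DOCTYPE html [\n"
            + "  <!ENTITY eacute \"&#233;\">\n"
            + "  <!ENTITY egrave \"&#232;\">\n"
            + "]>\n"
            + "<html><body><p>caf&eacute; cr&egrave;me</p></body></html>";

    // Parse the markup into an org.w3c.dom.Document; the parser expands the
    // declared entity references into the characters they stand for.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Serialize the DOM back to XML text with an identity transform.
    static String serialize(Document doc) throws Exception {
        Transformer transformer =
                TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(parse(SAMPLE)));
    }
}
```

Because the entity references are expanded at parse time, the serialized output no longer needs the declarations. Only run a DOCTYPE-processing parser like this on files you trust.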

I'm not sure why you must convert the HTML entities to XML character entities -- why can't you just convert them to the characters themselves? In other words instead of converting "&eacute;" to "&#233;" why not just convert it to "é"?
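Here is a rough sketch of that direct replacement, in case it helps. It assumes each file's text has already been read into a String (the files are UTF-8, so the characters can be written back as they are). The class name HtmlEntityReplacer and its three-entry table are made up for illustration, and the table would have to be extended with every entity the files actually use.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from the thread.
final class HtmlEntityReplacer {

    // Deliberately partial table (an assumption): entity name -> character.
    private static final Map<String, String> ENTITIES = Map.of(
            "eacute", "\u00E9",   // e with acute accent
            "egrave", "\u00E8",   // e with grave accent
            "agrave", "\u00E0");  // a with grave accent

    // Matches named references such as &eacute; and captures the name.
    private static final Pattern NAMED_ENTITY =
            Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    static String replaceEntities(String html) {
        Matcher matcher = NAMED_ENTITY.matcher(html);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String replacement = ENTITIES.get(matcher.group(1));
            // Names missing from the table (including amp, lt and gt, which
            // XML predefines) are copied through unchanged.
            String text = replacement != null ? replacement : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceEntities("<p>caf&eacute; &amp; cr&egrave;me</p>"));
    }
}
```

Leaving unknown names untouched means references like &amp; stay valid XML, and anything still missing from the table is easy to spot afterwards by searching the output for "&".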
 