
How to read xml file with DOCTYPE

 
Ranch Hand
Posts: 50
Java MySQL Database Netbeans IDE
Hey Folks,

I want to read an XML file (and insert the data into a database); see below.




To read this file I have this code:


After running this code I get this result:
run:
C:\tmp\XML\13thstreet.de_2017-12-19.xml
XML Reader C:\tmp\XML\xmltv.dtd (Das System kann die angegebene Datei nicht finden) <- This means: "The system cannot find the file specified"
BUILD SUCCESSFUL (total time: 0 seconds)

What can I do to read this file, or to delete the line with the DOCTYPE?

Cheers

Chris
 
Sheriff
Posts: 22968
43
Eclipse IDE Firefox Browser MySQL Database

Chris Ernst wrote: What can I do to read this file, or to delete the line with the DOCTYPE?



You could use a text editor to remove the !DOCTYPE line from your XML document. On the other hand there is quite likely a reason for that line being there, so maybe it would be preferable to find that DTD file and put a copy of it in the same directory as the XML document.
 
Chris Ernst
Ranch Hand
Posts: 50
Java MySQL Database Netbeans IDE
  • Likes 1
Hey Paul,

Yes, that would be the easiest way, but (I forgot to tell you) I download the files from a server, 7 files for each TV broadcaster...
 
Bartender
Posts: 18889
78
Android Eclipse IDE Linux
  • Likes 1
The DOCTYPE declaration is used for XML validation.

Most XML parsers I've worked with are validating by default and will get very annoyed if they cannot validate the XML.

There are several solutions, depending on what parser is being used.

1. Rip out the DOCTYPE by brute force prior to parsing it
2. Switch off validation (if your XML parser supports it)
3. Put stuff in your parsing code that dummies up the effect of the DOCTYPE.
4. Alter the DOCTYPE to be more complete (reference the official doctype's URL). A good system will cache this; a not-so-good one will fetch the DTD every time, suffer the latency, and fail if the URL goes offline.


And 5. Download the DOCTYPE and put it someplace that the XML parser can find it. I believe the file you want is http://xmltv.cvs.sourceforge.net/viewvc/xmltv/xmltv/xmltv.dtd
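
Options 2 and 3 can be combined in a few lines with the JDK's built-in JAXP parser. The following is only a sketch: the class name, the method name and the sample document are made up for illustration, and the DOCTYPE line is an assumption about what the XMLTV file contains. Note that JAXP's DocumentBuilderFactory is non-validating by default, yet the parser still tries to open the DTD named in the DOCTYPE, which is why switching off validation alone may not make the error go away.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class SkipDtdExample {

    /** Parses XML text without trying to open the DTD named in its DOCTYPE. */
    static Document parseIgnoringDtd(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Option 2: validation is already off by default in JAXP; stated here for clarity.
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Option 3: answer every external-entity lookup (including the DTD)
        // with an empty document instead of letting the parser open a file.
        builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader("")));
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample; the real file's DOCTYPE line may differ.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE tv SYSTEM \"xmltv.dtd\">\n"
                + "<tv><programme channel=\"demo\"/></tv>";
        Document doc = parseIgnoringDtd(xml);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

The catch with an empty resolver is that anything the DTD would have supplied (default attribute values, entity definitions) is silently missing, so it only suits documents that do not rely on those.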

Additional note 1: the Java XML Document facility allows plugging in an XML parser (it doesn't do the actual parsing itself). The default one bundled with the JDK is derived from Apache Xerces and is used through the SAX and DOM APIs.

Additional note 2: I recommend using OS-neutral syntax when coding filename paths in Java. That is: new File("C:/tmp/XML/13thstreet.de_2017-12-19.xml") as an example.

Not only does employing practices like this make your code more portable (I used to develop on a Windows desktop for deployment to Solaris production servers), but you also don't need to escape backslashes, thus avoiding the annoyances that getting the wrong number of backslashes can cause.
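
As a small illustration of that advice, the sketch below builds the same path once with forward slashes and once from its parts. The class and method names are hypothetical; only the file name comes from the thread.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

class PortablePathExample {

    /** Builds the path to one listing file below a base directory. */
    static Path listingFile(String baseDir, String fileName) {
        // Paths.get joins the parts with whatever separator the platform uses.
        return Paths.get(baseDir, fileName);
    }

    public static void main(String[] args) {
        // Forward slashes are accepted on Windows too, and need no escaping.
        File direct = new File("C:/tmp/XML/13thstreet.de_2017-12-19.xml");
        Path joined = listingFile("C:/tmp/XML", "13thstreet.de_2017-12-19.xml");
        System.out.println(direct.getName());
        System.out.println(joined.getFileName());
    }
}
```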
 
Chris Ernst
Ranch Hand
Posts: 50
Java MySQL Database Netbeans IDE
I went the easy way now: I removed this line.

For those who want to know how to do it,

see Stack Overflow.

Tim and Paul, thanks!
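
For readers without the link, here is one way the "remove the line" approach could look. It is a sketch and not necessarily what the linked Stack Overflow answer does: the class name, method names and the regular expression are assumptions, and the pattern deliberately handles only a simple DOCTYPE without an internal subset (no "[ ... ]" part).

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StripDoctypeExample {

    // Matches a DOCTYPE declaration that has no internal subset ("[ ... ]").
    private static final Pattern DOCTYPE = Pattern.compile("<!DOCTYPE[^>\\[]*>");

    /** Returns the XML text with its first simple DOCTYPE declaration removed. */
    static String stripDoctype(String xml) {
        return DOCTYPE.matcher(xml).replaceFirst("");
    }

    /** Strips the DOCTYPE and parses what is left. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(stripDoctype(xml))));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample; the real file's DOCTYPE line may differ.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<!DOCTYPE tv SYSTEM \"xmltv.dtd\">\n"
                + "<tv/>";
        System.out.println(parse(xml).getDocumentElement().getNodeName());
    }
}
```

To use it on a downloaded file, the text would first have to be read into a String with the file's actual encoding. Text surgery like this is fragile (a comment containing "<!DOCTYPE" before the real declaration would fool it), so a parser-level solution is usually the safer choice.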
 
Greenhorn
Posts: 6

Chris Ernst wrote: I went the easy way now: I removed this line.

For those who want to know how to do it,

see Stack Overflow.

Tim and Paul, thanks!



Thanks for the link Tim. It's explained pretty well there.
 
Paul Clapham
Sheriff
Posts: 22968
43
Eclipse IDE Firefox Browser MySQL Database

Chris Ernst wrote: Yes, that would be the easiest way, but (I forgot to tell you) I download the files from a server, 7 files for each TV broadcaster...



Perhaps you could parse the documents directly from the URL on the server; it's possible that the DTD would then be in the correct location relative to the document's URL and then the parser would be happy. But it's also possible that the DTD wouldn't be there.
 
Chris Ernst
Ranch Hand
Posts: 50
Java MySQL Database Netbeans IDE
It is a *.gz file; see here.
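
A gzipped download does not have to be unpacked to disk first. As a sketch (class and method names are hypothetical, and the empty entity resolver is just one of the ways discussed above to keep the parser from looking for xmltv.dtd), the compressed stream can be handed straight to the parser:

```java
import java.io.InputStream;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class GzipXmlExample {

    /** Parses a gzip-compressed XML file without unpacking it first. */
    static Document parseGzip(Path gzFile) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Assumption: the DTD is not available, so resolve it to an empty document.
        builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader("")));
        // GZIPInputStream decompresses on the fly while the parser reads.
        try (InputStream in = new GZIPInputStream(Files.newInputStream(gzFile))) {
            return builder.parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("usage: java GzipXmlExample <file.xml.gz>");
            return;
        }
        Document doc = parseGzip(Paths.get(args[0]));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```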
 
Paul Clapham
Sheriff
Posts: 22968
43
Eclipse IDE Firefox Browser MySQL Database
So why aren't those people including the DTD file in that archive?
 
Chris Ernst
Ranch Hand
Posts: 50
Java MySQL Database Netbeans IDE
I don't know... The JSON file is *.js.gz and so badly nested that I chose the XML file...
 
Sheriff
Posts: 21185
87
Chrome Eclipse IDE Java Windows
Without proper validation you're vulnerable to XXE attacks, so it's always important to validate/control what is read by the XML parser.
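
One common way to control what the parser reads is to switch off external entities and external DTD loading on the factory. The sketch below assumes the JDK's built-in (Xerces-derived) parser, which recognises these feature URIs; other implementations may reject them with a ParserConfigurationException. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XxeHardeningExample {

    /** Creates a factory that does not read external entities or external DTDs. */
    static DocumentBuilderFactory hardenedFactory() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Do not expand entities that point at files or URLs.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        // Do not fetch the DTD named in the DOCTYPE when not validating.
        factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        factory.setXIncludeAware(false);
        return factory;
    }

    static Document parse(String xml) throws Exception {
        return hardenedFactory().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample; the real file's DOCTYPE line may differ.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE tv SYSTEM \"xmltv.dtd\">\n"
                + "<tv/>";
        System.out.println(parse(xml).getDocumentElement().getNodeName());
    }
}
```

As a side effect, a factory configured this way also no longer looks for a missing xmltv.dtd, so it addresses the original error while narrowing the XXE exposure.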
 
Tim Holloway
Bartender
Posts: 18889
78
Android Eclipse IDE Linux

Paul Clapham wrote:So why aren't those people including the DTD file in that archive?



Because it's an industry standard DTD and thus should be fetched from an authoritative source.

Where the XML file fails is that it lacks a URI that would help the XML software download the DTD - it assumes you've already installed it locally.
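
If a copy of the DTD has been installed locally, the parser can be pointed at it no matter where the XML file itself lives. This is a sketch under assumptions: the class and method names are hypothetical, and matching on a system ID ending in "xmltv.dtd" is a guess based on the error message earlier in the thread.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class LocalDtdExample {

    /** Redirects any request for xmltv.dtd to a locally installed copy. */
    static EntityResolver localDtd(Path dtdFile) {
        return (publicId, systemId) -> {
            if (systemId != null && systemId.endsWith("xmltv.dtd")) {
                InputSource source = new InputSource(Files.newInputStream(dtdFile));
                source.setSystemId(dtdFile.toUri().toString());
                return source;
            }
            return null; // anything else: let the parser resolve it as usual
        };
    }

    static Document parse(Path xmlFile, Path dtdFile) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(localDtd(dtdFile));
        return builder.parse(xmlFile.toFile());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: java LocalDtdExample <file.xml> <xmltv.dtd>");
            return;
        }
        Document doc = parse(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Unlike an empty resolver, this keeps whatever the DTD contributes (default attribute values, entity definitions) while avoiding a network fetch on every parse.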
 