How to read xml file with DOCTYPE

 
Ranch Hand
Hey Folks,

I want to read an XML file (and insert the data into a database); see below.




To read this file I have this code:


After running this code I get this result:
run:
C:\tmp\XML\13thstreet.de_2017-12-19.xml
XML Reader C:\tmp\XML\xmltv.dtd (Das System kann die angegebene Datei nicht finden) <- This means: the system cannot find the specified file
BUILD SUCCESSFUL (total time: 0 seconds)

What can I do to read this file, or to delete the line with the DOCTYPE?

Cheers

Chris
 
Sheriff

Chris Ernst wrote: What can I do to read this file, or to delete the line with the DOCTYPE?



You could use a text editor to remove the !DOCTYPE line from your XML document. On the other hand there is quite likely a reason for that line being there, so maybe it would be preferable to find that DTD file and put a copy of it in the same directory as the XML document.
 
Chris Ernst
Ranch Hand
Hey Paul,

Yes, that would be the easiest way, but (I forgot to tell you) I download the files from a server: 7 files for each TV broadcaster...
 
Bartender
The DOCTYPE declaration is used for XML validation.

Most XML parsers I've worked with are validating by default and will get very annoyed if they cannot validate the XML.

There are several solutions, depending on what parser is being used.

1. Rip out the DOCTYPE by brute force prior to parsing it
2. Switch off validation (if your XML parser supports it)
3. Put stuff in your parsing code that dummies up the effect of the DOCTYPE.
4. Alter the DOCTYPE to be more complete (reference the official doctype's URL). A good system will cache this, a not-so-good one will fetch the DTD every time, experience latency delays and fail if the URL goes offline.


And 5. Download the DOCTYPE and put it someplace that the XML parser can find it. I believe the file you want is http://xmltv.cvs.sourceforge.net/viewvc/xmltv/xmltv/xmltv.dtd
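
Options 2 and 3 from the list above could be sketched like this with the JDK's built-in DOM parser. This is only a sketch under assumptions: the class and method names are made up for illustration, and the `load-external-dtd` feature URI is specific to Xerces-based parsers (which the JDK's default one is), so a different JAXP implementation may reject it. Note that with the JDK parser, validation is already off by default; the file-not-found error comes from the parser still trying to load the external DTD.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not taken from the original poster's code.
final class DtdSkippingParser {

    static Document parseIgnoringDtd(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

        // Option 2: no validation, and do not even try to load the external DTD.
        factory.setValidating(false);
        // Xerces-specific feature; assumed to be supported by the parser in use.
        factory.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

        DocumentBuilder builder = factory.newDocumentBuilder();

        // Option 3: if the parser asks for an external entity anyway,
        // hand it an empty document instead of a missing file.
        builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader("")));

        return builder.parse(new ByteArrayInputStream(xml));
    }
}
```

Either of the two measures is usually enough on its own; using both just makes the sketch less dependent on one parser's behaviour. Keep in mind that skipping the DTD also skips any default attribute values or entities it defines.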

Additional note 1: the Java XML Document facility allows plugging in an XML parser (it doesn't do the actual parsing itself). The default one is SAX.

Additional note 2: I recommend using OS-neutral syntax when coding filename paths in Java. That is: new File("C:/tmp/XML/13thstreet.de_2017-12-19.xml") as an example.

Not only does employing practices like this make your code more portable (I used to develop on a Windows desktop for deployment to Solaris production servers), note that you don't need to escape backslashes, thus avoiding the annoyances that getting the wrong number of backslashes can cause.
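
As a small illustration of the forward-slash style, and of where a downloaded copy of the DTD would have to go for option 5, here is a sketch. The class and method names are hypothetical; the only thing taken from the thread is that the parser looked for `xmltv.dtd` in the same directory as the XML document.

```java
import java.io.File;

// Hypothetical helper: works out where the parser expects the DTD,
// namely next to the XML document itself.
final class PortablePaths {

    static File siblingDtd(File xmlFile) {
        // File(parent, child) joins the two parts with the platform's
        // separator, so no backslashes need to be written or escaped.
        return new File(xmlFile.getParentFile(), "xmltv.dtd");
    }
}
```

Usage would be something like `PortablePaths.siblingDtd(new File("C:/tmp/XML/13thstreet.de_2017-12-19.xml"))`, which names the location a local copy of the DTD should be saved to.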
 
Chris Ernst
Ranch Hand
I went the easy way and removed this line.

For those who want to know how to do it,

see StackOverflow.

Tim and Paul, thanks!
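
The linked answer is not reproduced here, so the following is not necessarily what it does. It is just one minimal way to drop the DOCTYPE line from the text before parsing, assuming the declaration has no internal subset (no `[...]` part), which is the case for a plain `<!DOCTYPE tv SYSTEM "xmltv.dtd">`. The class name is made up.

```java
import java.util.regex.Pattern;

// Hypothetical helper: brute-force removal of a simple DOCTYPE declaration.
final class DoctypeStripper {

    // Matches "<!DOCTYPE ... >" up to the first '>'.
    // Assumption: no internal subset, which would contain '>' characters itself.
    private static final Pattern DOCTYPE = Pattern.compile("<!DOCTYPE[^>]*>");

    static String stripDoctype(String xml) {
        // Only the first match is removed; a document has at most one DOCTYPE.
        return DOCTYPE.matcher(xml).replaceFirst("");
    }
}
```

The cleaned string can then be handed to the parser, for example through an `InputSource` wrapping a `StringReader`.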
 
Greenhorn

Chris Ernst wrote: I went the easy way and removed this line.

For those who want to know how to do it,

see StackOverflow.

Tim and Paul, thanks!



Thanks for the link Tim. It's explained pretty well there.
 
Paul Clapham
Sheriff

Chris Ernst wrote: Yes, that would be the easiest way, but (I forgot to tell you) I download the files from a server: 7 files for each TV broadcaster...



Perhaps you could parse the documents directly from the URL on the server; it's possible that the DTD would then be in the correct location relative to the document's URL and then the parser would be happy. But it's also possible that the DTD wouldn't be there.
 
Chris Ernst
Ranch Hand
It is a *.gz file; see here.
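
Since the download is gzip-compressed, it has to be decompressed before the parser can read it. A minimal sketch using only the JDK follows; the class and method names are made up, and the stream passed in could come from a file or from a URL connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: turns a gzip-compressed stream into the raw XML bytes.
final class GzipXml {

    static byte[] gunzip(InputStream compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(compressed);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            // Copy until the decompressed stream is exhausted.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

For large files it would be better to pass the `GZIPInputStream` straight to the parser instead of buffering everything in memory; the byte array is used here only to keep the sketch short.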
 
Paul Clapham
Sheriff
So why aren't those people including the DTD file in that archive?
 
Chris Ernst
Ranch Hand
I don't know... The JSON file is *.js.gz and so badly nested that I chose the XML file...
 
Sheriff
Without proper validation you're vulnerable to XXE attacks, so it's always important to validate/control what is read by the XML parser.
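
One common way to control what the parser reads is to switch off external entities altogether. The sketch below shows that idea for the JDK's DOM parser; it is an illustration under assumptions, not a complete hardening guide. The class name is made up, and the Xerces-specific feature URI may not be accepted by other JAXP implementations.

```java
import java.io.ByteArrayInputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: a DOM parser that refuses to pull in external content.
final class HardenedParser {

    static Document parse(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

        // General JAXP switch that enables the implementation's safety limits.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

        // Do not resolve external general or parameter entities.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);

        // Xerces-specific: do not load the external DTD either.
        factory.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);

        return factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
    }
}
```

With these settings a document that declares an entity pointing at a local file or a remote URL is still parsed, but the entity's content is not read.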
 
Tim Holloway
Bartender

Paul Clapham wrote:So why aren't those people including the DTD file in that archive?



Because it's an industry standard DTD and thus should be fetched from an authoritative source.

Where the XML file fails is that it lacks a URI that would help the XML software download the DTD - it assumes you've already installed it locally.
 