
Couldn't convert HTML to XML file due to Java I/O issue

 
Jack Bush
Ranch Hand
Posts: 235
Hi All,

I am having difficulty saving a complete ABC.html file (from a certain website, e.g. www.abc.com) in time for it to be converted to ABC.xml format (using a combination of Saxon and the TagSoup parser). The issue may be due to ABC.html being closed prematurely, before the XML conversion tools (light_html2xml or Saxon with TagSoup) have finished reading it.



Note that ABC.html is created successfully. The question is: at what stage has it been completely written, and at what point is the conversion tool reading it?

There is no problem with either of the conversion tools; I have used them extensively, including in the former program. This issue has been posted to several different forums (http://forums.sun.com/thread.jspa?threadID=5343084, http://www.coderanch.com/t/129985/XML/Cannot-close-XML-file-used, http://www.stylusstudio.com/xmldev/200810/post40120.html, http://www.stylusstudio.com/xmldev/200810/post50120.html), and the symptoms indicate that it is an I/O issue rather than anything else.

This issue has plagued me for months, during which I thought the problem lay with the XML conversion tool (light_html2xml or Saxon with TagSoup).

I have exhausted every effort but still could not find a solution.

The above programs are running on JDK 1.6.0_06, NetBeans 6.1, JDOM 1.1, Saxon 6.5.5 and TagSoup 1.2 on Windows XP.

This question has also been posted on http://forums.sun.com/thread.jspa?threadID=5346755
Any assistance would be much appreciated.

Many thanks,

Jack
 
Ulf Dittmer
Rancher
Posts: 42970
You should flush and close any stream that's related to writing a file, before you start reading from that file.
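
To illustrate the point, here is a minimal sketch of the write, flush, close, then read order. It is not Jack's actual code: the class name SaveThenRead, the methods saveHtml and readBack, the temporary file and the sample HTML string are all made up for the example, and UTF-8 is an assumed encoding. It uses only java.io with try/finally so it should also compile on JDK 1.6; on Java 7 or later, try-with-resources would achieve the same thing.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;

// Hypothetical example class, not taken from the original program.
public class SaveThenRead {

    // Writes the HTML text to the file and returns only after the writer
    // has been flushed and closed, so nothing is left in a buffer.
    static void saveHtml(File target, String html) throws IOException {
        Writer out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(target), "UTF-8"));
        try {
            out.write(html);
            out.flush(); // push buffered characters down to the file
        } finally {
            out.close(); // release the file handle before anyone reads it
        }
    }

    // Stands in for the conversion tool: reads the whole file back as text.
    static String readBack(File source) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(source), "UTF-8"));
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("ABC", ".html"); // assumed location
        f.deleteOnExit();
        String html = "<html><body><p>hello</p></body></html>";

        saveHtml(f, html);            // step 1: write, flush, close
        String seen = readBack(f);    // step 2: only now start reading

        System.out.println("complete file read: " + seen.equals(html));
    }
}
```

The important part is that the reading step (here readBack, in your case the Saxon/TagSoup or light_html2xml call) is only started after saveHtml has returned, i.e. after close() has run on the writer.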
 
Jack Bush
Ranch Hand
Posts: 235
Hi Ulf,

You were spot on about this issue!

Reading ABC.html now works after I added the flush and close calls upstream, before the file is read.

Thank you very much,

Jack
 