
BufferedReader and the ] character

 
Tom Griffith
Ranch Hand
Posts: 275
Hello. I am parsing an XML file, and when I read the following line (via in.readLine()):

<tag>value]</tag>

the line breaks at the ], so the String representation for the line is <tag>value]

then the next line becomes the closing tag:

</tag>

I also noticed this does not happen in every instance where a ] appears in the value. I tested an XML file where the value contained ] in five separate records, and when parsing the file with BufferedReader, the above phenomenon occurred twice, while the line was read completely (<tag>value]</tag>) three times.

This has to be built with JVM 1.3; otherwise, I could use the DOM in 1.4 and above to do this. Thank you very much for reading this and for any input or ideas as to why this occurs.
[ January 17, 2007: Message edited by: Tom Griffith ]
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
BufferedReader simply doesn't do this; readLine() breaks only at line-ending character sequences. Are you sure the data doesn't include extra embedded ^M or ^J characters (carriage return or line feed)? Can you open the file in a hex editor to check?
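
To illustrate the point about line endings, here is a minimal sketch. It assumes, without any confirmation from the actual data, that a stray carriage return follows the ]. The class name ReadLineDemo and the helper readLinesJoined are hypothetical names made up for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class ReadLineDemo {
    // Reads the text with BufferedReader.readLine() and joins the
    // resulting lines with " | " so the split points are visible.
    static String readLinesJoined(String text) {
        BufferedReader in = new BufferedReader(new StringReader(text));
        StringBuffer out = new StringBuffer();
        boolean first = true;
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (!first) {
                    out.append(" | ");
                }
                out.append(line);
                first = false;
            }
        } catch (IOException e) {
            // Not expected when reading from an in-memory string.
            throw new RuntimeException(e.toString());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed input: a lone carriage return hiding after the ']'.
        System.out.println(readLinesJoined("<page>72 FR 67]\r</page>\n"));
        // Clean input: only the trailing line feed.
        System.out.println(readLinesJoined("<page>72 FR 141]</page>\n"));
    }
}
```

readLine() treats a line feed, a carriage return, or a carriage return followed immediately by a line feed as a line terminator, so the first call reports two lines and the second reports one. The ] itself plays no part in the split.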
 
Tom Griffith
Ranch Hand
Posts: 275
I thought it was really strange too. The XML is published on the web. Here are some actual tags that I copied and pasted from the source.

These are problematic; the line cuts off after the ]:

<page>72 FR 67]</page>
<page>72 FR 98]</page>

These lines are read completely:

<page>72 FR 141]</page>
<page>72 FR 143]</page>
<page>72 FR 177]</page>

I'm not sure what a hex editor is, but I can find out. I do see a trend, though: the BufferedReader appears to cut off every entry with a two-digit number before the ] (<page>72 FR 67]), while entries with a three-digit number are read completely as one line (<page>72 FR 143]</page>). Really weird. I'm going to copy and paste all of these into a local text file and see what a BufferedReader does with it. That will probably help to determine whether anything funky is lurking in the published XML.
[ January 17, 2007: Message edited by: Tom Griffith ]
 
Tom Griffith
Ranch Hand
Posts: 275
Hi. I copied and pasted the entire published XML into a local text file, and BufferedReader had no problems reading every line. It never cut off after the ] on any entry, pretty much as you said. This is weird. I guess I'm going to find out what a hex editor is and use one to look at the XML on the web.
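
If a hex editor is not at hand, the same check can be done in code. The sketch below is only an illustration: the class name ControlCharDump and the helper escapeControlBytes are hypothetical, and the sample bytes are invented rather than taken from the published file. It makes carriage returns, line feeds, and other non-printable bytes visible so they can be spotted next to the ].

```java
class ControlCharDump {
    private static final String HEX = "0123456789ABCDEF";

    // Returns the bytes as text, with CR shown as \r, LF as \n, and any
    // other non-printable or non-ASCII byte shown as \xNN.
    static String escapeControlBytes(byte[] data) {
        StringBuffer out = new StringBuffer();
        for (int i = 0; i < data.length; i++) {
            int b = data[i] & 0xFF; // treat the byte as unsigned
            if (b == '\r') {
                out.append("\\r");
            } else if (b == '\n') {
                out.append("\\n");
            } else if (b < 0x20 || b >= 0x7F) {
                out.append("\\x");
                out.append(HEX.charAt(b >> 4));
                out.append(HEX.charAt(b & 0x0F));
            } else {
                out.append((char) b);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Invented sample: a carriage return (0x0D) right after the ']'.
        byte[] sample = { '6', '7', ']', 0x0D, '<', '/', 'p', '>', 0x0A };
        System.out.println(escapeControlBytes(sample));
    }
}
```

To be useful, the bytes passed in would have to come straight from the web source (for example, read from its InputStream into a byte array) rather than from a copied-and-pasted file, since copying and pasting may change or drop such characters.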
 