Hello there,
I have a real problem. In the definition of the format of the data file in the URLyBird notes it says this:
"All numeric values are stored in the header information use the formats of the DataInputStream and DataOutputStream classes. All text values, and all fields (which are text only), contain only 8 bit characters, null terminated if less than the maximum length for the field. The character encoding is 8 bit US ASCII."
The poor English in the first sentence is not a typo on my part; it's an exact quote. I assume what they mean is:
"All numeric values that are stored in the header information use the formats of the DataInputStream and DataOutputStream classes."
Or even:
"All numeric values stored in the header information use the formats of the DataInputStream and DataOutputStream classes."
Either sentence would make sense. Although it's not a big deal, it makes me doubt the accuracy of what they say elsewhere in the paragraph (indeed, in the whole document), especially where it says:
"The character encoding is 8 bit US ASCII."
I was going to use the String constructor that takes a byte array and a charset name to convert an array of bytes into the correct character string. However, the documentation for that constructor referred me to the Charset class for a list of allowed charsets. The only US ASCII one was:
"US-ASCII - Seven-bit ASCII, a.k.a. ISO646-US, a.k.a. the Basic Latin block of the Unicode character set."
That's seven-bit US-ASCII, not EIGHT-bit US-ASCII. I assume that if I use this charset to decode the bytes I'm going to end up with the wrong characters. Am I missing something here?
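
To make the question concrete, here is a small experiment rather than a definitive answer. It decodes the same null-terminated field with US-ASCII and with ISO-8859-1 (an eight-bit charset whose first 128 characters match ASCII). The class name FieldDecoding, the helper decodeField and the sample bytes are all made up for illustration, and nothing here is taken from the URLyBird data file itself. It also uses StandardCharsets, which needs Java 7 or later; on older versions Charset.forName("US-ASCII") should do the same job.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class FieldDecoding {

    /**
     * Decodes a fixed-width field, stopping at the first null byte
     * if the value is shorter than the field. (Hypothetical helper.)
     */
    static String decodeField(byte[] raw, Charset charset) {
        int length = 0;
        while (length < raw.length && raw[length] != 0) {
            length++;
        }
        return new String(raw, 0, length, charset);
    }

    public static void main(String[] args) {
        // Every byte is below 0x80, so each one is a valid seven-bit ASCII character.
        byte[] plain = {'P', 'a', 'l', 'a', 'c', 'e', 0, 0};
        // 0xE9 has the high bit set, so it is NOT a valid US-ASCII byte.
        byte[] high = {'C', 'a', 'f', (byte) 0xE9, 0, 0};

        String plainAscii = decodeField(plain, StandardCharsets.US_ASCII);
        String plainLatin = decodeField(plain, StandardCharsets.ISO_8859_1);
        String highAscii = decodeField(high, StandardCharsets.US_ASCII);
        String highLatin = decodeField(high, StandardCharsets.ISO_8859_1);

        // Both charsets agree when every byte is in the range 0 to 127.
        System.out.println(plainAscii.equals(plainLatin));
        // US-ASCII turns the unmappable byte into the replacement character U+FFFD.
        System.out.println(highAscii.charAt(3) == '\uFFFD');
        // ISO-8859-1 maps each byte to the code point with the same value.
        System.out.println(highLatin.charAt(3) == '\u00E9');
    }
}
```

If that is right, the two charsets only differ for bytes above 127, so US-ASCII would give the wrong characters only if the file actually contains such bytes. Whether it does is something I can't tell from the instructions.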