I have the following problem I'm trying to solve: I have a .txt file of data that needs to be loaded into a MySQL database. The way the data is structured, some "fields" are blank and others are filled with data. I figured a RandomAccessFile would be ideal for this situation. I have never used this class before, so I'm experimenting.
However, the following code is not giving what I expected. I think I understand why, but I don't know how to fix it. I think my characters are "rolling over" past 128 or something like that...
RandomAccessFile wants binary data, not text data. That means readChar() expects 2 bytes, the size of a Java char. But in an ASCII txt file, each char is just one byte. The ? is a placeholder for the two-byte Unicode character 0x646B that RandomAccessFile thinks it sees in your data. You really want to read the two separate bytes 0x64 (d) and 0x6B (k).
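To make the two-byte point concrete, here is a minimal sketch. It assumes a temporary file containing just the ASCII text "dk"; the class and method names are made up for illustration. readChar() combines the two bytes into one char, while reading the bytes and decoding them as ASCII gives back the two letters.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ReadCharDemo {
    // Reads the first two bytes of the file as ONE Java char (high byte first).
    static char firstCharViaReadChar(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            return raf.readChar();
        }
    }

    // Reads the first two bytes separately and decodes them as ASCII text.
    static String firstTwoViaBytes(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            byte[] buf = new byte[2];
            raf.readFully(buf);
            return new String(buf, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample file: two ASCII bytes, 0x64 ('d') and 0x6B ('k').
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "dk".getBytes(StandardCharsets.US_ASCII));
        System.out.printf("readChar: 0x%04X%n", (int) firstCharViaReadChar(tmp));
        System.out.println("bytes:    " + firstTwoViaBytes(tmp));
        Files.delete(tmp);
    }
}
```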
Anyway, RandomAccessFile is virtually never appropriate for text files. If you absolutely know that the file is broken into records all exactly the same number of bytes long, then you could use it, but by reading bytes instead of characters.
It's better to use, e.g., BufferedReader to read text files.
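Something along these lines would do for the BufferedReader route. It is only a sketch: the class name is invented, and it assumes the file is plain ASCII with one record per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class LineReaderSketch {
    // Reads every line of a text file into a list, one entry per line.
    // Assumption: the file is ASCII; pick another Charset if it is not.
    static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                 Files.newBufferedReader(file, StandardCharsets.US_ASCII)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line); // readLine() strips the line terminator
            }
        }
        return lines;
    }
}
```

Each String in the list can then be parsed into fields before it goes to the database.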
It would be nice to have random access to text, but the idea of text is that it permits different lengths of entry. Your example with Cusip: and Cusip: 333 demonstrates that nicely. It would not work at all well with a char.
You are aware of the Scanner class which will read lines from a text file? You will have to go through the Java Tutorials for regular expressions; you can split on "Account" "Cusip" and "shares" and get the intervening Strings. You are going to have to pass NULL for the Cusip column for account 43.
posted 10 years ago
. . . and you could even use the Scanner#nextInt and Scanner#next methods to find the titles for the columns. Only for Cusip: you are liable to get a NoSuchElementException or an InputMismatchException. There are various ways to handle that Exception; simply passing NULL to the database might work, and might still allow access to the shares: token which follows.
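As a rough sketch of the Scanner idea: the line layout below ("Account: 43 Cusip: shares: 100") is only my guess at the format, and the class name is made up. Rather than catching the Exception, it peeks at the next token with hasNext and treats the field as blank if the "shares:" label comes straight after "Cusip:".

```java
import java.util.Scanner;

class ScannerLineParser {
    // Returns {account, cusip-or-null, shares} for one line.
    // Assumed (hypothetical) layout: "Account: <n> Cusip: [<id>] shares: <n>"
    static String[] parse(String line) {
        try (Scanner sc = new Scanner(line)) {
            sc.next();                      // skip the "Account:" label
            String account = sc.next();
            sc.next();                      // skip the "Cusip:" label
            // If the next token is already the "shares:" label, Cusip was blank.
            String cusip = sc.hasNext("shares:") ? null : sc.next();
            sc.next();                      // skip the "shares:" label
            String shares = sc.next();
            return new String[] { account, cusip, shares };
        }
    }
}
```

A null in the middle slot is what you would pass to the database as NULL for that row.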
posted 10 years ago
That actually does give me another route to look into. I'll check it out. I know how to make it work with an array, but your suggestion may make it faster and more elegant.
Let me look into that tutorial, I may have some more questions later...
If you have a line feed at the end of each line, I would use a BufferedReader and then parse each line with a regex. If there is no line feed, and you know that each line is 133 characters, then you could read each line into a char array with the Reader#read(char[] cbuf, int off, int len) method.
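For the fixed-width case, a sketch could look like this. The class name is invented, and the record length is a parameter so you can pass 133. Note that read may return fewer characters than requested, so it has to be called in a loop until the buffer is full.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

class FixedWidthReader {
    // Splits the character stream into records of recordLength characters.
    // A short final record (if any) is returned as-is.
    static List<String> readRecords(Reader in, int recordLength) throws IOException {
        List<String> records = new ArrayList<>();
        char[] buf = new char[recordLength];
        while (true) {
            int filled = 0;
            // read() may deliver fewer chars than asked for, so keep going.
            while (filled < recordLength) {
                int n = in.read(buf, filled, recordLength - filled);
                if (n == -1) {
                    break; // end of stream
                }
                filled += n;
            }
            if (filled == 0) {
                break; // nothing left
            }
            records.add(new String(buf, 0, filled));
            if (filled < recordLength) {
                break; // stream ended in the middle of a record
            }
        }
        return records;
    }
}
```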
Once you have the line in a string, you can use a regex to parse out the fields using capturing groups.
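Here is roughly what I mean by capturing groups. The pattern assumes a hypothetical layout like "Account: 43 Cusip: 333 shares: 100", so adjust the labels to your real file. A blank field simply comes back as an empty string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordRegex {
    // One capturing group per field; each group may match nothing (blank field).
    // The labels "Account:", "Cusip:" and "shares:" are assumed, not confirmed.
    private static final Pattern RECORD = Pattern.compile(
        "\\s*Account:\\s*(\\S*?)\\s*Cusip:\\s*(\\S*?)\\s*shares:\\s*(\\S*?)\\s*");

    // Returns {account, cusip, shares}, or null if the line does not match.
    static String[] parse(String line) {
        Matcher m = RECORD.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }
}
```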
You could then validate each field to determine whether to insert a null or the value. For most fields you could just check the trim().length(), but for fields like CUSIP, you could run a CUSIP validator.
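A sketch of both checks, with made-up names. The CUSIP part follows the usual check-digit scheme as I understand it (digits count as themselves, A to Z as 10 to 35, every second character doubled, digits of each value summed); treat it as an assumption and verify it against your own data before relying on it.

```java
class FieldChecks {
    // Blank or whitespace-only fields become null, ready to be stored as NULL.
    static String blankToNull(String raw) {
        if (raw == null) {
            return null;
        }
        String trimmed = raw.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    // Checks the ninth character of a CUSIP against the first eight.
    static boolean isValidCusip(String cusip) {
        if (cusip == null || cusip.length() != 9) {
            return false;
        }
        int sum = 0;
        for (int i = 0; i < 8; i++) {
            char c = Character.toUpperCase(cusip.charAt(i));
            int v;
            if (c >= '0' && c <= '9') {
                v = c - '0';
            } else if (c >= 'A' && c <= 'Z') {
                v = c - 'A' + 10;
            } else if (c == '*') {
                v = 36;
            } else if (c == '@') {
                v = 37;
            } else if (c == '#') {
                v = 38;
            } else {
                return false; // character not allowed in a CUSIP
            }
            if (i % 2 == 1) {
                v *= 2; // double every second character (2nd, 4th, ...)
            }
            sum += v / 10 + v % 10;
        }
        int check = (10 - sum % 10) % 10;
        return cusip.charAt(8) - '0' == check;
    }
}
```

A field that fails either check would then be inserted as NULL rather than as the raw text.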