
RandomAccessFile

 
Claude Cundiff
Ranch Hand
Posts: 78
Hello everyone!

I have the following problem I'm trying to solve:
I have a .txt file of data that needs to be loaded into a MySQL database.
The way the data is structured, some "fields" are blank and others are filled with data. I figured a RandomAccessFile would be ideal for this situation. I have never used this class before, so I'm experimenting.

However, the following code is not giving what I expected. I think I understand why, but I don't know how to fix it. I think my characters are "rolling over" past 128 or something like that...

SAMPLE INPUT
dkkd kdkd kdkdkdkdkdkdkd
3
1

Code


OUTPUT
?

Thanks in advance. This is such an awesome site.
 
Ernest Friedman-Hill
author and iconoclast
Sheriff
Posts: 24217
Hi Claude,

RandomAccessFile wants binary data, not text data. That means readChar() expects 2 bytes, the size of a Java char. But in an ASCII txt file, each char is just one byte. The ? is a placeholder for the two-byte Unicode character 0x646B that RandomAccessFile thinks it sees in your data. You really want to read the two separate bytes 0x64 (d) and 0x6B (k).
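A minimal sketch of the effect described above, assuming an ASCII file that starts with the same "dk" as the sample input. The class and helper names are made up for illustration; only RandomAccessFile#readChar and RandomAccessFile#read are real API.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class ReadCharDemo {

    // Hypothetical helper: writes ASCII text to a temporary file for the demo.
    static File asciiTempFile(String text) {
        try {
            File tmp = File.createTempFile("readchar-demo", ".txt");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), text.getBytes(StandardCharsets.US_ASCII));
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // readChar() consumes two bytes and combines them, high byte first.
    static char firstViaReadChar(File file) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            return raf.readChar();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // read() consumes a single byte, which is one character in an ASCII file.
    static char firstViaRead(File file) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            return (char) raf.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File file = asciiTempFile("dkkd kdkd");
        System.out.println(Integer.toHexString(firstViaReadChar(file))); // 646b
        System.out.println(firstViaRead(file));                          // d
    }
}
```

The first line printed is the code of the single two-byte character that readChar() assembles from 'd' and 'k'; a console that cannot display that character shows a ? instead.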

Anyway, RandomAccessFile is virtually never appropriate for text files. If you absolutely know that the file is broken into records all exactly the same number of bytes long, then you could use it, but by reading bytes instead of characters.
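For the fixed-length case, a rough sketch of reading bytes instead of characters might look like the following. It assumes single-byte (ASCII) text and that the record length counts every byte of a record, including any line terminator; FixedRecordReader, asciiTempFile and readRecord are illustrative names, not part of any library.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class FixedRecordReader {

    // Hypothetical helper: writes ASCII text to a temporary file for the demo.
    static File asciiTempFile(String text) {
        try {
            File tmp = File.createTempFile("fixed-records", ".txt");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), text.getBytes(StandardCharsets.US_ASCII));
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Jumps straight to record number `index` (0-based) and decodes its bytes as ASCII.
    static String readRecord(File file, int recordLength, long index) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            byte[] buffer = new byte[recordLength];
            raf.seek(index * recordLength); // byte offset where the record starts
            raf.readFully(buffer);          // throws EOFException if the record is cut short
            return new String(buffer, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Three records of four bytes each, with no line terminators.
        File file = asciiTempFile("AAAABBBBCCCC");
        System.out.println(readRecord(file, 4, 1)); // BBBB
    }
}
```

The seek only lands on a record boundary if every record really has the same byte length, which is why this does not carry over to ordinary variable-length text.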

It's better to use, e.g., BufferedReader to read text files.
 
Claude Cundiff
Ranch Hand
Posts: 78
Thanks Ernest,

The thing about this is that I know how to use BufferedReader and that kind of thing. What I'm unsure about is reading something like:



I need to break this up for a database with fields Account, Cusip, Shares.

I know that every line is exactly 133 characters, so I thought about reading this into a char array.

...also, wouldn't it be nice if there was a random access file type thing for .txt files?

[ added code tags to preserve white space - Ilja ]
[ June 22, 2008: Message edited by: Ilja Preuss ]
 
Campbell Ritchie
Marshal
Posts: 56599
It would be nice to have random access to text, but the idea of text is that it permits entries of different lengths. Your example with Cusip: and Cusip: 333 demonstrates that nicely. It would not work at all well with a char[].

You are aware of the Scanner class which will read lines from a text file?
You will have to go through the Java Tutorials for regular expressions; you can split on "Account", "Cusip" and "shares" and get the intervening Strings. You are going to have to pass NULL for the Cusip column for account 43.
 
Campbell Ritchie
Marshal
Posts: 56599
. . . and you could even use the Scanner#nextInt and Scanner#next methods to find the titles for the columns. For an empty Cusip: field, however, you are liable to get a NoSuchElementException or an InputMismatchException.
There are various ways to handle that exception; simply passing NULL to the database might work, and might still allow access to the shares: token which follows.
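One possible way to sidestep the exception is to peek at the next token with Scanner#hasNext(String) before consuming it. The sketch below assumes a line layout such as "Account: 43 Cusip: Shares: 100"; that layout, the class name and the parse method are guesses for illustration, since the sample line from the earlier post has not survived in this thread.

```java
import java.util.Arrays;
import java.util.Scanner;

class ScannerFieldDemo {

    // Returns {account, cusip, shares}; cusip is null when the field is blank.
    // Assumed (hypothetical) layout: "Account: <value> Cusip: <value> Shares: <value>"
    static String[] parse(String line) {
        try (Scanner sc = new Scanner(line)) {
            sc.next();                  // skip the "Account:" label
            String account = sc.next();
            sc.next();                  // skip the "Cusip:" label
            String cusip = null;
            // If the next token is already the "Shares:" label, the Cusip field was blank.
            if (!sc.hasNext("Shares:")) {
                cusip = sc.next();
            }
            sc.next();                  // skip the "Shares:" label
            String shares = sc.next();
            return new String[] { account, cusip, shares };
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("Account: 42 Cusip: 333 Shares: 50")));
        System.out.println(Arrays.toString(parse("Account: 43 Cusip: Shares: 100")));
    }
}
```

A null in the second slot can then be passed on to the database as SQL NULL. A line that is missing a label altogether would still throw NoSuchElementException here, so real input would need more checking.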
 
Claude Cundiff
Ranch Hand
Posts: 78
Thanks Ritchie,

That actually does give me another route to look into. I'll check it out. I know how to make it work with an array - but your suggestion may make it faster and more elegant.

Let me look into that tutorial, I may have some more questions later...
 
Brian Egge
Greenhorn
Posts: 2
If you have a line feed at the end of each line, I would use a BufferedReader and then parse each line with a regex. If there is no line feed, and you know that each line is 133 characters, then you could read each line into a char array with the Reader#read(char[] cbuf, int off, int len) method.
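A sketch of the no-line-feed case, reading fixed-width records through a Reader. A single read call may return fewer characters than requested, so the sketch loops until the buffer is full. The class and method names are illustrative, and the demo uses a width of 4 rather than 133 to keep it short.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class FixedWidthLineReader {

    // Returns the next `length` characters, or null if the input is already exhausted.
    static String readRecord(Reader in, int length) {
        char[] buffer = new char[length];
        int filled = 0;
        try {
            // read() may deliver fewer characters than asked for, so keep going until full.
            while (filled < length) {
                int n = in.read(buffer, filled, length - filled);
                if (n == -1) {
                    break; // end of input
                }
                filled += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (filled == 0) {
            return null;
        }
        if (filled < length) {
            throw new UncheckedIOException(new EOFException("truncated record"));
        }
        return new String(buffer);
    }

    public static void main(String[] args) {
        Reader in = new StringReader("abcdefgh");
        String record;
        while ((record = readRecord(in, 4)) != null) {
            System.out.println(record); // abcd, then efgh
        }
    }
}
```

For a real file the Reader would typically be a BufferedReader wrapping a FileReader or InputStreamReader, with length set to 133.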

Once you have the line in a string, you can use a regex to parse out the fields using capturing groups.



You could then validate each field to determine whether to insert a null or the value. For most fields you could just check the trim().length(), but for fields like CUSIP, you could run a CUSIP validator.
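A rough sketch of the capturing-group idea combined with the trim().length() check. The line layout ("Account: ... Cusip: ... Shares: ...") is an assumption, since the sample data has not survived in this thread, and the pattern, class and method names are illustrative only. No CUSIP validation is attempted here.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexFieldParser {

    // Assumed (hypothetical) layout: "Account: <value> Cusip: <value> Shares: <value>"
    private static final Pattern LINE =
            Pattern.compile("Account:(.*?)Cusip:(.*?)Shares:(.*)");

    // A field that is empty after trimming becomes null, ready to insert as SQL NULL.
    static String blankToNull(String raw) {
        String trimmed = raw.trim();
        return trimmed.length() == 0 ? null : trimmed;
    }

    // Returns {account, cusip, shares}, or null if the line does not match the layout.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] {
            blankToNull(m.group(1)),
            blankToNull(m.group(2)),
            blankToNull(m.group(3))
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("Account: 43 Cusip:      Shares: 100")));
        // prints [43, null, 100]
    }
}
```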
 