How to retrieve and store pointers/positions as numbers?

 
Maria Soderberg
Greenhorn
Posts: 3
I will be creating an index consisting of two files:

File 1: Rows of alphabetized words.

File 2: Rows of three-letter combinations and, for each combination, a pointer to the first word in File 1 that begins with it.

As my program runs, it will take ordered words from an input stream and write them to file 1. For every word with a new three-letter beginning, I will store the three letters in file 2 along with a numeric pointer to the correct position in file 1.

I'm planning on writing to file 1 with a BufferedWriter, but how do I get my current position in file 1 so I can store it as a number in file 2?

It's a large number of words, so my streams need to be buffered. Therefore, I CAN'T use Random Access Files.

Thank you,
Maria
 
Joe Ess
Bartender
Posts: 9406
Sounds like you are building a low-level database. Have a look at that example and see if it gives you any ideas.
You can buffer random access files, but unless you expect your word searches to be linear, I don't think you will see a great improvement with buffering. Every time you do a non-linear search you'll have to flush the buffer and read new data in. That can drag performance down. Take a look at Java Platform Performance. It describes how to go about measuring different IO strategies. What works for one situation may be completely inappropriate for a different situation.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
You could build a wrapper or extend your buffered writer to keep track of how many bytes have been written so far, which gives the position of the next write. You'll have to be certain of the line end character(s) - either \n or \r\n - and count them, too. This is bringing on heavy deja vu ... I did something similar in Turbo Pascal for a help system index back when Hector was a pup.
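
A minimal sketch of the byte-counting idea, under some assumptions that are not from the thread: the class and method names (PrefixIndexSketch, writeWords) are made up for illustration, words are encoded as UTF-8, every line ends with a single "\n", and an in-memory ByteArrayOutputStream stands in for file 1. A buffered file stream should behave the same way, since the count is kept before the bytes reach the buffer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library.
class PrefixIndexSketch {

    // Writes each word followed by "\n" to out and returns a map from each
    // three-letter prefix to the byte offset of the first word that has it.
    static Map<String, Long> writeWords(List<String> sortedWords, OutputStream out)
            throws IOException {
        Map<String, Long> index = new LinkedHashMap<>();
        long position = 0; // bytes written so far = offset of the next write
        for (String word : sortedWords) {
            // Assumption: words shorter than three letters index under themselves.
            String prefix = word.length() >= 3 ? word.substring(0, 3) : word;
            index.putIfAbsent(prefix, position);
            // Count encoded bytes, not chars, and include the line ending.
            byte[] line = (word + "\n").getBytes(StandardCharsets.UTF_8);
            out.write(line);
            position += line.length;
        }
        return index;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream file1 = new ByteArrayOutputStream();
        Map<String, Long> index = writeWords(List.of("apple", "apply", "banana"), file1);
        System.out.println(index); // prints {app=0, ban=12}
    }
}
```

The entries of the returned map are what would go into file 2, one prefix and offset per row.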
 
Maria Soderberg
Greenhorn
Posts: 3
Stan, I think counting bytes would be a good alternative for me, thank you! But what do you mean by "building a wrapper"?

Joe, thanks for the link, it helped with understanding, but like I said, I can't use a RandomAccessFile (school assignment specifically says "do not use RandomAccessFile here.")
 
Joe Ess
Bartender
Posts: 9406
Originally posted by Maria Soderberg:
school assignment specifically says "do not use RandomAccessFile here."

Ah, not the "real world".
Well, you can use Reader.mark() to remember a position in a stream and then invoke reset() to go back to that position. This functionality is optional for Reader implementations, though, and the implementations I'm aware of use a temp file, so their performance is nowhere near that of RandomAccessFile.
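
For illustration only, here is a small sketch of mark() and reset() using BufferedReader over a StringReader. The class and method names (MarkResetSketch, readTwice) and the 1024-character read-ahead limit are assumptions made for the example. Note that the mark is only guaranteed to stay valid while no more than that many characters are read after it, so this is a bounded "go back", not a general seek.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical helper, not part of any library.
class MarkResetSketch {

    // Skips the first line, marks the position, reads the second line,
    // resets to the mark, and reads the same line again.
    static String readTwice(String text) throws IOException {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            if (!reader.markSupported()) {
                // Support for mark/reset is optional for Reader implementations.
                throw new IOException("mark/reset not supported by this Reader");
            }
            reader.readLine();               // consume the first line
            reader.mark(1024);               // remember the current position
            String first = reader.readLine();
            reader.reset();                  // jump back to the marked position
            String second = reader.readLine();
            return first + "|" + second;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readTwice("apple\napply\nbanana\n")); // prints apply|apply
    }
}
```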
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
By "wrapper" - in a very generic, lower-case-w way - I mean a little class that holds the stream and does the counting. Imagine your program could write to the stream from any of five places. You'd have to remember to add the counting logic to each of those places. You're mixing up counting with whatever other logic your program has. These could turn into problems fairly quickly.

So think how moving this code to a separate CountingStream would work ...
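
One way such a class might look, as a rough sketch rather than a finished design: it extends java.io.FilterOutputStream and counts every byte that passes through. The position() method name and the CountingStreamDemo class are invented for this example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Wraps any OutputStream and keeps track of how many bytes were written to it.
class CountingStream extends FilterOutputStream {
    private long count = 0;

    CountingStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Delegate straight to the wrapped stream so bytes are counted once.
        out.write(b, off, len);
        count += len;
    }

    // Bytes written so far, which is also the offset of the next write.
    long position() {
        return count;
    }
}

class CountingStreamDemo {
    public static void main(String[] args) throws IOException {
        // A ByteArrayOutputStream stands in for file 1 in this sketch.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (CountingStream words = new CountingStream(new BufferedOutputStream(sink))) {
            words.write("apple\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(words.position()); // prints 6
        }
    }
}
```

Because the counter sits in front of the buffer, position() reflects the bytes handed over so far, whether or not the buffer has been flushed to disk yet.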

You could even reuse CountingStream in another project some day. Does that sound worth the effort of a new class?
 
Maria Soderberg
Greenhorn
Posts: 3
Definitely worth it! Thanks!
 