improve program execution

 
leon matthew
Greenhorn
Posts: 27
Hi,
My program has to compare two sets of text files (15,000 in each). File1 is sorted, File2 is unsorted. I've stored File1 in a HashSet, and File2 is being read in a word at a time, with the HashSet checked to see if each word occurs in File1.
Is there a faster way to do this (fast meaning the total time the program takes to run)?
// lr and qr are BufferedReaders
while (lr.ready())
{
    list.add(lr.readLine());
}
while (qr.ready())
{
    // Read in each word from the 10,000
    word = qr.readLine();
    System.out.print(word + " ");
    // does the list contain the word?
    // print out the word with a 'y' or 'n'
    System.out.println(list.contains(word) ? 'y' : 'n');
}
[This message has been edited by leon matthew (edited November 02, 2001).]
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
OK, from your other posts I know that "list" is in fact the HashSet you mention. I still think you're asking for trouble by giving it a misleading name like that, but to each his own.
I doubt you can really improve on the performance of the code you show. This algorithm is probably the best you'll get, if you want to print to the screen. My gut reaction is that this is probably going to be limited most by the time it takes to write all that output to your screen (especially if your monitor limits the scrolling speed). Try redirecting the output to a file instead, and see if the performance improves.
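If you'd rather send the results somewhere other than the screen from inside the program, something like the following might do. This is only a rough sketch, not a tested fix for your program: the class name BufferedLookup and the method writeResults are made up for illustration, the input is an in-memory string standing in for File2, and it uses generics, so it assumes a newer JDK than 1.3. It writes each "word y" / "word n" line through a BufferedWriter instead of calling System.out.println() per word.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, named here only for illustration.
class BufferedLookup {
    /** Writes "word y" or "word n" for each query line through a buffered writer. */
    static void writeResults(BufferedReader queries, Set<String> known, Writer out)
            throws IOException {
        BufferedWriter buffered = new BufferedWriter(out);
        String word;
        // readLine() returns null at end of input
        while ((word = queries.readLine()) != null) {
            buffered.write(word);
            buffered.write(known.contains(word) ? " y" : " n");
            buffered.newLine();
        }
        // Push any text still sitting in the buffer to the underlying writer.
        buffered.flush();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the contents of File1.
        Set<String> known = new HashSet<String>();
        known.add("apple");
        known.add("pear");
        // Stand-in for File2; a FileReader would go here in the real program.
        BufferedReader queries = new BufferedReader(new StringReader("pear\nkiwi\napple"));
        // Swap in a FileWriter to send the results to a file on disk.
        StringWriter out = new StringWriter();
        writeResults(queries, known, out);
        System.out.print(out);
    }
}
```

Whether this actually beats plain redirection of standard output is something you'd have to measure on your own machine.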
Other than that, if you're not happy with the performance, it might help to figure out which parts of the program are taking up the time. A profiler would be useful here - you can run java using the -Xprof or -Xrunhprof flags (for JDK 1.3 for Windows, at least) to get some useful output. OK, it may take a while to figure out what the output means, but it will help you in the long run. Or you can find a commercial profiler (maybe with a free trial version?). I haven't used any of these in quite some time, so perhaps others can suggest your best options.
Aside from profilers though, there are some lower-tech methods to find out which parts of the program are slowing you down. You can use System.currentTimeMillis() to find out how long it takes to load the first file into the list, vs. the second. Then comment out the println() statements (leaving a list.contains() behind), and run again. The difference will tell you how much time is being spent on printing your results. Then comment out the list.contains() and list.add() sections, and run again. The difference tells you how much time is spent on accessing the HashSet. The remaining time is just the time to read each line in the two files. Thus you can learn which parts of the program are most significant, and focus your efforts there.
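To make the low-tech timing idea concrete, here's a minimal sketch. The class name PhaseTimer and the methods timeLoad and countHits are hypothetical, the two readers wrap in-memory strings rather than your actual files, and the generics assume a newer JDK than 1.3. It times the load phase and the lookup phase separately, with the printing left out of the lookup loop.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, named here only for illustration.
class PhaseTimer {
    /** Reads every line into the set and returns the elapsed milliseconds. */
    static long timeLoad(BufferedReader reader, Set<String> words) throws IOException {
        long start = System.currentTimeMillis();
        String line;
        while ((line = reader.readLine()) != null) {
            words.add(line);
        }
        return System.currentTimeMillis() - start;
    }

    /** Looks up every line without printing; returns how many were found. */
    static int countHits(BufferedReader reader, Set<String> words) throws IOException {
        int hits = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (words.contains(line)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the two files, so the sketch runs as-is.
        BufferedReader lr = new BufferedReader(new StringReader("alpha\nbeta\ngamma"));
        BufferedReader qr = new BufferedReader(new StringReader("beta\ndelta"));
        Set<String> words = new HashSet<String>();

        long loadMillis = timeLoad(lr, words);

        long start = System.currentTimeMillis();
        int hits = countHits(qr, words);
        long lookupMillis = System.currentTimeMillis() - start;

        System.out.println("load: " + loadMillis + " ms, lookup: " + lookupMillis
                + " ms, hits: " + hits);
    }
}
```

With inputs this small both times will most likely print as 0 ms; the numbers only become meaningful once the readers point at the real files.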
Incidentally it looks like you're misusing the ready() method defined by Reader. It will return false when the input stream has ended, true - but it may also return false for other reasons, such as when new input is temporarily unavailable due to network lag, or the fact that the hard drive is currently busy reading some other part of the disk for some other job. Most of the time, this is probably not an issue - but this is the sort of bug that sneaks up on you when you're not expecting it, and is thus very hard to diagnose. The standard way to detect the end-of-file using a BufferedReader is by checking the return value of readLine() - if it's null, there is nothing more to be read.
<pre> String line;
// readLine() returns null once the end of the file is reached
while ((line = lr.readLine()) != null) {
    list.add(line);
}</pre>
Also, there are new I/O classes in JDK 1.4 which are supposed to be faster, for some situations at least. I don't know if these will help you at all, but it might be worth looking into.
 
leon matthew
Greenhorn
Posts: 27
OK, thanks. I basically wanted to find out if I could possibly improve the speed the program runs at, or if I'd be wasting my time trying to.
Thanks
PS I have changed the 'list' to 'setL' :-)
[This message has been edited by leon matthew (edited November 03, 2001).]
 