
Using Scanner: need help comparing two files so I can write out the data I need

 
Justin Char
Ranch Hand
Posts: 76
I am trying to compare two files: one has 3 million records, the other has 250,000.
I want to loop through scanner with scanner2 line by line so that, if there is a match, I can substring the file number and a couple of other fields off the file with 3 million records. Currently I can only put in one file number and find one record.
I don't understand what needs to be done to get oldfile2 to take the first file number, loop through oldfile, write out the data if there is a match, then take the second file number, and so on.

Thank you in advance for your help.



 
Campbell Ritchie
Sheriff
Posts: 50240
Can't understand the problem, I am afraid. Please explain it differently.
Do you want to find lines which match in the two files, or lines beginning with 99 99999900?
 
Justin Char
Ranch Hand
Posts: 76
I want to find the files that match in both files.
It will need to grab the first file number and run through the 3 million records to see if there is a match.
I will sort them by number so it won't have to go through all 3 million every time.
The two files are not the same size.
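
If both inputs really are sorted by file number, one pass over each is enough, because the scan never has to go back. A minimal sketch of that merge-style scan follows. It assumes the key is the first 11 characters, every line has at least 11 characters, and both inputs are sorted by that key in plain String order. The class and method names are made up for illustration, and in-memory lists stand in for the two files to keep the example short:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: merge-style scan over two inputs sorted by key.
class SortedMatchSketch {
    static final int KEY_LENGTH = 11; // assumption: key = first 11 characters

    static String key(String line) {
        return line.substring(0, KEY_LENGTH);
    }

    // Returns the lines of 'big' whose key also occurs in 'small'.
    // Both lists must already be sorted by key (String.compareTo order).
    static List<String> matches(List<String> small, List<String> big) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < small.size() && j < big.size()) {
            int cmp = key(small.get(i)).compareTo(key(big.get(j)));
            if (cmp < 0) {
                i++;                  // small key is behind: advance the small side
            } else if (cmp > 0) {
                j++;                  // big key is behind: advance the big side
            } else {
                out.add(big.get(j));  // match: keep the line from the big input
                j++;                  // stay on i, the next big line may share this key
            }
        }
        return out;
    }
}
```

Each line is visited at most once, so the work grows with the combined size of the two files rather than their product.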
 
Campbell Ritchie
Sheriff
Posts: 50240
By file, do you mean line? How much space do they occupy in memory? Can you load them simultaneously?

Can you put the lines into a Set<String> and use methods like retainAll or removeAll or something?
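
As a rough illustration of the Set idea, the sketch below uses `retainAll` to keep only the elements two sets have in common. The class name and sample values are made up, and it compares whole strings, so it only fits if both sets can be held in memory at once:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical example of intersecting two sets of strings.
class RetainAllSketch {
    // Returns the elements present in both sets; the inputs are left unchanged.
    static Set<String> common(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a); // copy so 'a' is not modified
        result.retainAll(b);                   // keep only elements also in 'b'
        return result;
    }

    public static void main(String[] args) {
        Set<String> first = new HashSet<>(Arrays.asList("A", "B", "C"));
        Set<String> second = new HashSet<>(Arrays.asList("B", "C", "D"));
        System.out.println(common(first, second)); // the shared elements B and C
    }
}
```

`removeAll` works the same way but keeps the elements that are not in the other set.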
 
Justin Char
Ranch Hand
Posts: 76
By file, yes, I do mean a line. Sorry.

I am trying to read the first 11 characters of the first line from one file, look for matches in the first 11 characters of the lines in the second file, find all the ones that match, and write them out to a file. I know how to write them out to a file.

I am just having problems working out how to compare the two files, whether with Scanner, an array, or a for loop. I have no idea how to do this and have it complete in under 15 minutes.

With Scanner, I could only get it to let me type in a series of numbers and match it against a line in one of the files.
 
Justin Char
Ranch Hand
Posts: 76
I have been working on this and tried the if statement below.
It says there are no matches, but I know there are matches.
 
Campbell Ritchie
Sheriff
Posts: 50240
You appear to be comparing something with 11 letters in one place with something with 12 letters elsewhere.
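
A length mismatch like that is easy to produce with `substring`, whose end index is exclusive. The sketch below uses a made-up record and a hypothetical class name (it is not the code from the post) to show why an 11-character key never equals a 12-character one:

```java
// Hypothetical example: comparing key prefixes of different lengths.
class KeyLengthSketch {
    // Returns the first 'length' characters of 'line' (the end index is exclusive).
    static String prefix(String line, int length) {
        return line.substring(0, length);
    }

    public static void main(String[] args) {
        String line = "12345678901 SMITH"; // made-up record
        String eleven = prefix(line, 11);  // "12345678901"
        String twelve = prefix(line, 12);  // "12345678901 " (includes the space)
        System.out.println(eleven.equals(twelve));        // false: the lengths differ
        System.out.println(eleven.equals(twelve.trim())); // true once the space is trimmed
    }
}
```

Printing the two values can hide the difference, because a trailing space is invisible on the console. Printing `length()` for both, and comparing with `equals` rather than `==`, makes the mismatch visible.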
 
Justin Char
Ranch Hand
Posts: 76
When I print it out it is eleven and eleven...
I don't even think that is the right way to go about doing it.
 
Ireneusz Kordal
Ranch Hand
Posts: 423
If you read whole lines from the files, you do not need Scanner; it is easier (and probably much faster) to read lines using BufferedReader.
Try this code snippet:
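
For reference, a minimal sketch of reading lines with BufferedReader might look like the following. It is an illustration, not necessarily the code originally posted. The class name `LineReaderSketch` and the file name in the comment are hypothetical, and a StringReader stands in for the real file so the example runs as-is:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example of reading a text source line by line.
class LineReaderSketch {
    // Reads every line from 'source' and returns them in order.
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if an exception is thrown
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) { // null signals end of input
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // For a real file, pass e.g. new FileReader("oldfile.txt") instead (hypothetical name).
        System.out.println(readLines(new StringReader("first\nsecond\n")));
    }
}
```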


And I think that the better solution to compare these two files would be to load all the records from the smaller file (250,000) into a HashSet
(not the whole record, just the first 11 characters),
then loop through the second file line by line, check for each record whether the first 11 characters of the line are in the set, and if so,
do some other operations on the matching record.
250,000 records of 11 characters each (about 20-30 bytes) should fit into memory if you increase the heap size.
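
The steps above can be sketched roughly as follows. This is one possible reading of the suggestion, assuming the file number is the first 11 characters of each line. The class and method names are hypothetical, and StringReaders stand in for the two real files so the example is self-contained:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: match lines of a big file against keys from a small file.
class KeyMatchSketch {
    static final int KEY_LENGTH = 11; // assumption: file number = first 11 characters

    // Step 1: collect the key of every line in the smaller file.
    static Set<String> loadKeys(Reader smallFile) throws IOException {
        Set<String> keys = new HashSet<>();
        try (BufferedReader reader = new BufferedReader(smallFile)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.length() >= KEY_LENGTH) { // skip lines too short to hold a key
                    keys.add(line.substring(0, KEY_LENGTH));
                }
            }
        }
        return keys;
    }

    // Step 2: stream the larger file once and keep lines whose key is in the set.
    static List<String> findMatches(Set<String> keys, Reader bigFile) throws IOException {
        List<String> matches = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(bigFile)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.length() >= KEY_LENGTH
                        && keys.contains(line.substring(0, KEY_LENGTH))) {
                    matches.add(line); // a real run would write the wanted fields out here
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        Set<String> keys = loadKeys(new StringReader("00000000001 A\n00000000003 B\n"));
        List<String> hits = findMatches(keys,
                new StringReader("00000000001 X\n00000000002 Y\n00000000003 Z\n"));
        hits.forEach(System.out::println);
    }
}
```

A HashSet lookup takes roughly constant time, so the 3 million records are read once and each is checked with a single `contains` call, instead of rescanning one file for every line of the other.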

 
Justin Char
Ranch Hand
Posts: 76
I don't understand what you mean. I don't understand how that would work.

I can read a file with BufferedReader. I don't understand how to do the compare with the BufferedReader and HashSet.
 
Justin Char
Ranch Hand
Posts: 76
I still don't understand how to get this to work?
 