I am trying to compare two files: one has 3 million records, the other has 250,000.
I want to loop through scanner with scanner2 line by line, so that when there is a match I can substring the file number and a couple of other fields from the file with 3 million records. Currently I can only enter a single file number and find one record.
I don't understand what needs to be done to get oldfile2 to take the 1st file number, loop through oldfile, write out the data if there is a match, then take the 2nd file number, and so on.
Can't understand the problem, I am afraid. Please explain it differently.
Do you want to find lines which match in the two files, or lines beginning with 99 99999900?
I want to find the lines that match in both files.
It will need to grab the first file number and run through the 3 million records to see if there is a match.
I will sort them by number so it won't have to go through all 3 million every time.
Both files are not the same size.
I am trying to read the first 11 characters of each line in one file, look for lines in the second file whose first 11 characters match, and write all the matching lines out to a file. I know how to write them out to a file.
I am just having problems working out how to compare the two files with Scanner, an array, or a for loop. I have no idea how to do this and have it complete in under 15 minutes.
With Scanner I could only get it to let me enter a series of numbers and match it to a line in one of the files.
If you read whole lines from files, you do not need the Scanner. It is easier (and probably much faster) to read lines using BufferedReader,
try this code snippet:
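Something along these lines should do as a minimal sketch of reading a file line by line with BufferedReader. The class name LineReaderSketch, the method countLines, the file name "oldfile.txt" and the UTF-8 charset are all assumptions for illustration, so adjust them to your data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LineReaderSketch {

    // Reads the file one line at a time and returns the number of lines read.
    // Assumption: the file is UTF-8 text; pass a different charset if it is not.
    static long countLines(Path file) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Work with the line here, e.g. take its first 11 characters.
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // "oldfile.txt" is a placeholder file name.
        Path file = Paths.get(args.length > 0 ? args[0] : "oldfile.txt");
        System.out.println("Lines read: " + countLines(file));
    }
}
```

Only one line is held in memory at a time, so the size of the file does not matter for this loop.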
And I think that a better solution for comparing these two files would be to load all the records from the smaller file (250,000) into a HashSet
(not the whole record, just the first 11 chars),
then loop through the second file line by line, check for each record whether the first 11 characters of the line are in the set, and if yes,
do some other operations on this matching record.
250,000 records of 11 chars each (about 20-30 bytes) should fit into memory if you increase the heap size.
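The HashSet approach could look roughly like this. It is only a sketch under some assumptions: the names KeyMatchSketch, keyOf, loadKeys and writeMatches are made up, the files are assumed to be UTF-8 text, lines shorter than 11 characters are skipped, and the file names in main are placeholders.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

public class KeyMatchSketch {

    static final int KEY_LENGTH = 11; // first 11 characters identify a record

    // Returns the first 11 characters of the line, or null if the line is too short.
    static String keyOf(String line) {
        return line.length() >= KEY_LENGTH ? line.substring(0, KEY_LENGTH) : null;
    }

    // Loads only the keys (not the whole records) of the smaller file into a set.
    static Set<String> loadKeys(Path smallFile) throws IOException {
        Set<String> keys = new HashSet<>();
        try (BufferedReader reader = Files.newBufferedReader(smallFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String key = keyOf(line);
                if (key != null) {
                    keys.add(key);
                }
            }
        }
        return keys;
    }

    // Streams the larger file once and writes every line whose key is in the set.
    // Returns the number of lines written.
    static long writeMatches(Path smallFile, Path largeFile, Path output) throws IOException {
        Set<String> keys = loadKeys(smallFile);
        long matches = 0;
        try (BufferedReader reader = Files.newBufferedReader(largeFile, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String key = keyOf(line);
                if (key != null && keys.contains(key)) {
                    writer.write(line); // or substring the fields you need here
                    writer.newLine();
                    matches++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder file names: the small file, the large file, the output file.
        long n = writeMatches(Paths.get("oldfile2.txt"), Paths.get("oldfile.txt"),
                Paths.get("matches.txt"));
        System.out.println("Matching records: " + n);
    }
}
```

Each file is read exactly once and every lookup in the set takes roughly constant time, so the 3 million records are not rescanned for each of the 250,000 file numbers and sorting the files first is not needed.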