
Using Scanner need help comparing two files so I can write out data I need.

 
Ranch Hand
Posts: 76
I am trying to compare two files: one has 3 million records, the other has 250,000.
I want to loop through scanner with scanner2 line by line so that, when there is a match, I can substring the file number and a couple of other fields out of the file with 3 million records. Currently I can only put in one file number and find one record.
I don't understand what needs to be done to get oldfile2 to take the first file number, loop through oldfile, write out the data if there is a match, then take the second file number, and so on.

Thank you in advance for your help.



 
Marshal
Posts: 79153
Can't understand the problem, I am afraid. Please explain it differently.
Do you want to find lines which match in the two files, or lines beginning with 99 99999900?
 
Justin Char
Ranch Hand
Posts: 76
I want to find the files that match in both files.
It will need to grab the first file number and run through the 3 million records to see if there is a match.
I will sort them by number so it won't have to go through all 3 million every time.
The two files are not the same size.
 
Campbell Ritchie
Marshal
Posts: 79153
By file, do you mean line? How much space do they occupy in memory? Can you load them simultaneously?

Can you put the lines into a Set<String> and use methods like retainAll or removeAll or something?
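
As a rough illustration of the Set idea, here is a minimal sketch using retainAll to keep only the strings present in both collections. The class name, method name, and sample values are made up for the example; in practice the two lists would be the lines (or keys) read from the two files.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SetIntersectionSketch {
    /** Returns the strings present in both inputs; the inputs are not modified. */
    static Set<String> commonLines(List<String> first, List<String> second) {
        Set<String> common = new HashSet<>(first);   // copy, so 'first' stays untouched
        common.retainAll(new HashSet<>(second));     // drop everything not also in 'second'
        return common;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for lines read from two files
        List<String> a = Arrays.asList("alpha", "beta", "gamma");
        List<String> b = Arrays.asList("beta", "gamma", "delta");
        System.out.println(commonLines(a, b)); // HashSet iteration order is unspecified
    }
}
```

Note that this only finds strings that are identical in full, and it needs both collections in memory at once. removeAll works the same way but keeps the elements that are not in the other collection.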
 
Justin Char
Ranch Hand
Posts: 76
By file, yes, I do mean a line. Sorry.

I am trying to read the first 11 characters of each line from one file, look for matches in the first 11 characters of the lines in the second file, find all the ones that match, and write them out to a file. I know how to write them out to a file.

I am just having problems working out how to compare the two files with Scanner, or with an array or a for loop. I have no idea how to do this and have it complete in under 15 minutes.

With Scanner I could only get it to let me put in a series of numbers and match it to a line in one of the files.
 
Justin Char
Ranch Hand
Posts: 76
I have been working on this and tried the below if statement.
It says there are no matches, but I know there are matches.
 
Campbell Ritchie
Marshal
Posts: 79153
You appear to be comparing something with 11 letters in one place with something with 12 letters elsewhere.
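
The code under discussion is not shown here, so the following is only a guess at how such a mismatch can arise: the end index of String.substring is exclusive, so substring(0, 11) gives 11 characters and substring(0, 12) gives 12. The sketch also prints each key in brackets with its length, which makes a trailing space visible that a plain println would hide. The sample line and the helper names are hypothetical.

```java
class SubstringLengthSketch {
    /** Returns the first n characters of the line, or the whole line if it is shorter. */
    static String firstChars(String line, int n) {
        return line.substring(0, Math.min(n, line.length())); // end index is exclusive
    }

    /** Wraps the text in brackets and appends its length, so stray whitespace shows up. */
    static String describe(String s) {
        return "[" + s + "] length=" + s.length();
    }

    public static void main(String[] args) {
        String line = "12345678901 SMITH";            // hypothetical record layout
        String eleven = firstChars(line, 11);          // characters at indices 0..10
        String twelve = firstChars(line, 12);          // also includes the space at index 11
        System.out.println(describe(eleven));
        System.out.println(describe(twelve));
        System.out.println(eleven.equals(twelve));     // the two keys differ by one character
    }
}
```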
 
Justin Char
Ranch Hand
Posts: 76
When I print it out it is eleven and eleven...
I don't even think that is the right way to go about doing it.
 
Ranch Hand
Posts: 423
If you are reading whole lines from files, you do not need Scanner; it is easier (and probably much faster) to read the lines using BufferedReader.
Try this code snippet:
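
A minimal sketch of reading a file line by line with BufferedReader (this is not the poster's original snippet). The default path "oldfile.txt" and the UTF-8 charset are assumptions; adjust both to match the real data.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LineReaderSketch {
    /** Reads the file one line at a time and returns how many lines it contains. */
    static long countLines(Path file) throws IOException {
        long count = 0;
        // try-with-resources closes the reader even if an exception is thrown
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) { // readLine returns null at end of file
                count++; // replace with the real per-line processing
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // "oldfile.txt" is a placeholder path, used when no argument is given
        Path file = Paths.get(args.length > 0 ? args[0] : "oldfile.txt");
        System.out.println(countLines(file) + " lines");
    }
}
```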


I also think a better way to compare these two files would be to load all the records from the smaller file (250,000) into a HashSet
(not the whole record, just the first 11 characters),
then loop through the second file line by line, check for each record whether its first 11 characters are in the set, and if so,
do whatever else is needed with the matching record.
250,000 records of 11 characters each (about 20-30 bytes) should fit into memory if you increase the heap size.
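
The approach described above could look roughly like the sketch below: one pass over the smaller file to collect the 11-character keys, then one pass over the larger file with a set lookup per line. The file names, the UTF-8 charset, and the choice to write out the whole matching line are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

class KeyMatchSketch {
    static final int KEY_LENGTH = 11; // from the thread: the first 11 characters identify a record

    /** Returns the first KEY_LENGTH characters, or the whole line if it is shorter. */
    static String keyOf(String line) {
        return line.length() <= KEY_LENGTH ? line : line.substring(0, KEY_LENGTH);
    }

    /** Loads the key of every line of the smaller file into a set. */
    static Set<String> loadKeys(Path smallFile) throws IOException {
        Set<String> keys = new HashSet<>();
        try (BufferedReader in = Files.newBufferedReader(smallFile, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                keys.add(keyOf(line));
            }
        }
        return keys;
    }

    /** Streams the larger file once, copies each line whose key is in the set, returns the match count. */
    static long writeMatches(Path largeFile, Set<String> keys, Path output) throws IOException {
        long matches = 0;
        try (BufferedReader in = Files.newBufferedReader(largeFile, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (keys.contains(keyOf(line))) { // constant-time lookup on average
                    out.write(line);              // or substring the fields that are needed
                    out.newLine();
                    matches++;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder file names, not taken from the thread
        Set<String> keys = loadKeys(Paths.get("small.txt"));
        long n = writeMatches(Paths.get("large.txt"), keys, Paths.get("matches.txt"));
        System.out.println(n + " matching lines written");
    }
}
```

Each file is read exactly once, so the work grows with the total number of lines rather than with 3 million times 250,000 comparisons, and neither file needs to be sorted first.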

 
Justin Char
Ranch Hand
Posts: 76
I don't understand what you mean, or how that would work.

I can read a file with BufferedReader, but I don't understand how to do the compare with the BufferedReader and the HashSet.
 
Justin Char
Ranch Hand
Posts: 76
I still don't understand how to get this to work.
 