
File content comparison

 
Dmitry Shekhter
Greenhorn
Posts: 26
Hi,
This is my current predicament: I need to compare the contents of two files and store the entries that are missing from fileA but present in fileB in fileC. How would I go about comparing the file entries? Let's say the files are in comma-delimited CSV format.
Thanks in advance
Dmitry
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Well, there are a number of possible strategies. I think we first need some more info about your problem.
Does each line have something that serves as a unique key? So that you can look at line 15 of fileA and line 22 of fileB, and tell that they refer to the same record, even though some of the other data on the line has changed?
Are the files sorted according to this key? Or more generally, are they sorted in any way?
Are these files particularly large? Is it feasible to store all the data from at least one of the files in memory somehow (e.g. in a HashMap or TreeMap) while performing the comparison?
You mention the possibility of lines which are in fileB but not fileA. Will you also need to detect lines which are in fileA but not fileB? Or lines which are in both, but some of the data in the line has changed?
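In the meantime, here is a rough sketch of the simplest case, just to make the in-memory idea concrete. It assumes that fileA fits in memory, that a record counts as "the same" only when the whole line matches exactly, and that sort order doesn't matter. The class name FileDiff, the method names, and the choice of a HashSet are my own placeholders for illustration, not something taken from your setup. If only one column is the key, you would store that column in the set (or a HashMap from key to line) instead of the full line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FileDiff {

    // Returns the lines of b that do not appear anywhere in a, in b's order.
    // Assumption: whole-line equality is the right notion of "same entry".
    static List<String> missingFromA(List<String> a, List<String> b) {
        Set<String> seen = new HashSet<>(a);
        List<String> missing = new ArrayList<>();
        for (String line : b) {
            // add() returns true only if the line was not already in the set,
            // so a line repeated within b is reported just once.
            if (seen.add(line)) {
                missing.add(line);
            }
        }
        return missing;
    }

    // Reads both files fully into memory and writes the difference to fileC.
    // Assumption: both files are small enough to load at once.
    static void writeMissing(Path fileA, Path fileB, Path fileC) throws IOException {
        List<String> a = Files.readAllLines(fileA);
        List<String> b = Files.readAllLines(fileB);
        Files.write(fileC, missingFromA(a, b));
    }
}
```

You could call it with something like FileDiff.writeMissing(Paths.get("fileA.csv"), Paths.get("fileB.csv"), Paths.get("fileC.csv")). If the files turn out to be too large for memory but are sorted on the key, a merge-style pass over both files would be a better fit than this.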
 