binary file comparison

 
Stephanie Smith
Greenhorn
Posts: 22
What is the best way to compare two binary files in Java?
Should I just read in the files using streams and compare each byte?

Or should I use checksums? Is one better than the other?
 
Joe Ess
Bartender
Posts: 9406
The only certain way to compare two files is to compare them byte by byte.
Since a checksum is a hash of the file contents, there's a chance that vastly different files will generate the same checksum, so checksums are not particularly useful for proving equality.
Checksums are useful for ensuring that a file has not been altered while it is stored or transmitted. To do this you'd generate one checksum before transmission and one after, then compare the two values. If the checksums are the same, the file contents should be unchanged.
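
For what it's worth, here is a rough sketch of the byte-by-byte approach, assuming Java 7 or later for `java.nio.file`. The class name `ByteCompare` and the method name `sameContent` are made up for illustration. It checks the file sizes first, since files of different length can never be equal, and then reads both streams until the first mismatch.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names are illustrative only.
class ByteCompare {

    /** Returns true only if both files have the same length and the same bytes. */
    static boolean sameContent(Path a, Path b) throws IOException {
        // Files of different sizes cannot be equal, so skip reading them.
        if (Files.size(a) != Files.size(b)) {
            return false;
        }
        try (InputStream inA = new BufferedInputStream(Files.newInputStream(a));
             InputStream inB = new BufferedInputStream(Files.newInputStream(b))) {
            int byteA;
            while ((byteA = inA.read()) != -1) {
                // Stops at the first differing byte, or if b ends early (-1).
                if (byteA != inB.read()) {
                    return false;
                }
            }
            // a is exhausted; b must be exhausted as well.
            return inB.read() == -1;
        }
    }
}
```

You would call it as `ByteCompare.sameContent(Paths.get("one.bin"), Paths.get("two.bin"))`. On Java 12 and later, `Files.mismatch(a, b)` does much the same job and returns `-1` when the files are identical.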
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Using checksums for file comparison may be useful if the files are frequently unequal, as it's usually very quick to see that two files are unequal. You can also quickly tell if two files are probably equal (often with very high probability, depending on the nature of the checksum), and maybe that's good enough for your application. If not, though, the only way to verify that two files really are equal is by comparing them byte-by-byte, as Joe says.

Also note that calculating checksums takes some time too. If you can do it as part of an initial download and then save the result somewhere - or more generally, if you can do it just once per file, and reuse the result multiple times - then it can be useful. But if you're starting from scratch and just need to compare two files (with no previous knowledge of their checksums), using checksums probably won't offer any advantage. Checksums are more useful when you have to do multiple comparisons between files which are frequently not equal.
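
To make the "compute once, reuse many times" idea concrete, here is a minimal sketch. It assumes SHA-256 via `java.security.MessageDigest` as the checksum (my choice, not something the thread prescribes; `java.util.zip.CRC32` would also fit) and a simple in-memory map as the place where results are saved. The class name `DigestCache` and its methods are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; the names are illustrative only.
class DigestCache {

    // Assumed storage: an in-memory map. It does not notice when a file
    // changes on disk, so a real application would need to invalidate entries.
    private final Map<Path, byte[]> digests = new HashMap<>();

    /** Computes the SHA-256 digest of a file once and reuses it afterwards. */
    byte[] digestOf(Path file) throws IOException, NoSuchAlgorithmException {
        byte[] cached = digests.get(file);
        if (cached != null) {
            return cached;
        }
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
        }
        byte[] digest = md.digest();
        digests.put(file, digest);
        return digest;
    }

    /** False means definitely different; true means only "probably equal". */
    boolean probablyEqual(Path a, Path b) throws IOException, NoSuchAlgorithmException {
        return MessageDigest.isEqual(digestOf(a), digestOf(b));
    }
}
```

A `false` result settles the question cheaply once the digests are cached. A `true` result still needs a byte-by-byte pass if you must be certain the files are equal.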
 