CRC for a file

 
Ranch Hand
Posts: 77
Hello all y' ranchers !

I have to compare a few files to each other. The best solution is running each file through some kind of hash algorithm, so the result will be either a hash, or a long, or anything that is 16 bits long...
I thought of using CRC32, but it works with 32 bits (as implied by the name...)

Any suggestion will help. I need something simple and reliable.

Thanks a lot.
Dave
 
Bartender
Posts: 9626

Originally posted by Dave Jones:
. . .or a long or anything that is 16 bit long...



You do realize that an int in Java is 32 bits and a long is 64?
Any reason for the 16 bit requirement?
 
author
Posts: 23951

Originally posted by Dave Jones:
I have to compare a few files to each other. The best solution is running each file through some kind of hash algorithm, so the result will be either a hash, or a long, or anything that is 16 bits long...
I thought of using CRC32, but it works with 32 bits (as implied by the name...)

Any suggestion will help. I need something simple and reliable.



Well, quite frankly, CRC32 is not reliable either -- it is entirely possible for two completely different files to match because their CRCs are the same.

This is true for any hash that you use. To represent what is potentially an unlimited amount of data in 32 bits, and expect the result to be completely unique, is ridiculous. The purpose of the hash is to change drastically when only small changes are encountered -- in effect, to detect small corruptions in the file.

Anyway, if you only want 16 bits, then use the first 16 bits, or the last 16 bits, or every other bit of the CRC32. You'll get more different files that match, but you'll get that with any hash algorithm that produces only 16 bits.
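To make the point about uniqueness concrete, here is a minimal sketch (not anything from the original posts) that keeps only 16 bits of a CRC32 and feeds it 65537 distinct inputs. There are only 65536 possible 16-bit values, so at least two inputs must share a checksum. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.CRC32;

class CollisionSketch {

    /** CRC32 of the four bytes of the given int, reduced to its low 16 bits. */
    static int crc16Bits(int input) {
        CRC32 crc = new CRC32();
        byte[] bytes = ByteBuffer.allocate(4).putInt(input).array();
        crc.update(bytes, 0, bytes.length);
        return (int) (crc.getValue() & 0xFFFF);
    }

    /** Returns two different inputs whose 16-bit checksums are equal. */
    static int[] findCollision() {
        Map<Integer, Integer> seen = new HashMap<>();
        // 65537 distinct inputs, but only 65536 possible 16-bit results.
        for (int i = 0; i <= 0x10000; i++) {
            Integer previous = seen.put(crc16Bits(i), i);
            if (previous != null) {
                return new int[] { previous, i };
            }
        }
        throw new IllegalStateException("unreachable: more inputs than checksum values");
    }

    public static void main(String[] args) {
        int[] pair = findCollision();
        System.out.println("Inputs " + pair[0] + " and " + pair[1]
                + " share the 16-bit checksum " + crc16Bits(pair[0]));
    }
}
```

The same argument applies to the full 32 bits; it just takes more inputs before a match is guaranteed.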

Henry
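For reference, the suggestion above could look roughly like this in Java: stream the file through java.util.zip.CRC32, then keep either half of the result. This is only a sketch; the class name, method names, and buffer size are assumptions, not something from the thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.CRC32;

class FileCrcSketch {

    /** Streams the file through CRC32 and returns the checksum as a long. */
    static long crc32Of(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                crc.update(buffer, 0, read);
            }
        }
        return crc.getValue();
    }

    /** Keeps only the low 16 bits of a CRC32 value. */
    static int low16(long crc32) {
        return (int) (crc32 & 0xFFFF);
    }

    /** Keeps only the high 16 bits of a CRC32 value. */
    static int high16(long crc32) {
        return (int) ((crc32 >>> 16) & 0xFFFF);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FileCrcSketch.java <fileA> <fileB>");
            return;
        }
        long a = crc32Of(Paths.get(args[0]));
        long b = crc32Of(Paths.get(args[1]));
        // Different checksums prove the files differ; equal checksums do not prove they are equal.
        System.out.println(a != b ? "files differ" : "checksums match (files may still differ)");
    }
}
```

If the checksums match and the answer matters, a byte-by-byte comparison of the two files is the only way to be sure.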
 
Dave Jones
Ranch Hand
Posts: 77
Hello, and thank you for your answer.
I can't just take the first/last 64 bits, since they will probably be identical although the files are different (the files are similar but not equal), so I need some kind of CRC.
 
Henry Wong
author
Posts: 23951

Originally posted by Dave Jones:
I can't just take the first/last 64 bits, since they will probably be identical although the files are different (the files are similar but not equal), so I need some kind of CRC.



Well, I was trying to save you some time, by suggesting that half of a CRC-32 is about as good as a CRC-16.

But if you disagree, then you already answered your question -- use a CRC-16 instead. It's not too hard to implement a CRC-16. I recall I implemented two different CRC-16 routines (many many years ago), in only a couple of hours.

Henry
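As an illustration of how small a CRC-16 routine can be, here is a bitwise sketch of one common variant, CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF). The thread does not say which CRC-16 variants were meant, so this choice, like the class and method names, is an assumption.

```java
import java.nio.charset.StandardCharsets;

class Crc16Sketch {

    /** Bitwise CRC-16/CCITT-FALSE: polynomial 0x1021, initial value 0xFFFF, no reflection. */
    static int crc16(byte[] data) {
        int crc = 0xFFFF;
        for (byte b : data) {
            // Fold the next byte into the top 8 bits of the register.
            crc ^= (b & 0xFF) << 8;
            for (int bit = 0; bit < 8; bit++) {
                if ((crc & 0x8000) != 0) {
                    crc = (crc << 1) ^ 0x1021;
                } else {
                    crc <<= 1;
                }
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        byte[] check = "123456789".getBytes(StandardCharsets.US_ASCII);
        // Standard check value for this variant is 0x29B1.
        System.out.println(String.format("%04X", crc16(check)));
    }
}
```

A table-driven version is faster on large files, but the bitwise loop is easier to check against the published test value.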
 
Dave Jones
Ranch Hand
Posts: 77
Thank you, Henry!

I already used the CRC32 class; it returns a long, and that is good for my uses.
But now a question has arisen:
Why is it called CRC32 if the 'getValue' method returns a long value?
Common logic says it should be called CRC64. Or did I miss something here?

Thanks again,
Dave
 
Henry Wong
author
Posts: 23951

Originally posted by Dave Jones:
I already used the CRC32 class; it returns a long, and that is good for my uses.
But now a question has arisen:
Why is it called CRC32 if the 'getValue' method returns a long value?
Common logic says it should be called CRC64. Or did I miss something here?



A CRC-32 is a 32-bit *unsigned* value. A Java int holds a 32-bit *signed* value, so it cannot hold the upper half of that range as positive numbers. I would venture a guess that only the lower 32 bits of the long are used.

Henry
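That guess is easy to check. The sketch below (class and method names are illustrative only) computes the CRC32 of the standard test string "123456789" and looks at where the bits end up in the returned long.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class Crc32WidthSketch {

    /** CRC32 of a byte array, as returned by java.util.zip.CRC32. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        long value = crc32("123456789".getBytes(StandardCharsets.US_ASCII));

        // The well-known CRC-32 check value for this input is cbf43926.
        System.out.println(Long.toHexString(value));

        // The upper 32 bits of the long are all zero.
        System.out.println((value >>> 32) == 0);

        // The top bit of the 32-bit value is set, so a plain int cast goes negative.
        System.out.println((int) value < 0);

        // Nothing is lost by the cast; reading the int as unsigned restores the value.
        System.out.println(Integer.toUnsignedLong((int) value) == value);
    }
}
```

So the long is just a convenient container: it lets the unsigned 32-bit checksum be returned as a non-negative number.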
 
Dave Jones
Ranch Hand
Posts: 77
I agree, it does seem logical.
Thanks again, Henry.
 