Find string in binary file

 
Ranch Hand
Posts: 56
I have a binary file that contains a string followed by compressed data. I wrote Java code to read the file into a byte array. The size of the byte array varies with the size of the data, and the position of the string can also vary. What is the best way to find the position of the string in the byte array?
Once I know the position of the string, I can create a new InputStream over the rest of the byte array and read the data, but finding the position of the string has stumped me.

Any help is much appreciated.

-Ravi
 
Sheriff
Posts: 22845
43
The best way to accomplish that is to design the contents of the file so that you know the things which you now don't know. One common technique is to prefix the binary portion by, say, a four-byte integer value containing the number of bytes in the binary portion.

But perhaps that has already been done? You could ask the person who gave you the file.
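
For illustration, here is a minimal sketch of the length-prefix technique described above. It is not the vendor's format; the class name `LengthPrefixed` and its methods are hypothetical, and it assumes a four-byte big-endian length, which is `ByteBuffer`'s default byte order.

```java
import java.nio.ByteBuffer;

class LengthPrefixed {
    // Builds a record: a 4-byte big-endian length followed by the payload bytes.
    static byte[] encode(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Reads the length first, then copies exactly that many payload bytes.
    static byte[] decode(byte[] record) {
        ByteBuffer buf = ByteBuffer.wrap(record);
        int length = buf.getInt();
        byte[] payload = new byte[length];
        buf.get(payload);
        return payload;
    }
}
```

With a prefix like this, a reader never has to search for the start or end of the binary portion; it reads the length and then knows how many bytes to consume.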
 
Saloon Keeper
Posts: 7993
143
You say the String is located before the compressed data. Isn't the string at the start of the file then? You have to be more clear about the format of the file.
 
Ravi Shankarappa
Ranch Hand
Posts: 56
Paul Clapham wrote: The best way to accomplish that is to design the contents of the file so that you know the things which you now don't know. One common technique is to prefix the binary portion by, say, a four-byte integer value containing the number of bytes in the binary portion.

But perhaps that has already been done? You could ask the person who gave you the file.


This is a file generated by a vendor's program. Immediately following the "identifier string" are two integer values that define the total number of bytes that follow. Other than that, not much is fixed.
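
As a rough sketch of reading those two integers once the end of the identifier string is known: the class `SizeFields`, the method `readTwoInts`, and the `offset` parameter are hypothetical names, and the sketch assumes each value is a 4-byte integer and that the byte order has already been determined from elsewhere in the file.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class SizeFields {
    // Reads two consecutive 4-byte integers starting at offset,
    // interpreting them in the given byte order.
    static int[] readTwoInts(byte[] data, int offset, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(data, offset, 8).order(order);
        return new int[] { buf.getInt(), buf.getInt() };
    }
}
```

The bytes after `offset + 8` could then be handed to `new ByteArrayInputStream(data, offset + 8, data.length - offset - 8)` to read the remaining data as a stream.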
 
Ravi Shankarappa
Ranch Hand
Posts: 56
Stephan van Hulst wrote: You say the String is located before the compressed data. Isn't the string at the start of the file then? You have to be more clear about the format of the file.


Correct; however, the string is not at the start of the file. A number of other parameters precede the "identifier string", and their length is not fixed. Hence the need to find the start of the string.

Since the size of the file is unknown at the start of the program, I use the following code to read all the bytes:


So now I have all the bytes in the byte array fileContents. What is the position of the string, let's say "Here34503450"? The size of the image is 3450 * 3450, and each dimension is represented by a 4-byte integer. Depending on the machine on which the file was created, the values could be little-endian or big-endian; this information is contained at the beginning of the file, so reading those two numbers is easy. However, where is the "H" in "Here" located?
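
One possible approach, sketched here under assumptions, is a plain linear scan of the byte array for the bytes of the identifier. The class `ByteSearch` and its methods are hypothetical, and the sketch assumes the identifier is stored as single-byte ASCII characters. It returns the first match only, so it would give the wrong position if the same byte sequence happened to appear in the parameters that precede the identifier.

```java
import java.nio.charset.StandardCharsets;

class ByteSearch {
    // Returns the index of the first occurrence of pattern in data, or -1 if absent.
    static int indexOf(byte[] data, byte[] pattern) {
        if (pattern.length == 0) {
            return 0;
        }
        outer:
        for (int i = 0; i <= data.length - pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer; // mismatch, try the next start position
                }
            }
            return i;
        }
        return -1;
    }

    // Convenience overload: searches for the ASCII bytes of text.
    static int indexOfAscii(byte[] data, String text) {
        return indexOf(data, text.getBytes(StandardCharsets.US_ASCII));
    }
}
```

For the example above, `ByteSearch.indexOfAscii(fileContents, "Here")` would give the position of the "H", and the two 4-byte integers would start four bytes after it.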

-Ravi

*Note: Edited to correct a mistake
 