
Variable length integer

 
Greenhorn
Posts: 18
Eclipse IDE Opera Windows
Hello Ranch.

I'm trying to make a parser to analyse .dem files (Valve Source demo file format).
My current problem is that I have to read variable length integers stored in 4 bytes.
However, I have no idea how variable length integers work. I have tried Google and YouTube with little luck.

I am trying to write something similar to this ReadVarInt32() C++ method made by Valve, but I cannot read all of the C++ code.

Currently I'm checking if the high-bit is set, with:

However, even if this check is correct and the high bit is set, I don't know what to do next.
If anyone could give sample code, hints, or somewhere to watch or read about variable length integers, I would be very grateful.

- Christoffer Nilsen
 
Ranch Hand
Posts: 310
18
Linux MS IE
You can construct an integer from a variable number of bytes with BigInteger, using the BigInteger(int signum, byte[] magnitude) constructor. The first parameter is the sign, and the second parameter is the byte array.
 
Christoffer Nilsen
Greenhorn
Adam Scheller wrote:You can construct an integer from variable-length bytes with BigInteger, with this constructor. First parameter is a sign, the second parameter is byte array.

Thank you for answering.
While I do not fully understand the signum, I tried all of -1, 0 and 1, and all of the results I got were "wrong".

Half of the results I get are negative, while none of them should be.


This prints: -48 52 0 0
and new BigInteger(1, data).intValue(); prints: 801898496
 
Sheriff
Posts: 22846
43
Eclipse IDE Firefox Browser MySQL Database
I think we would have to understand what "variable length integers stored in 4 bytes" means. (Storing variable-length integers in a fixed-length container already gives me cognitive dissonance.) I followed the link you posted and it didn't use the word "variable" anywhere, so it didn't help me. It seemed to say that there were integer values in the data, and it seemed to be referring to C++ coding so probably those are 32-bit integers.

You might want to look at DataInputStream for reading this kind of file; there are a few things to watch out for though. First of all what those C guys refer to as a String is actually an array of bytes to a Java programmer. And there's the little-endian versus big-endian way of storing integers, although as far as I know most modern computers all use the same endian-ness these days.

 
Andrew Polansky
Ranch Hand
I got a little lost too. Christoffer, maybe you just need to extract an integer stored in 4 bytes, not a variable-length value?
 
Christoffer Nilsen
Greenhorn
Paul Clapham wrote:I think we would have to understand what "variable length integers stored in 4 bytes" means. (Storing variable-length integers in a fixed-length container already gives me cognitive dissonance.) I followed the link you posted and it didn't use the word "variable" anywhere, so it didn't help me. It seemed to say that there were integer values in the data, and it seemed to be referring to C++ coding so probably those are 32-bit integers.

You might want to look at DataInputStream for reading this kind of file; there are a few things to watch out for though. First of all what those C guys refer to as a String is actually an array of bytes to a Java programmer. And there's the little-endian versus big-endian way of storing integers, although as far as I know most modern computers all use the same endian-ness these days.



Thanks for the reply.
To be honest, I'm not sure I have been using the term correctly.
I ran into this problem when I used the .readInt() method of a DataInputStream object: it did not give me the correct results.

So after a long Google session I found this thread: http://dev.dota2.com/showthread.php?t=32868
There, a person had the same problem and solved it (though never revealing the solution). A user on that forum wrote:
Your treatment of the VarInt32 is still not quite right. It is a variable length integer, so it can be stored in anywhere between 1 and 4 bytes depending on how big the integer actually is. If the highest bit (The & 0x80) is set, then that indicates that there is another byte that follows. If this byte isn't set then it is the end of the integer...snip


Considering I had never heard of variable length integers I was at a loss.
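
For readers meeting the term for the first time, the rule quoted above (the high bit of each byte says whether another byte follows) can be sketched in Java as below. This is only an illustration under assumptions, not Valve's code: it assumes the protobuf-style base-128 layout in which the low 7 bits of each byte carry data, least significant group first. Under that layout a full 32-bit value can take up to 5 bytes, so the loop allows 5. The class and method names are made up for the example.

```java
import java.nio.ByteBuffer;

final class VarIntSketch {
    /** Decodes one base-128 varint starting at the buffer's current position. */
    static int readVarInt32(ByteBuffer buf) {
        int result = 0;
        // Five groups of 7 bits are enough to cover 32 bits (assumed layout).
        for (int shift = 0; shift < 35; shift += 7) {
            int b = buf.get() & 0xFF;        // view the signed Java byte as 0..255
            result |= (b & 0x7F) << shift;   // low 7 bits are payload
            if ((b & 0x80) == 0) {           // high bit clear: this was the last byte
                return result;
            }
        }
        throw new IllegalArgumentException("varint is longer than 5 bytes");
    }

    public static void main(String[] args) {
        // 0xAC has its high bit set, so 0x02 follows: 44 + (2 << 7) = 300
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {(byte) 0xAC, 0x02});
        System.out.println(readVarInt32(buf));
    }
}
```

Note that a value encoded this way occupies a different number of bytes depending on its size, which is what distinguishes it from a fixed 4-byte field.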

It seems the PHP example on the Valve site does the following (if that's of any help):
 
Christoffer Nilsen
Greenhorn
Adam Scheller wrote:I got a little lost too. Christoffer, maybe you just need to extract an integer stored in 4 bytes, not a variable-length value?

Maybe. To be honest I have no idea; I just learned the term from a user on another forum, as written in the post above.

My "quest" is the following. I have .dem files which I would like to analyze.
So I started by reading the header, which has the format shown on the Valve site linked in the first post.

I can extract all the Strings, such as Header, Server name, Client name, Map name and Game directory, by using the following code in the correct "order":

However, if I do something similar for the integer fields in the header, I get strange results.
If I use .readInt() I get enormous values, some of them negative, which is wrong:
67108864 -801898496 68944384 -334692096 -172488704

When I found out this was wrong I went on a Google hunt, and found someone telling another person that these were variable-length integers, so .readInt() does not work.
However, I never fully understood the code or the concept.
 
Rancher
Posts: 2240
28
Can you post a sample: the contents of the 4 bytes and the correct int value that should be derived from them?
 
Christoffer Nilsen
Greenhorn
Norm Radder wrote:Can you post a sample: the contents of the 4 bytes and the correct int value that should be derived from them?

That's kind of the problem. Other than that ticks / playback time should be about 32, and that the demo protocol and network protocol should be smaller numbers,
I have no idea what the correct values would be, except that they cannot be negative, and most likely some are quite large numbers.

If I read the 4 bytes with:

And print out each element in the byte array I get the following:
DEMOPROTOCOL:
4 0 0 0
NETWORKPROTOCOL:
-48 52 0 0
TICKS:
4 28 2 0
FRAMES:
-20 13 1 0
SIGNONLENGTH:
-11 -72 8 0
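
The negative entries in the dump above come from Java's byte type being signed: -48 is the same bit pattern as 208. A minimal sketch of combining four such bytes by hand follows, assuming the first byte is the least significant (little-endian). The class and method names are made up for the example, and whether the results are the "right" header values is not confirmed by anything in the file format notes quoted here.

```java
final class LittleEndianBytes {
    /** Combines 4 bytes into an int, treating b[0] as the least significant byte. */
    static int toInt(byte[] b) {
        return (b[0] & 0xFF)            // & 0xFF maps -48 to 208, and so on
             | (b[1] & 0xFF) << 8
             | (b[2] & 0xFF) << 16
             | (b[3] & 0xFF) << 24;
    }

    public static void main(String[] args) {
        System.out.println(toInt(new byte[] {4, 0, 0, 0}));      // 4
        System.out.println(toInt(new byte[] {-48, 52, 0, 0}));   // 13520
        System.out.println(toInt(new byte[] {4, 28, 2, 0}));     // 138244
        System.out.println(toInt(new byte[] {-20, 13, 1, 0}));   // 69100
        System.out.println(toInt(new byte[] {-11, -72, 8, 0}));  // 571637
    }
}
```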
 
 
Paul Clapham
Sheriff
Those examples look like little-endian integers to me. (Notice that there's always a zero at the end, so that's most likely the most significant byte.)

So I would suggest using readInt() from a DataInputStream and getting an integer from those 4 bytes. (Looks like you might have done that already.) Then use the Integer.reverseBytes() method to convert from little-endian to big-endian and see if the numbers you get look like what you're expecting.
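
The suggestion above could look roughly like this. It is a sketch only: the class name and the decode helper, which wraps a byte array so the example runs without a real .dem file, are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ReverseBytesSketch {
    /** Reads 4 bytes big-endian (what readInt() does), then swaps the byte order. */
    static int readIntLE(DataInputStream in) throws IOException {
        return Integer.reverseBytes(in.readInt());
    }

    /** Hypothetical helper: runs readIntLE over an in-memory byte array. */
    static int decode(byte[] raw) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            return readIntLE(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The NETWORKPROTOCOL bytes from the earlier post.
        System.out.println(decode(new byte[] {-48, 52, 0, 0})); // 13520
    }
}
```

Without the reverseBytes() call, the same four bytes come out as -801898496, which matches the value reported earlier in the thread.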
 
Christoffer Nilsen
Greenhorn
Paul Clapham wrote:Those examples look like little-endian integers to me. (Notice that there's always a zero at the end, so that's most likely the most significant byte.)

So I would suggest using readInt() from a DataInputStream and getting an integer from those 4 bytes. (Looks like you might have done that already.) Then use the Integer.reverseBytes() method to convert from little-endian to big-endian and see if the numbers you get look like what you're expecting.

Thanks a lot, that seems correct. At least all the results are now plausible values in the expected range, which suggests it is right.

Now I only need to find a similar solution for my only float value.
Edit: This seems to work:

Also, do you know of any place where I can read about this, or watch a video about it?
Was the term "variable length integers" wrong in this setting?
 
Andrew Polansky
Ranch Hand
Those are not variable length integers, as they are always stored in 4 bytes. A variable length integer has no fixed size: each value can occupy a different number of bytes, and some could be longer than 4.

What you are doing here is simply reading the binary representation of 4-byte little-endian integers. You can learn more about that on Wikipedia by looking up the terms "binary system", "little endian" and "big endian".
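
As an alternative to swapping bytes after the fact, java.nio.ByteBuffer can be told the byte order up front, and the same approach covers a 4-byte float. A minimal sketch, assuming the fields are plain little-endian values; the class and method names are made up for the example:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class HeaderFieldsSketch {
    /** Interprets 4 bytes as a little-endian 32-bit integer. */
    static int intLE(byte[] raw) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    /** Interprets 4 bytes as a little-endian IEEE 754 single-precision float. */
    static float floatLE(byte[] raw) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getFloat();
    }

    public static void main(String[] args) {
        System.out.println(intLE(new byte[] {4, 28, 2, 0}));              // 138244
        System.out.println(floatLE(new byte[] {0, 0, (byte) 0x80, 0x3F})); // 1.0
    }
}
```

When reading through a DataInputStream instead, Float.intBitsToFloat(Integer.reverseBytes(in.readInt())) should give the same float result.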
 