
Signed bytes

 
Nikolay Bychinskiy
Greenhorn
Posts: 8
After 3 years of working on an internal framework that works mostly with binary data, all I can say to whoever had the genius idea of making bytes in Java signed is: burn in hell
 
Winston Gutkowski
Bartender
Posts: 10575
Nikolay Bychinskiy wrote:After 3 years of working on an internal framework that works mostly with binary data, all I can say to whoever had the genius idea of making bytes in Java signed is: burn in hell

What are you doing? I've never found it much of a problem, and you know what they say about bad workmen...

And if it is such a pain, did you ever consider that Java might not be the best language for the job?

Winston
 
Henry Wong
author
Sheriff
Posts: 23295
Winston Gutkowski wrote:
What are you doing? I've never found it much of a problem, and you know what they say about bad workmen...

And if it is such a pain, did you ever consider that Java might not be the best language for the job?


I never found it a problem either, but I have years of experience manipulating bits for both graphics engines and network packets... but I can definitely see the issue. It does require a comfortable level of understanding of sign extension, shifting, and the bitwise operators.

Henry
 
Nikolay Bychinskiy
Greenhorn
Posts: 8
It is not a problem; I understand sign extension and the difference between signed and unsigned operations. But really, WHY is byte signed in the first place? Is there any application for signed bytes in the real world? It's so easy to forget to use >>> instead of >>, or to add & 0xFF at the end of the expression when converting to int, which causes annoying bugs. And it's not like Java doesn't have unsigned types at all: char is unsigned. I haven't found even a single reason to use a signed byte; every time you use one, you need to do unsigned operations on it. To add even more: to do any operation on a byte, Java widens it to an int, and using >>> doesn't help at all. Because of the sign extension, once the result is cast back to a byte it is exactly the same as with >>, unless you're shifting by more than 24 bits, which doesn't make sense for an 8-bit value. So every bit shift expression looks like this: (byte)((b & 0xFF) >> 5), when in languages with unsigned bytes b >> 5 would be enough.
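To make the shifting complaint concrete, here is a minimal sketch. The class and method names are made up for this illustration; only the operators themselves come from the post.

```java
final class ByteShiftDemo {
    // Arithmetic shift: the byte is sign-extended to an int first,
    // so copies of the sign bit are shifted in.
    static byte arithmeticShift(byte b, int n) {
        return (byte) (b >> n);
    }

    // >>> operates on the promoted int: zeros enter at bit 31, not bit 7,
    // so for small shift counts the low 8 bits match the >> result.
    static byte logicalShiftOnPromotedInt(byte b, int n) {
        return (byte) (b >>> n);
    }

    // Mask first so the value is treated as 0..255, then shift.
    static byte unsignedShift(byte b, int n) {
        return (byte) ((b & 0xFF) >> n);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xF0; // bit pattern 1111_0000, value -16
        System.out.println(arithmeticShift(b, 5));           // -1
        System.out.println(logicalShiftOnPromotedInt(b, 5)); // -1
        System.out.println(unsignedShift(b, 5));             // 7
    }
}
```

Only the masked version yields the 0000_0111 pattern that a C programmer would expect from an unsigned char.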
 
Winston Gutkowski
Bartender
Posts: 10575
Nikolay Bychinskiy wrote:I haven't found even a single reason to use a signed byte; every time you use one, you need to do unsigned operations on it.

Well, just one possibility might be that the basic operational unit in Java is an int, not a byte; so pretty much any operation you do on a byte will result in an int. And since two's-complement arithmetic works just fine with sign extension all you should generally have to do is cast the result back to a byte. I suspect strongly that it's you that's making assumptions about the sign when, very often, you can just forget about it.
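A small sketch of that claim, with hypothetical names: addition followed by a cast back to byte gives the same bit pattern whether the operands are thought of as signed or unsigned, while comparison is one place where the interpretation does change the answer.

```java
final class ByteArithmeticDemo {
    // Adds two bytes; the int result is narrowed back to 8 bits.
    static byte add(byte a, byte b) {
        return (byte) (a + b);
    }

    // Reads the 8 bits as an unsigned value in the range 0..255.
    static int unsignedValue(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte a = (byte) 200; // stored as -56
        byte b = (byte) 100;
        byte sum = add(a, b); // (200 + 100) mod 256 = 44
        System.out.println(unsignedValue(sum)); // 44

        // Comparison depends on the interpretation:
        System.out.println(a < b);                               // true, since -56 < 100
        System.out.println(unsignedValue(a) < unsignedValue(b)); // false, since 200 > 100
    }
}
```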

Shifting is a different matter (although the same caveat applies - the result will be an int, not a byte), and I hate to say, but you chose to use Java for three years...

Winston
 
Nikolay Bychinskiy
Greenhorn
Posts: 8
Winston Gutkowski wrote:
Nikolay Bychinskiy wrote:I haven't found even a single reason to use a signed byte; every time you use one, you need to do unsigned operations on it.

Well, just one possibility might be that the basic arithmetic unit in Java is an int, not a byte; so pretty much any operation you do on a byte will result in an int. And since two's-complement works just fine with sign extension all you should generally have to do is cast the result back to a byte. I suspect strongly that it's you that's making assumptions about the sign when, very often, you can just forget about it.

Winston

It doesn't matter what the basic arithmetic unit in Java is. They could just as well have made bytes unsigned and converted them to ints without sign extension, just like they already do with chars. Can you give a single example of an application for signed bytes? I don't know of any, and neither does anyone I know. It defeats the reason to have a byte in the language at all.
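The difference between the two widening conversions can be seen in a few lines. This is only an illustrative sketch; the class and method names are invented here.

```java
final class WideningDemo {
    // byte -> int widening copies the sign bit into the upper 24 bits.
    static int widenByte(byte b) {
        return b;
    }

    // char -> int widening fills the upper 16 bits with zeros.
    static int widenChar(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(widenByte((byte) 0xFF));   // -1
        System.out.println(widenChar((char) 0xFFFF)); // 65535
    }
}
```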
 
Winston Gutkowski
Bartender
Posts: 10575
Nikolay Bychinskiy wrote:It's so easy to forget to use >>> instead of >>, or to add & 0xFF at the end of the expression...

Why? Surely it's a simple matter to create a shift() method that does the job for you? Sounds to me like a design issue...layers of indirection and all...
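Such a helper might look like the sketch below. The class ByteBits and its methods are hypothetical and are not taken from any library or from either poster's code; the idea is only that the masking is written once and never forgotten again.

```java
final class ByteBits {
    private ByteBits() {}

    /** Logical right shift of an 8-bit value: zeros enter at bit 7. */
    static byte shiftRight(byte b, int n) {
        if (n < 0 || n > 7) {
            throw new IllegalArgumentException("shift must be 0..7: " + n);
        }
        return (byte) ((b & 0xFF) >>> n);
    }

    /**
     * Extracts 'length' bits starting at bit 'offset' (bit 0 is the least
     * significant bit) and returns them as a non-negative int.
     */
    static int bits(byte b, int offset, int length) {
        if (offset < 0 || length < 1 || offset + length > 8) {
            throw new IllegalArgumentException("bad bit range");
        }
        return ((b & 0xFF) >>> offset) & ((1 << length) - 1);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xB4; // 1011_0100
        System.out.println(shiftRight(b, 4)); // 11 (0x0B)
        System.out.println(bits(b, 4, 4));    // 11, the high nibble
        System.out.println(bits(b, 2, 3));    // 5, bits 2..4 are 101
    }
}
```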

Winston
 
Nikolay Bychinskiy
Greenhorn
Posts: 8
Winston Gutkowski wrote:
Nikolay Bychinskiy wrote:It's so easy to forget to use >>> instead of >>, or to add & 0xFF at the end of the expression...

Why? Surely it's a simple matter to create a shift() method that does the job for you? Sounds to me like a design issue...layers of indirection and all...

Winston

Yeah, OK, let's do it the Java way and make a BitwiseOperationFactory with many implementations, shall we? *sarcasm*
If you need to implement a method for something as simple as shifting bits in a byte - there must be some issue with the language design, no? Shouldn't it already be implemented in a general purpose language such as Java?
 
Winston Gutkowski
Bartender
Posts: 10575
Nikolay Bychinskiy wrote:It doesn't matter what the basic arithmetic unit in Java is. They could just as well have made bytes unsigned and converted them to ints without sign extension, just like they already do with chars. Can you give a single example of an application for signed bytes? I don't know of any, and neither does anyone I know.

Sure:
It defeats the reason to have a byte in the language at all.

No. It's a PITA for you, because you're doing all sorts of things with bytes that the designers probably never considered when they were designing a language for OO applications. Bytes, for example, are the standard unit of I/O; and in general, when you're dealing with I/O you really don't care if the unit is signed or not; it's simply 8 bits.

Winston
 
Winston Gutkowski
Bartender
Posts: 10575
Nikolay Bychinskiy wrote:If you need to implement a method for something as simple as shifting bits in a byte - there must be some issue with the language design, no? Shouldn't it already be implemented in a general purpose language such as Java?

But it is: '&' and '>>>' (if that's what you need).

And I dispute whether you are actually using the language for "general purposes". Sounds not, to me.

Fun discussion though. And, as an old C bod, I do take your point; just playing the Devil's advocate...

Winston
 
Nikolay Bychinskiy
Greenhorn
Posts: 8
Winston Gutkowski wrote:
Nikolay Bychinskiy wrote:It doesn't matter what the basic arithmetic unit in Java is. They could just as well have made bytes unsigned and converted them to ints without sign extension, just like they already do with chars. Can you give a single example of an application for signed bytes? I don't know of any, and neither does anyone I know.

Sure:
It defeats the reason to have a byte in the language at all.

No. It's a PITA for you, because you're doing all sorts of things with bytes that the designers probably never considered when they were designing a language for OO applications. Bytes, for example, are the standard unit of I/O; and in general, when you're dealing with I/O you really don't care if the unit is signed or not; it's simply 8 bits.

Winston

Really?

BTW, doesn't it contradict your own statement that when working with bytes 'you really don't care if the unit is signed or not; it's simply 8 bits'? In this case I think code like b & 0x7F is more readable and shows your intentions more clearly; it also works fine with unsigned bytes. Also, you're forgetting that after I/O operations you most likely have to understand what you just received, am I not right? In my case I need to deserialize data from the stream, and no, there is no public implementation, because this is our internal format, which has been in use for almost 15 years, and no one will be changing it to something that is popular today.
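As a rough sketch of what that kind of deserialization involves: the internal format is not described in this thread, so the big-endian byte order and the 32-bit field below are pure assumptions, and the class and method names are invented. DataInputStream.readUnsignedByte() is a standard JDK method.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class UnsignedReadDemo {
    // Combines four bytes into an unsigned 32-bit value held in a long.
    // Big-endian order is assumed here. Every byte is masked with 0xFF;
    // without the mask, a byte >= 0x80 would sign-extend and set all the
    // higher bits of the result.
    static long readUInt32BE(byte[] buf, int off) {
        return ((long) (buf[off] & 0xFF) << 24)
                | ((buf[off + 1] & 0xFF) << 16)
                | ((buf[off + 2] & 0xFF) << 8)
                | (buf[off + 3] & 0xFF);
    }

    // Reads the first byte of the data as a value in 0..255.
    static int readFirstUnsigned(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readUnsignedByte();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFF, (byte) 0xFE, (byte) 0xFD, (byte) 0xFC};
        System.out.println(readFirstUnsigned(data)); // 255
        System.out.println(readUInt32BE(data, 0));   // 4294901244
    }
}
```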
 
Nikolay Bychinskiy
Greenhorn
Posts: 8
Winston Gutkowski wrote:
Nikolay Bychinskiy wrote:If you need to implement a method for something as simple as shifting bits in a byte - there must be some issue with the language design, no? Shouldn't it already be implemented in a general purpose language such as Java?

But it is: '&' and '>>>' (if that's what you need).

And I dispute whether you are actually using the language for "general purposes". Sounds not, to me.

Fun discussion though. And, as an old C bod, I do take your point; just playing the Devil's advocate...

Winston

Haha, maybe
 
Winston Gutkowski
Bartender
Posts: 10575
Nikolay Bychinskiy wrote:Yeah, OK, let's do it the Java way and make a BitwiseOperationFactory with many implementations, shall we?

Why not? I wrote one about four years ago, and I find it incredibly useful.

It's possibly also worth mentioning that final utility methods can often be inlined by modern compilers, so you may well not be incurring any processing overhead by doing it.

Winston
 
Jesper de Jong
Java Cowboy
Sheriff
Posts: 16060
When Java was designed in the 1990s, C++ was a very popular language (in fact, I was mainly programming in C++ myself at that time). The designers of the Java programming language wanted to create a language that looked familiar to C++ programmers, but that left out the unnecessary and complicated features of C++.

They thought that unsigned was one of those C++ features that wasn't necessary. So now we only have signed byte, short, int and long types in Java, and the unsigned keyword can no longer be added to Java very easily without potentially breaking backward compatibility.

I wouldn't mind if Java had unsigned. But adding it to the language is much more complicated than just adding a new keyword. It would mean we would suddenly have four new primitive types, and conversion rules between signed and unsigned types would need to be defined very carefully, etc.
 
Nikolay Bychinskiy
Greenhorn
Posts: 8
Winston Gutkowski wrote:
Nikolay Bychinskiy wrote:Yeah, OK, let's do it the Java way and make a BitwiseOperationFactory with many implementations, shall we?

Why not? I wrote one about four years ago, and I find it incredibly useful.

It's possibly also worth mentioning that final utility methods can often be inlined by modern compilers, so you may well not be incurring any processing overhead by doing it.

Winston

Well, I actually did it after I fixed the same bug with unintended sign extension for something like the 100th time. I also find the Guava Unsigned* classes useful, but I still think this is something that should be in the language.
 
Winston Gutkowski
Bartender
Posts: 10575
Nikolay Bychinskiy wrote:BTW, doesn't it contradict your own statement that when working with bytes 'you really don't care if the unit is signed or not; it's simply 8 bits'? In this case I think code like b & 0x7F is more readable and shows your intentions more clearly; it also works fine with unsigned bytes. Also, you're forgetting that after I/O operations you most likely have to understand what you just received, am I not right?

Only in cases where some conversion is needed, and in those, does it really matter whether the first bit is a sign or not?

The fact of the matter is that Java's internal representation of text is a char, which is a 16-bit Unicode value; and the correct way to read text is with a Reader. If you're reading in a JPEG image, do you really care if the bytes that you're reading are signed or not?

It also occurs to me that if you absolutely must have unsigned values, why not convert your bytes to chars?
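One caveat with that idea, shown as a small sketch with made-up names: a plain cast from byte to char still goes through a sign-extending widening to int before narrowing, so the mask is needed here as well.

```java
final class ByteToCharDemo {
    // byte -> char is a widening (sign-extending) step to int followed by
    // narrowing to 16 bits, so 0xFF becomes 0xFFFF, not 0x00FF.
    static char naive(byte b) {
        return (char) b;
    }

    // Masking first keeps only the low 8 bits.
    static char masked(byte b) {
        return (char) (b & 0xFF);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFF;
        System.out.println((int) naive(b));  // 65535
        System.out.println((int) masked(b)); // 255
    }
}
```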

Winston
 
Nikolay Bychinskiy
Greenhorn
Posts: 8
Jesper de Jong wrote:When Java was designed in the 1990s, C++ was a very popular language (in fact, I was mainly programming in C++ myself at that time). The designers of the Java programming language wanted to create a language that looked familiar to C++ programmers, but that left out the unnecessary and complicated features of C++.

They thought that unsigned was one of those C++ features that wasn't necessary. So now we only have signed byte, short, int and long types in Java, and the unsigned keyword can no longer be added to Java very easily without potentially breaking backward compatibility.

I wouldn't mind if Java had unsigned. But adding it to the language is much more complicated than just adding a new keyword. It would mean we would suddenly have four new primitive types, and conversion rules between signed and unsigned types would need to be defined very carefully, etc.

Well, first of all, there would be only 3 new primitive types if they added an unsigned counterpart for every signed type: there is already an unsigned short, which is char. But there is no need for unsigned ints/longs in most cases, whereas every time you use a byte you most likely intend to do only unsigned operations on it, or no operations at all. I understand that it's not easy to add a new primitive type; I just don't understand why they chose to add a signed byte instead of an unsigned one. Also, I heard they are considering abandoning primitive types in later versions (I suppose they will still be used internally when possible, but they will no longer be part of the language, and the current primitive types will be just synonyms for their object types, similar to the CLR).

Btw, Java 8 already has standard methods for unsigned conversions and unsigned operations. But Java 8 has still not been released and... Really? It took them more than 10 years to include these in the standard library?
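For reference, a short sketch of some of those Java 8 methods. The JDK calls (Byte.toUnsignedInt, Integer.toUnsignedString, Integer.compareUnsigned, Integer.divideUnsigned, Integer.parseUnsignedInt) are standard; the wrapper class and the highNibble helper are invented for this example.

```java
final class Java8UnsignedDemo {
    // Hypothetical helper: upper four bits of a byte as a value in 0..15.
    static int highNibble(byte b) {
        return Byte.toUnsignedInt(b) >>> 4;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xF0;
        System.out.println(Byte.toUnsignedInt(b));                   // 240
        System.out.println(highNibble(b));                           // 15
        System.out.println(Integer.toUnsignedString(-1));            // 4294967295
        System.out.println(Integer.compareUnsigned(-1, 1) > 0);      // true
        System.out.println(Integer.divideUnsigned(-1, 2));           // 2147483647
        System.out.println(Integer.parseUnsignedInt("4294967295"));  // -1
    }
}
```

These are static helpers on the existing wrapper classes; the primitive types themselves stay signed.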
 
Winston Gutkowski
Bartender
Posts: 10575
Nikolay Bychinskiy wrote:Btw, Java 8 already has standard methods for unsigned conversions and unsigned operations. But Java 8 has still not been released and... Really? It took them more than 10 years to include these in the standard library?

I say again: your standard library. I think I can safely say that very few Java programmers ever worry about the things you do, because what you're doing is simply not "general purpose".

Personally, I'm darn glad I don't have to write all those "unsigned"s any more, and can save my old carpals for better things.

Winston
 
Jesper de Jong
Java Cowboy
Sheriff
Posts: 16060
They probably made byte signed to make it consistent with short, int and long. Maybe it would have been more useful to make it unsigned, for the reasons you mentioned. In C#, a byte is indeed unsigned.

Java 8 is not far off, it will be released next month (18 March 2014).

(Unfortunately that doesn't mean we can immediately start using it at work... the client I'm working for now just switched from Java 6 to Java 7).
 
Winston Gutkowski
Bartender
Posts: 10575
Jesper de Jong wrote:They probably made byte signed to make it consistent with short, int and long. Maybe it would have been more useful to make it unsigned, for the reasons you mentioned. In C#, a byte is indeed unsigned.

At the risk of flogging a dead horse, this page contains some interesting info on the subject, in particular an interview snippet from Gosling and also a notation from the Oak 0.2 specification, which suggests that it was thought about (both near the bottom).

Winston
 