Size of byte array vs byte array converted into string containing the bytes

 
Ranch Hand
Hi,
I have a question which probably isn't specific to Java, but I'm not sure where else it would go.

My question is basically about how data is actually used at a lower level.

If I have an image file, and I convert it into a byte array, and then convert that byte array into a string containing the numbers, which one would be smaller in size?

To add to that question, is there a difference in response size or speed depending on whether the response body contains a byte array or a string containing the numbers?

Also, is the answer different if using hex?

Thank you
 
Marshal

Al Hobbs wrote:If I have an image file, then I convert it into a byte array, then I convert that byte array into a string containing the numbers, which one would be smaller in size?  



You're proposing to convert each byte in the array to part of a string containing some numbers? It isn't clear what numbers you are talking about, but at any rate each char in a String requires two bytes to store. And no matter what conversion you had in mind, it's likely that each byte will convert into at least one char. I expect you can do the math from there.

To add on to that question, is there a difference in response size/ speed if a byte array is in the response body or a string containing the numbers?



A byte array is most likely going to be the same size whether or not it's in the response body, but that depends on what kind of response body you had in mind. Your String object, however, is going to be converted into a series of chars, or maybe a series of bytes, again depending on what kind of response body you had in mind and how you put the String into it. As for speed, there are a lot of things which might affect that, including buffer sizes, packet sizes... depending on what happens to that response body.

Also, is the answer different if using hex?



Probably not, but again it depends on how your conversion works.
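
To make "do the math" concrete, here is a minimal sketch of one possible conversion. It assumes the "numbers" are the unsigned decimal values of the bytes separated by commas; that format, the class name ByteArrayVsDecimalString and the method toDecimalString are only illustrations, not something the original poster specified.

```java
import java.nio.charset.StandardCharsets;
import java.util.StringJoiner;

class ByteArrayVsDecimalString {

    // Renders each byte as an unsigned decimal number (0-255), comma-separated.
    // This text format is an assumption made for the example.
    public static String toDecimalString(byte[] data) {
        StringJoiner joiner = new StringJoiner(",");
        for (byte b : data) {
            joiner.add(Integer.toString(b & 0xFF));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        byte[] data = {0, 7, 42, (byte) 200, (byte) 255};
        String text = toDecimalString(data);

        System.out.println("raw bytes:       " + data.length);
        System.out.println("text:            " + text);
        System.out.println("chars in text:   " + text.length());
        // The size on the wire or on disk depends on the encoding chosen.
        System.out.println("bytes as UTF-8:  " + text.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("bytes as UTF-16: " + text.getBytes(StandardCharsets.UTF_16BE).length);
    }
}
```

With this format, five raw bytes become fourteen characters, so the text is larger than the byte array under any common encoding, and larger still when each char takes two bytes.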
 
Al Hobbs
Ranch Hand

Paul Clapham wrote:It isn't clear what numbers you are talking about



The numbers would be the char representation of the byte.

I initially thought it would be much bigger if the byte array was converted into char representation.  It would be 2 - 4 times bigger.

I got confused because I don't know how bytes are actually stored or used.




Another example where I think about this: if I had an image file, converted it to a byte array, then to a string representation, and then wrote that to a text file, the text file would be bigger.
 
Marshal
What are you trying to do? Please explain that and you will get more suggestions.
The size of a text file will be different from the size of a String. It is not a good idea to try converting binary data, e.g. images, to text because some of the bytes will be converted to control characters; 0x0d will probably cause the most trouble.
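
A rough sketch of why treating binary data as text is risky: the bytes below are the 8-byte PNG file signature, decoded to a String and encoded back again. The class name BinaryAsTextRoundTrip and the helper roundTrip are made up for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryAsTextRoundTrip {

    // Decodes the bytes as text in the given charset, then encodes them back.
    public static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // PNG signature: contains 0x89 (not valid UTF-8 on its own) plus CR and LF.
        byte[] data = {(byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};

        byte[] viaUtf8 = roundTrip(data, StandardCharsets.UTF_8);
        byte[] viaLatin1 = roundTrip(data, StandardCharsets.ISO_8859_1);

        // Malformed input is silently replaced when decoding as UTF-8,
        // so the original bytes cannot be recovered.
        System.out.println("UTF-8 round trip intact:      " + Arrays.equals(data, viaUtf8));
        // ISO-8859-1 maps every byte to exactly one char, so this survives in memory,
        // but the CR/LF bytes may still be altered by tools that treat the data as text.
        System.out.println("ISO-8859-1 round trip intact: " + Arrays.equals(data, viaLatin1));
    }
}
```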
 
Al Hobbs
Ranch Hand
Hey,
I've seen people store files as hexadecimal text, and I thought that doing that would increase the needed space by at least a factor of 2. I was thinking about it and got confused about whether it would actually use more space or not.

I don't have any plans to do that. It was a purely theoretical question.

Thanks
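
For the hexadecimal case, a small sketch (the class HexSize and its methods toHex and fromHex are hypothetical names chosen for this example) shows that every byte becomes exactly two hex digits, so the text needs at least twice the space of the raw bytes, and four times if each character is stored in two bytes.

```java
import java.util.Arrays;

class HexSize {

    // Encodes each byte as two lowercase hex digits.
    public static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    // Decodes a string produced by toHex back into bytes (assumes even length, valid digits).
    public static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x0D, (byte) 0xFF};
        String hex = toHex(data);
        System.out.println(data.length + " bytes -> \"" + hex + "\" (" + hex.length() + " chars)");
        System.out.println("round trip intact: " + Arrays.equals(data, fromHex(hex)));
    }
}
```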
 
Campbell Ritchie
Marshal
Have a look at Joel Spolsky's article about encodings. You can encode text in hex (well, nowadays, two hex digits = two nybbles per byte), and there are several different encodings. The tradition in Java® was for a String to “hide” a char[], which represents encoding in the UTF‑16 format. Most text editors use UTF‑8 or various other well‑known formats, e.g. ISO‑8859‑1. The last time I tried any reflection on a Java® String object, I found the value field was of type byte[], so maybe it is now using UTF‑8. I think (not certain) that the encoding and type of array used differs depending on how many letters are > 0x007f (also called U+007F).
Whether you use more or less space depends largely on what sort of encoding you use and whether there is any compression. In the case of text, the ratio of number of letters ÷ number of bytes depends on the characters used, the language written, etc. I think you will find Spolsky makes suggestions about memory consumption, but shows no calculations.
Note what Spolsky says about UTF‑8 not introducing any very low value bytes into the encoding, so it doesn't introduce 0x00 or 0x04 or anything else that might cause difficulties. You can find 0x00 bytes in UTF‑16.
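
A short sketch of how the same text occupies different numbers of bytes under different encodings, and where zero bytes appear. The class EncodingSizes, its helpers and the sample strings are assumptions for illustration. As a side note, as far as I know JDK 9 and later (JEP 254, compact strings) back a String with a byte[] holding either Latin-1 or UTF-16 rather than UTF-8, but that is an internal detail and not what this code measures.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingSizes {

    // Number of bytes the text occupies when encoded in the given charset.
    public static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // True if any byte of the encoded form is 0x00.
    public static boolean containsZeroByte(String text, Charset charset) {
        for (byte b : text.getBytes(charset)) {
            if (b == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // ASCII only, one accented Latin letter, and three CJK characters.
        String[] samples = {"hello", "h\u00e9llo", "\u65e5\u672c\u8a9e"};
        for (String s : samples) {
            System.out.println(s.length() + " chars: UTF-8 = "
                    + encodedLength(s, StandardCharsets.UTF_8) + " bytes, UTF-16 = "
                    + encodedLength(s, StandardCharsets.UTF_16BE) + " bytes");
        }
        // UTF-16BE is used so that no byte order mark is added to the count.
        System.out.println("zero bytes in UTF-8 \"hello\":  "
                + containsZeroByte("hello", StandardCharsets.UTF_8));
        System.out.println("zero bytes in UTF-16 \"hello\": "
                + containsZeroByte("hello", StandardCharsets.UTF_16BE));
    }
}
```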
 
Saloon Keeper
A way to represent binary data as text is to encode it using base-64; maybe that is what you've seen. Java has a built-in class for that conversion if you want to play around with that.
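
A minimal sketch of playing around with that conversion, using java.util.Base64 from the standard library (available since Java 8). The class name Base64Size and the wrapper methods are made up for this example. Base64 turns every 3 input bytes into 4 output characters, so the encoded form is about a third larger than the raw data.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64Size {

    // Encodes binary data as Base64 text (standard alphabet, with padding).
    public static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decodes Base64 text back into the original bytes.
    public static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] data = new byte[300];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i; // arbitrary binary content, including 0x00 and 0x0D
        }
        String text = encode(data);
        System.out.println("raw bytes:         " + data.length);
        System.out.println("Base64 chars:      " + text.length());
        System.out.println("round trip intact: " + Arrays.equals(data, decode(text)));
    }
}
```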
 
Saloon Keeper

Tim Moores wrote:A way to represent binary data as text is to encode it using base-64; maybe that is what you've seen. Java has a built-in class for that conversion if you want to play around with that.



This is the basis of MIME (Multipurpose Internet Mail Extensions) encoding, and it's used to transmit binary data such as images and audio over email and web channels.

The MIME encoding isn't done to save space, however. It's done because in the early days of the Internet, the various nodes on the net were often very different types of computers. One might be an IBM mainframe using EBCDIC, another a CDC machine with 60-bit words (I may be off here, but it WASN'T a multiple of 8 bits). Still another might be a DECsystem using ASCII. Byte order might vary. IBM uses continuous byte storage. DEC (and later Intel) used "hopscotch" byte ordering. By keeping things in text form and using only a limited set of characters, the worst you had to deal with was code page translation as the data bounced its way between hosts.
 