Al Hobbs wrote:If I have an image file, then I convert it into a byte array, then I convert that byte array into a string containing the numbers, which one would be smaller in size?
You're proposing to convert each byte in the array to part of a string containing some numbers? It isn't clear what numbers you are talking about, but at any rate each char in a String requires two bytes to store. And no matter what conversion you had in mind, it's likely that each byte will convert into at least one char. I expect you can do the math from there.
To add on to that question, is there a difference in response size/speed if a byte array is in the response body or a string containing the numbers?
A byte array is most likely going to be the same size whether or not it's in the response body, but that depends on what kind of a response body you had in mind. However, your String object is going to be converted into a series of chars, or maybe a series of bytes, again depending on what kind of a response body you had in mind and how you put the String into it. As for speed, there are a lot of things which might affect that, including buffer sizes, packet sizes... depending on what happens to that response body.
Also, is the answer different if using hex?
Probably not, but again it depends on how your conversion works.
Paul Clapham wrote:It isn't clear what numbers you are talking about
The numbers would be the char representation of the byte.
I initially thought it would be much bigger if the byte array was converted into char representation. It would be 2 - 4 times bigger.
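To put rough numbers on that estimate, here is a minimal sketch. It assumes "char representation" means each byte written as an unsigned decimal number (0 to 255) with a comma between values; the class and method names are made up for illustration. Each byte then costs one to three digits plus a separator, which is where "2 - 4 times bigger" comes from if every char is stored in one byte.

```java
import java.util.StringJoiner;

class DecimalTextSize {
    // Hypothetical helper: writes each byte as an unsigned decimal number, comma-separated.
    static String toDecimalText(byte[] data) {
        StringJoiner joiner = new StringJoiner(",");
        for (byte b : data) {
            joiner.add(Integer.toString(b & 0xFF)); // mask so the byte reads as 0..255
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        byte[] data = {0, 7, 42, (byte) 200, (byte) 255};
        String text = toDecimalText(data);
        System.out.println(text); // prints 0,7,42,200,255
        System.out.println(data.length + " bytes -> " + text.length() + " chars"); // 5 bytes -> 14 chars
    }
}
```

How many bytes those 14 chars occupy afterwards depends on the encoding used when the text is written out.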
I got confused because I don't know how bytes are actually stored or used..?
Another example I think about: if I had an image file, converted it to a byte array, then to a string representation, and then wrote that to a text file, the text file would be bigger.
What are you trying to do? Please explain that and you will get more suggestions.
The size of a text file will be different from the size of a String. It is not a good idea to try converting binary data, e.g. images, to text because some of the bytes will be converted to control characters; 0x0d will probably cause the most trouble.
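A related problem shows up even before any file is written. The sketch below (class and method names are hypothetical) decodes arbitrary bytes as if they were UTF-8 text and encodes them again. Bytes that are not valid UTF-8 get replaced during decoding, so the original data cannot be recovered. The 0x0d byte itself survives this particular round trip; the trouble with it comes later, when something treats it as a line ending.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LossyRoundTrip {
    // Decodes the bytes as UTF-8 text, encodes the text again, and compares with the input.
    static boolean survivesUtf8RoundTrip(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8); // malformed input is replaced
        byte[] back = asText.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(data, back);
    }

    public static void main(String[] args) {
        byte[] ascii = {'J', 'a', 'v', 'a'};
        // Made-up sample of "binary" data; 0x89 and 0xFF are not valid UTF-8 on their own.
        byte[] binary = {(byte) 0x89, 'P', 'N', 'G', 0x0d, 0x0a, (byte) 0xFF};
        System.out.println(survivesUtf8RoundTrip(ascii));  // prints true
        System.out.println(survivesUtf8RoundTrip(binary)); // prints false
    }
}
```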
I've seen people store files as hexadecimal text and I thought that doing that would increase the needed space by a factor of at least 2. I was thinking about it and got confused about whether it would actually use more space or not.
I don't have any plans on doing that. It was a purely theoretical question.
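For the hex case the arithmetic is easy to check. This is only a sketch with a hand-written helper (the names are invented for illustration): every byte becomes exactly two hex digits, so the text has twice as many chars as the array has bytes, before counting how the chars themselves are encoded.

```java
class HexSize {
    // Hypothetical helper: two lowercase hex digits per byte, no separators.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16)); // high nybble
            sb.append(Character.forDigit(b & 0xF, 16));        // low nybble
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = {0x0d, (byte) 0xff, 0x00};
        String hex = toHex(data);
        System.out.println(hex); // prints 0dff00
        System.out.println(data.length + " bytes -> " + hex.length() + " chars"); // 3 bytes -> 6 chars
    }
}
```

Written to a file in an ASCII-compatible encoding, that is twice the original size; written as UTF-16, four times.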
Have a look at Joel Spolsky's article about encodings. You can encode text in hex (well, nowadays, two hex digits = two nybbles per byte), and there are several different encodings. The tradition in Java® was for a String to “hide” a char[], which holds the text in UTF‑16 format. Most text editors use UTF‑8 or various other well‑known formats, e.g. ISO‑8859‑1. The last time I tried any reflection on a Java® String object, I found the value field was of type byte[]. That is the “compact strings” change made in Java 9: the array holds one byte per char (ISO‑8859‑1) if every char is ≤ 0x00ff (also called U+00FF), and UTF‑16 otherwise; it doesn't use UTF‑8.
Whether you use more or less space depends largely on what sort of encoding you use and whether there is any compression. In the case of text, the ratio of number of letters ÷ number of bytes depends on the characters used, the language written, etc. I think you will find Spolsky makes suggestions about memory consumption, but shows no calculations.
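A few such calculations are easy to run. The sketch below (hypothetical class and method names, arbitrary sample strings) asks Java how many bytes the same text needs in UTF-8 and in UTF-16. The non-ASCII samples are written with Unicode escapes so the result does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

class EncodedSize {
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Length(String s) {
        // UTF_16BE is used so that no byte-order mark is added to the count.
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String[] samples = {
            "hello",                         // plain ASCII
            "h\u00e9llo",                    // one accented Latin letter
            "\u3053\u3093\u306b\u3061\u306f" // five Japanese hiragana
        };
        for (String s : samples) {
            System.out.println(s.length() + " chars: UTF-8 = " + utf8Length(s)
                    + " bytes, UTF-16 = " + utf16Length(s) + " bytes");
        }
    }
}
```

For these samples UTF-8 needs 5, 6 and 15 bytes and UTF-16 needs 10 bytes each time, so neither encoding is smaller for every language.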
Note what Spolsky says about UTF‑8 not introducing any very low value bytes into the encoding, so it doesn't introduce 0x00 or 0x04 or anything else that might cause difficulties. You can find 0x00 bytes in UTF‑16.
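That point about zero bytes can be seen directly. A small sketch, with an invented helper that simply counts 0x00 bytes in an encoded form of the text:

```java
import java.nio.charset.StandardCharsets;

class ZeroBytes {
    // Hypothetical helper: counts how many bytes in the array are 0x00.
    static int countZeroBytes(byte[] data) {
        int zeros = 0;
        for (byte b : data) {
            if (b == 0) {
                zeros++;
            }
        }
        return zeros;
    }

    public static void main(String[] args) {
        String text = "Java";
        // UTF-8 keeps each ASCII letter as a single non-zero byte.
        System.out.println(countZeroBytes(text.getBytes(StandardCharsets.UTF_8)));    // prints 0
        // UTF-16 uses two bytes per letter here, and the high byte of each is 0x00.
        System.out.println(countZeroBytes(text.getBytes(StandardCharsets.UTF_16BE))); // prints 4
    }
}
```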
Tim Moores wrote:A way to represent binary data as text is to encode it using base-64; maybe that is what you've seen. Java has a built-in class for that conversion if you want to play around with that.
This is the basis of MIME (Multipurpose Internet Mail Extensions) encoding and it's used to transmit binary data such as images and audio over email and web channels.
The MIME encoding isn't done to save space, however (base-64 makes the data about a third bigger). It's done because in the early days of the Internet, the various nodes on the net were often very different types of computers. One might be an IBM mainframe, using EBCDIC, another a CDC machine with 60-bit words (I may be off here, but it WASN'T a multiple of 8 bits). Still another might be a DECsystem using ASCII. Byte order might vary. IBM used contiguous (big-endian) byte storage. DEC (and later Intel) used "hopscotch" (little-endian) byte ordering. By keeping things in text form and using only a limited set of characters, the worst you had to deal with was code page translation as the data bounced its way between hosts.
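The built-in class mentioned above is presumably java.util.Base64 (available since Java 8). A minimal sketch for playing around with it, using a made-up 300-byte array in place of a real image: every 3 input bytes become 4 output characters, and decoding gives back exactly the original bytes.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64Size {
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Stand-in for binary data such as an image; the values are arbitrary.
        byte[] data = new byte[300];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        String encoded = encode(data);
        System.out.println(data.length + " bytes -> " + encoded.length() + " chars"); // 300 bytes -> 400 chars
        System.out.println(Arrays.equals(data, decode(encoded))); // prints true
    }
}
```

Unlike decoding arbitrary bytes as text, this round trip is lossless, at the price of the 4:3 growth.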