
Size of a String in Bytes

 
Benjamin Hundley
Ranch Hand
Posts: 54
I am currently building an XML file using Java and sending it to a web service that I know nothing about. One field is limited to 8000 bytes, and I have a 1749-character String in it. When I call getBytes("UTF-8").length it returns 1770 (well below the 8000-byte limit), but the program on the other side keeps telling me I have exceeded their byte limit. I am stumped. Am I measuring the bytes correctly? Could they be decoding using another character set?

This String is in a foreign language, so it probably has something to do with character sets. Can anyone suggest something I might try?
 
James Sabre
Ranch Hand
Posts: 781
Since all Unicode characters have a UTF-8 encoding, the fact that the String is in a foreign language is irrelevant: if your UTF-8 encoded String is 1770 bytes, then it is 1770 bytes.

In your position I would first get the other side to say exactly which character set they are expecting and make sure I used that character encoding. I would then record the bytes I send, get the other side to record the bytes they receive, and compare the two.
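
As a rough sketch of that check on the sending side, the snippet below prints the byte length of a String under a few candidate charsets and a hex dump of the encoded bytes for comparison with whatever the other side logs. The class name `EncodedSizes`, the helpers `sizes` and `hex`, the sample text, and the list of candidate charsets are all made up for illustration; the charset the web service actually expects is unknown here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodedSizes {

    // Byte length of the text under each candidate charset (hypothetical list).
    static Map<String, Integer> sizes(String text) {
        Charset[] candidates = {
            StandardCharsets.UTF_8,
            StandardCharsets.UTF_16BE, // no byte-order mark
            StandardCharsets.UTF_16,   // Java's encoder prepends a 2-byte byte-order mark
        };
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Charset cs : candidates) {
            result.put(cs.name(), text.getBytes(cs).length);
        }
        return result;
    }

    // Space-separated lowercase hex dump, for comparing with the receiver's bytes.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String sample = "h\u00e9llo"; // stand-in for the real field value
        System.out.println(sizes(sample));
        System.out.println(hex(sample.getBytes(StandardCharsets.UTF_8)));
    }
}
```

If the sizes differ a lot between charsets, that at least shows how far the count can move when the two sides disagree about the encoding.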
 
Rob Spoor
Sheriff
Posts: 20893
Still, a String of 1749 chars should never exceed 5247 bytes in UTF-8, as a single Java char never encodes to more than three bytes there (a supplementary character takes four bytes, but it occupies two chars in the String). Even that worst case is well below the 8000-byte limit.
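
The upper bound can be computed rather than assumed. This is only a sketch: the class `WorstCaseSize` and its two helpers are hypothetical names, and it relies on the standard `CharsetEncoder.maxBytesPerChar()` method to ask the JDK for the worst case per char of a given charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WorstCaseSize {

    // Upper bound on the encoded size of a String with charCount chars.
    static long worstCaseBytes(int charCount, Charset charset) {
        float perChar = charset.newEncoder().maxBytesPerChar();
        return (long) Math.ceil((double) charCount * perChar);
    }

    // Actual UTF-8 size of a concrete String.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Worst case for the 1749-char String from the question.
        System.out.println(worstCaseBytes(1749, StandardCharsets.UTF_8));

        // One char in the Basic Multilingual Plane (the euro sign).
        System.out.println(utf8Length("\u20ac"));

        // One supplementary character, stored as two chars (a surrogate pair).
        System.out.println(utf8Length("\uD83D\uDE00"));
    }
}
```

The bound is deliberately pessimistic; the real size for a given String is whatever `getBytes` reports.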
 