
Encoding question

 
Christian Wolf
Greenhorn
Posts: 28
Hello.
In my own application, I need to send user input to the server. The application is intended for the Japanese market, so I want to send data encoded in SJIS. I started with a pure English application, and sending POST data with the application/x-www-form-urlencoded content type worked fine. But with SJIS-encoded strings, the request body no longer looks right:
input=��� �&id=500
So my question: does this work at all with that content type, or do I have to use multipart/form-data, where I can specify an encoding for each part? At least the server application seems to recognize the separate parameters in the example body string above. But I fear that an SJIS-encoded string could contain a & that is then treated as a delimiter.
At the moment I do not use an OutputStreamWriter, which could take care of the encoding; I just write the strings from the TextFields into the output stream.
Christian Wolf
 
Christian Wolf
Greenhorn
Posts: 28
Hello.
The issue I mentioned in my post above has been solved. No problem anymore.
Short explanation: the SJIS (Shift-JIS) charset I am using includes 7-bit ASCII as a subset. In this encoding, the first byte of a 2-byte character is never a valid 1-byte ASCII character, and the second byte is always 0x40 or higher, so & (0x26) can never appear inside a 2-byte character. So the problem with the & as delimiter is actually no problem at all.
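The reasoning about & can be checked for a concrete string by encoding it and scanning the resulting bytes. The following is only a sketch in desktop Java (not MIDP), assuming the runtime provides a charset registered as "Shift_JIS"; the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;

class SjisDelimiterCheck {
    // Assumption: the runtime ships a charset named "Shift_JIS".
    static final Charset SJIS = Charset.forName("Shift_JIS");

    // Returns true if the Shift_JIS encoding of text contains the given byte value.
    static boolean encodedContains(String text, int byteValue) {
        for (byte b : text.getBytes(SJIS)) {
            if ((b & 0xFF) == byteValue) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Katakana "so", "fu", "to"; "so" encodes to the bytes 0x83 0x5C.
        String sample = "\u30BD\u30D5\u30C8";
        System.out.println("contains 0x26 ('&'): " + encodedContains(sample, 0x26));
        System.out.println("contains 0x5C ('\\'): " + encodedContains(sample, 0x5C));
    }
}
```

The second check is included as a caution: some ASCII byte values from 0x40 upward, such as the backslash 0x5C, do occur as the second byte of a 2-byte character, so the argument holds for & but not for every ASCII character.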
Then I was able to send data to the server, but the received data looked strange. Finally I found out why. People adapt examples for their own purposes, and so did I. In Sun WTK 2.0, the demo MIDlets contain code for an HTTP connection that appends each byte read from the stream to a StringBuffer. My mistake was that I simply converted this StringBuffer to a String by calling StringBuffer.toString() and then wondered why I did not see the Japanese characters that I had entered in the MIDlet. By looking at verbose debugging output I realized that two single bytes do not automatically merge into one two-byte character.
The solution is to convert the StringBuffer containing the byte representation of the string into a plain byte array (first exporting the StringBuffer to a char array, then copying the char[] to a byte[]). This byte array can be used as the argument for new String(byte[]), which automatically uses the default encoding, SJIS in my case. The result is a new string containing all the characters that I entered. This is kind of expensive, but it seems to be necessary.
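The difference between the two approaches can be sketched as follows. This is not the WTK demo code, only an illustration of the kind of loop described above. It collects the bytes in a ByteArrayOutputStream instead of copying through a char array, and it names the encoding explicitly ("Shift_JIS", assumed to be available) rather than relying on the platform default. All class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ByteDecodeSketch {
    // The pitfall: each byte becomes its own char, so the two bytes
    // of a double-byte character are never merged.
    static String readBytePerChar(InputStream in) throws IOException {
        StringBuffer sb = new StringBuffer();
        int ch;
        while ((ch = in.read()) != -1) {
            sb.append((char) ch);
        }
        return sb.toString();
    }

    // Collect the raw bytes first, then decode the whole array in one step.
    static String readAndDecode(InputStream in, String encoding) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            buf.write(b);
        }
        return new String(buf.toByteArray(), encoding);
    }

    public static void main(String[] args) throws IOException {
        // Three kanji ("nihongo"), encoded as the server might send them.
        byte[] wire = "\u65E5\u672C\u8A9E".getBytes("Shift_JIS");
        String wrong = readBytePerChar(new ByteArrayInputStream(wire));
        String right = readAndDecode(new ByteArrayInputStream(wire), "Shift_JIS");
        System.out.println("byte-per-char length: " + wrong.length());
        System.out.println("decoded length: " + right.length());
    }
}
```

With the three-character sample, the first method yields one char per byte on the wire, while the second yields the original three characters.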
Christian Wolf
 
Christian Wolf
Greenhorn
Posts: 28
Stupido. Why do it the easy way when the complicated way works, too?
Just use InputStreamReader and OutputStreamWriter. They take care of the encoding. You write 2-byte characters and you get 2-byte characters. That's it.
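A minimal sketch of that approach, with in-memory streams standing in for the streams of the HTTP connection. The encoding name "Shift_JIS" is an assumption about what the runtime supports, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

class ReaderWriterSketch {
    // Assumption: the runtime supports an encoding with this name.
    static final String ENCODING = "Shift_JIS";

    // Writes characters through an OutputStreamWriter, which encodes them to bytes.
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for the connection's output stream
        Writer writer = new OutputStreamWriter(sink, ENCODING);
        writer.write(text);
        writer.close(); // closing flushes the encoder
        return sink.toByteArray();
    }

    // Reads characters through an InputStreamReader, which decodes the bytes.
    static String decode(byte[] data) throws IOException {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(data), ENCODING);
        StringBuffer sb = new StringBuffer();
        int ch;
        while ((ch = reader.read()) != -1) {
            sb.append((char) ch); // ch is already a decoded character here, not a raw byte
        }
        reader.close();
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String input = "\u65E5\u672C\u8A9E";
        byte[] wire = encode(input);
        System.out.println("bytes on the wire: " + wire.length);
        System.out.println("round trip ok: " + input.equals(decode(wire)));
    }
}
```

The read loop looks almost the same as a byte-by-byte loop, but because it reads from a Reader, each value appended to the StringBuffer is a complete character.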
Christian Stupido
 
Michael Yuan
author
Ranch Hand
Posts: 1427
Very good. I do not deal with non-English languages on cell phones very often -- none of my phones actually supports anything other than English. But I think your experience is very valuable, and thank you for sharing it with us.
 