Thanks Benoît.
My code is deployed on a Unix machine. The end users currently aren't Japanese; they are Windows users (with non-kanji keyboards), and the application serves them pages in UTF-8 (using response.setContentType).
So to test incoming requests, I am just copying and pasting kanji characters into the input boxes (playing the end user), and in the servlet
new String(request.getParameter("SomeName").getBytes("8859_1"), "UTF8");
is working perfectly.
But this is still a grey area for me. This line:
new String(request.getParameter("SomeName").getBytes("8859_1"), "UTF8");
is, I assume, used to "undo" the improper default conversion that occurs in getParameter(): calling getBytes with 8859_1 supposedly converts the Unicode string back into bytes, and those bytes are then re-interpreted correctly, in this example as UTF-8. But I read somewhere that if your incoming request data really is UTF-8, then there may be octets whose values are in the invalid range for 8859-1 or CP1252, and data may be lost.
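A minimal sketch of the round trip described above, outside any servlet container. It assumes the container decodes the raw request bytes as ISO-8859-1 when no request encoding is declared; the class and method names (RoundTripDemo, misdecode, repair) are made up for illustration. In Java's ISO-8859-1 charset every byte value 0-255 maps to exactly one char, so the bytes come back unchanged. The same is not guaranteed for CP1252.

```java
import java.nio.charset.StandardCharsets;

public class RoundTripDemo {
    // Assumption: mimics a container with no declared request encoding,
    // which decodes the raw UTF-8 bytes as ISO-8859-1 (one char per byte).
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // The workaround from the post: recover the original bytes via
    // ISO-8859-1, then decode them as UTF-8.
    static String repair(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String kanji = "\u6F22\u5B57"; // two kanji, three UTF-8 bytes each
        String garbled = misdecode(kanji);
        System.out.println(garbled.length());              // one char per byte
        System.out.println(repair(garbled).equals(kanji)); // round trip intact?
    }
}
```

If this holds for the container in use, it would explain why the getBytes("8859_1") trick loses nothing even though the request data really is UTF-8.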
Now, each of my pages (I am the end user here) comes in UTF-8 encoding (because I set that intentionally in all my servlets, running on the Unix machine), and if I paste kanji, it gets stored and saved properly (for sure). Why and how?
a) Doesn't that mean my request is UTF-8 encoded? Then why is the failure not occurring?
b) If I change my browser encoding to something else, my data doesn't get stored properly, which means there is some UTF-8 dependency, BUT
c) when I call request.getCharacterEncoding() in the servlet it always prints "null", i.e. it is not UTF-8; maybe that's why it is working fine.
Now these three points a, b, and c contradict each other.
Can anyone clear this up for me?
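One possible reading of point (c), sketched under assumptions: getCharacterEncoding() returns null when the request does not declare a charset, which says nothing about what the bytes actually are. Browsers commonly submit form data in the page's encoding without adding a charset parameter to the Content-Type header. The helper below (CharsetHeaderDemo.charsetOf, a hypothetical name, not part of the servlet API) only imitates that lookup on a header string.

```java
import java.util.Locale;

public class CharsetHeaderDemo {
    // Hypothetical helper: returns the charset parameter of a Content-Type
    // header value, or null when none is declared.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Typical form post: no charset declared, so the lookup yields null.
        System.out.println(charsetOf("application/x-www-form-urlencoded"));
        // Charset declared explicitly.
        System.out.println(charsetOf("application/x-www-form-urlencoded; charset=UTF-8"));
    }
}
```

Under that reading, (a), (b) and (c) fit together: the bytes are UTF-8, but nothing in the request says so. With Servlet 2.3 or later, calling request.setCharacterEncoding("UTF-8") before the first getParameter() call may remove the need for the getBytes workaround.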
Thanks
-Varun.
[ December 20, 2002: Message edited by: varun Khanna ]