
Problem converting Unicode characters from native to Java

 
Oren Gross
Greenhorn
Posts: 12
Hi all,
I have JNI code written in Objective-C (though I believe I would have the same problem in a C/C++ program) that returns a jstring to the Java code. I am using the char* to jstring conversion described at http://developer.apple.com/library/mac/#technotes/tn2005/tn2147.html in the "Creating Java Strings From Native Strings" section. My problem is that the strings are correct on the native side (i.e. the native string prints correctly to standard output), but when they are passed to Java and sent to standard output I get '?' for the non-English characters.

Thanks
 
Paul Clapham
Sheriff
Posts: 21876
So there are two steps:

1. Convert from native to Java.

2. Send to standard output.

I would suggest testing them one at a time. Right now you're testing the pair together, which means you don't know where to look for problems.
 
Oren Gross
Greenhorn
Posts: 12
But I do know. I tested the output of the NSString, the char* and the jstring on the native side, and all of them print correctly. It is only when I print on the Java side that I get the question marks.
 
Paul Clapham
Sheriff
Posts: 21876
Ah, I see. The title of your post suggested that you had already decided step 1 was the problem, which is why I pointed out the "two steps" issue. Now that you have confirmed the problem is in step 2 rather than step 1, we can address it.

Unfortunately the best way to solve this problem is to stop using the console, since in many environments it isn't really designed for proper Unicode support. It doesn't help that in Java, until recently, you had to use System.out to write to the console, and System.out is a PrintStream, about which the documentation says:

All characters printed by a PrintStream are converted into bytes using the platform's default character encoding.
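
To make that sentence concrete, here is a minimal sketch (not code from this thread) that pushes the same text through a PrintStream built with two different encodings and captures the bytes. The class and method names are made up for illustration, and the Hebrew sample text is just an assumed stand-in for "non-English characters". With an encoding that cannot represent the characters, each one is replaced by a '?' byte, which matches the symptom described above.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical demo class; the names here are illustrative only.
final class PrintStreamEncodingDemo {

    // Prints the text through a PrintStream that uses the named encoding
    // and returns the raw bytes that reached the underlying stream.
    static byte[] encodeViaPrintStream(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(sink, true, charsetName);
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // Four Hebrew letters written as escapes, so the source file's own
        // encoding cannot interfere with the experiment.
        String text = "\u05e9\u05dc\u05d5\u05dd";

        byte[] ascii = encodeViaPrintStream(text, "US-ASCII");
        byte[] utf8 = encodeViaPrintStream(text, "UTF-8");

        // US-ASCII cannot represent these letters, so each becomes one '?' byte.
        System.out.println("US-ASCII byte count: " + ascii.length);
        // UTF-8 encodes each of these letters as two bytes.
        System.out.println("UTF-8 byte count: " + utf8.length);
    }
}
```

If the terminal itself expects UTF-8, wrapping System.out in `new PrintStream(System.out, true, "UTF-8")` is one way to avoid the platform default, but whether the characters then display correctly still depends on the terminal.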


However, in Java 6 there's the java.io.Console class, which gives you access to a PrintWriter for the console. This might work better for you, unless you were already using it.
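
A minimal sketch of how that might look, assuming Java 7 or later for StandardCharsets (on Java 6 you would pass the charset name "UTF-8" as a string instead). System.console() returns null when no console is attached, for example when output is redirected, so the sketch falls back to a UTF-8 PrintWriter over System.out. That fallback and the class name are my own assumptions, not something the Console API prescribes.

```java
import java.io.Console;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name and the fallback policy are illustrative.
final class ConsoleWriterDemo {

    // Returns the console's PrintWriter when a console is attached,
    // otherwise an auto-flushing UTF-8 PrintWriter over System.out.
    static PrintWriter consoleOrUtf8Writer() {
        Console console = System.console(); // may be null
        if (console != null) {
            return console.writer();
        }
        return new PrintWriter(
                new OutputStreamWriter(System.out, StandardCharsets.UTF_8), true);
    }

    public static void main(String[] args) {
        PrintWriter out = consoleOrUtf8Writer();
        // Sample non-English text written as escapes.
        out.println("\u05e9\u05dc\u05d5\u05dd");
        out.flush();
    }
}
```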
 
Oren Gross
Greenhorn
Posts: 12
Using Console doesn't help either (I was using System.out.print before). Here is what I do on the native side:


Here is the conversion:


and on the Java side:
 
Paul Clapham
Sheriff
Posts: 21876
But you've already confirmed that the native code isn't the problem, right? That would include inspecting the result in your pure Java code using something other than the console, to make sure that no character encoding or decoding is happening under the covers and that the Unicode characters in the source are the same Unicode characters that arrive in your Java code.

After that, if the console doesn't support those characters, there's nothing you can do about that.
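
One way to do that check without trusting the console is to print the code points of the string as plain ASCII hex. This is only a sketch with made-up names. If the dump shows the expected code points, the string arrived intact and the loss happens on output. If it shows U+003F, the characters were already replaced by '?' before Java printed anything.

```java
// Hypothetical diagnostic helper; the class and method names are illustrative.
final class CodePointDump {

    // Renders each Unicode code point of the string as U+XXXX, separated
    // by spaces. The result is pure ASCII, so any console can display it.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // advance past surrogate pairs correctly
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In real use, pass the String returned by the native method instead.
        System.out.println(dump("h\u00e9")); // prints "U+0068 U+00E9"
    }
}
```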
 
James Sabre
Ranch Hand
Posts: 781
A '?' in place of the desired character normally means there is a conversion problem, and I think this would be best handled by returning a byte array rather than a String. That way you can perform the bytes-to-String conversion in Java using the correct encoding and not rely on any default encoding.
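
A rough sketch of the Java half of that idea, assuming the native side hands back UTF-8 bytes. The method fetchNameBytes() is a hypothetical stand-in for the real native method (which would be declared `native` and return a jbyteArray from the JNI code), and StandardCharsets requires Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example class; the names are illustrative only.
final class NativeBytesDecoder {

    // Stand-in for a native method that returns UTF-8 encoded bytes.
    // Here it simply returns the UTF-8 bytes of four Hebrew letters.
    static byte[] fetchNameBytes() {
        return new byte[] {
            (byte) 0xD7, (byte) 0xA9, // U+05E9
            (byte) 0xD7, (byte) 0x9C, // U+05DC
            (byte) 0xD7, (byte) 0x95, // U+05D5
            (byte) 0xD7, (byte) 0x9D  // U+05DD
        };
    }

    // Decodes with an explicitly named charset, so the platform default
    // encoding plays no part in the conversion.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = decodeUtf8(fetchNameBytes());
        // Report the length rather than the text, to stay console-independent.
        System.out.println("Decoded chars: " + name.length());
    }
}
```

The design point is that the encoding is named on both sides of the boundary, so neither the JNI string functions nor the platform default gets a chance to substitute characters.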
 