
Problem converting C/C++ unsigned char to JAVA

 
Ravi Kumar
Greenhorn
Posts: 16
Hi all,

I searched the forum for an answer but could not find one.
Any help is greatly appreciated.

The problem is with unsigned char.
I am reading a PPM image file which has data in ASCII.

For a character, eg. '†' ,
In JAVA, after reading it as char and typecasting into int its value is 8224.
In C/C++, after reading it as an unsigned char and typecasting into int its value is 160.

How should I read it in Java so as to get the value 160?

The following C++ code


Thank you.
 
Rob Spoor
Sheriff
Pie
Posts: 20605
60
Chrome Eclipse IDE Java Windows
Ravi Kumar wrote:In JAVA, after reading it as char and typecasting into int its value is 8224.

There's your #1 mistake. A C char is only 1 byte in size, so you would need to use a Java byte for that. That would return (byte)160, which is actually the -96 you saw before. To turn that into a char you can add 256 to the negative byte (or mask it with 0xFF). The resulting char c will then hold the value 160.
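A minimal sketch of that idea, assuming the image data is read through an InputStream rather than a Reader. The class and method names are made up for illustration, and a one-byte array stands in for the file. Masking with 0xFF gives the same result as adding 256 to a negative byte and leaves non-negative bytes unchanged.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class UnsignedByteDemo {
    // Converts a signed Java byte (-128..127) to the 0..255 value
    // that a C unsigned char with the same bits would hold.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the file contents: a single raw byte 0xA0.
        try (InputStream in = new ByteArrayInputStream(new byte[] {(byte) 0xA0})) {
            int raw = in.read();       // read() returns 0..255, or -1 at end of stream
            byte asByte = (byte) raw;  // stored in a byte, the same bits read as -96
            // prints "160 -96 160"
            System.out.println(raw + " " + asByte + " " + toUnsigned(asByte));
        }
    }
}
```

Note that InputStream.read() already returns the unsigned value, so the mask is only needed when the data sits in a byte or byte[].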
 
Ravi Kumar
Greenhorn
Posts: 16
Thank you Rob for the prompt reply.

Plain typecasting works in Java for most of the characters.
The following code in Java

The following are some exceptions:
8224 †
8226 •
8800 ≠
8482 ™
8710 ∆
8211 –
8221 ”
8216 ‘
9674 ◊
8260 ⁄
8249 ‹
8249 ‹
8734 ∞
8747 ∫
8364 €
8730 √
8804 ≤

The following are some good ones:
94 ^
102 f
112 p
119 w
126 ~
196 Ä
122 z
197 Å
197 Å

Please suggest any help.
Thank you.
 
Campbell Ritchie
Sheriff
Pie
Posts: 49733
69
Ravi Kumar wrote: . . . data in ASCII.

For a character, eg. '†' . . .
That's not ASCII at all. ASCII only goes as far as 0x7f = 127.
The † character is 8224, or rather 0x2020, since Unicode code points are usually written in hexadecimal. You would find the casting much easier to understand in hex, so try the %x and %c format specifiers with the printf method. Note you will have to cast to a char to use %c.
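A small sketch of what that inspection could look like. HexCharDemo and describe are hypothetical names, not anything from the thread. The same format string works with System.out.printf.

```java
public class HexCharDemo {
    // Formats a char as "<hex value> <character>" so the numeric value
    // behind a character that was read can be inspected.
    static String describe(char c) {
        // %x needs an integral argument, so the char is cast to int;
        // %c is given the char itself.
        return String.format("%x %c", (int) c, c);
    }

    public static void main(String[] args) {
        System.out.println(describe('\u2020')); // the dagger, decimal 8224
        System.out.println(describe('A'));
    }
}
```

Whether the dagger itself displays correctly depends on the console's encoding, but the hex value in front of it does not.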

Anyway, this is much too difficult a question for "beginning Java", so I shall move it.
 
Ravi Kumar
Greenhorn
Posts: 16
Hi Campbell,
Thank you for the response.
I was wrong initially. The data is ASCII/Extended ASCII, since the values range from 0 to 255.

In Java, how would I get the value 160 out of the char †?
 
Rob Spoor
Sheriff
Pie
Posts: 20605
60
Chrome Eclipse IDE Java Windows
Cast it to int. In Java, char is nothing more than an unsigned 16-bit number with special support when printing to the screen.
 
Ravi Kumar
Greenhorn
Posts: 16
Thank you Rob.
That is where I hit the problem. When I simply cast the char †, I get a value of 8224.

I got some help and found a solution.
I think this might help some others.

In Java the default encoding is Unicode, in which the symbol † has the value 8224.
In my case I need 160, so I need to find an appropriate charset.

I wrote a small method to find the charset, and got a hit for the byte value -96 (256 - 96 = 160).
Below is the function


The output I got is

Found: MacRoman
Found: x-MacCentralEurope
Found: x-MacCroatian
Found: x-MacCyrillic
Found: x-MacGreek
Found: x-MacRomania
Found: x-MacTurkish
Found: x-MacUkraine
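A search like the one described could look roughly like the sketch below. It is not the original poster's function: CharsetProbe and encodesToByte are made-up names, and the approach (encode the character with every installed charset and keep those that produce the single byte 0xA0) is an assumption about how such a search might be done. Which charsets get reported depends on what the JVM has installed.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;

public class CharsetProbe {
    // Returns true if the charset encodes c as exactly one byte
    // whose unsigned value equals expected (0..255).
    static boolean encodesToByte(Charset cs, char c, int expected) {
        if (!cs.canEncode()) {
            return false; // some charsets are decode-only
        }
        try {
            // A fresh encoder reports unmappable characters by throwing.
            ByteBuffer out = cs.newEncoder().encode(CharBuffer.wrap(new char[] {c}));
            return out.remaining() == 1 && (out.get() & 0xFF) == expected;
        } catch (CharacterCodingException | RuntimeException e) {
            return false; // the character cannot be encoded in this charset
        }
    }

    public static void main(String[] args) {
        // Look for charsets in which the dagger (U+2020) is the byte 0xA0 = 160.
        for (Charset cs : Charset.availableCharsets().values()) {
            if (encodesToByte(cs, '\u2020', 0xA0)) {
                System.out.println("Found: " + cs.name());
            }
        }
    }
}
```

Once a matching charset is known, the file can either be decoded with it or, more simply for binary pixel data, read as raw bytes without any charset at all.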

Thanks everyone for your support.
 
Campbell Ritchie
Sheriff
Pie
Posts: 49733
69
Close examination of the hex value of † (0x2020) shows that it doesn't have 0xa0 (= 160 decimal) anywhere in it. So I can't see how you expect to get 160 out of it. Most likely you are dealing with an encoding; maybe in UTF-8 there is 0xa0 in it somewhere. Look at this Joel Spolsky article for more about UTF-8.
 
Campbell Ritchie
Sheriff
Pie
Posts: 49733
69
It would appear there is in fact a 160 = 0xa0 in †. Try the program yourself; it is really easy to run . . . but it only works in FORTH
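For readers without a FORTH system, a rough Java version of the same kind of check is sketched below. It assumes UTF-8 is the encoding being examined, as suggested earlier in the thread, and Utf8BytesDemo and utf8Hex are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

public class Utf8BytesDemo {
    // Returns the UTF-8 encoding of s as space-separated two-digit hex bytes.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF)); // mask to avoid sign extension
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The dagger U+2020 takes three bytes in UTF-8; the last one is 0xa0.
        System.out.println(utf8Hex("\u2020")); // prints "e2 80 a0"
    }
}
```

So 0xa0 shows up as the third byte of the UTF-8 form, whereas in MacRoman the dagger is the single byte 0xa0.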
 
Campbell Ritchie
Sheriff
Pie
Posts: 49733
69
Oh, I forgot: you need the head word too
 
Ravi Kumar
Greenhorn
Posts: 16
Thank you Campbell, for your time and support.
 
Campbell Ritchie
Sheriff
Pie
Posts: 49733
69
You're welcome

And I presume you have worked out which encoding to use from the figures I showed earlier?
 