Byte code rep. of chars in string not 16-bit unicode

 
Greenhorn
Posts: 1
Hi,

The char type in Java is a 16-bit Unicode type. Many Java references (for example SCJP&Dev 2, p. 352, bottom line) state that Java Strings are composed of 16-bit Unicode characters.

Fine so far.

But the StringQuestion.class file stores the string literal "Hello" (without the quotes) as 8-bit characters. Why not as 16-bit characters, i.e. with an ASCII NUL (^0) before each one, like ^0H^0e^0l^0l^0o?

To make sure this is not just an artifact of how vi displays the file, I ran an "xxd" hexdump on the class file. Its output contains the following line:



Even changing the explicit string literal "Hello" to "H\u0065llo" doesn't make a difference.

Thanks for your answers,
Marco



Note: this code does compile, but you can't run it, of course...
[ September 26, 2004: Message edited by: Marco Loskamp ]
 
author and iconoclast
Posts: 24207
Hi,

Welcome to JavaRanch!

The class file format (which is quite well documented) stores string constants in a modified form of UTF-8, not as 16-bit chars. For ASCII characters other than NUL, the encoding is identical to ASCII; NUL and the characters up to \u07FF take two bytes, and the rest of the 16-bit range takes three. Saves a lot of space in the U.S.A., anyway.
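
If you want to see the encoding without digging through a class file, here is a minimal sketch. It relies on DataOutputStream.writeUTF, which writes a 2-byte length followed by the JDK's modified UTF-8; as far as I know that is the same layout as a string constant entry in the class file, minus the leading tag byte. The class name ModifiedUtf8Demo and the helpers encode and hex are made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of the original question.
class ModifiedUtf8Demo {

    // Encodes s the way writeUTF does: a 2-byte length, then modified UTF-8 bytes.
    static byte[] encode(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Formats bytes as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // ASCII characters take one byte each.
        System.out.println(hex(encode("Hello")));       // 00 05 48 65 6c 6c 6f
        // The compiler turns the escape into a plain 'e', so the bytes are the same.
        System.out.println(hex(encode("H\u0065llo")));  // 00 05 48 65 6c 6c 6f
        // U+00E9 needs two bytes, U+20AC needs three.
        System.out.println(hex(encode("\u00e9")));      // 00 02 c3 a9
        System.out.println(hex(encode("\u20ac")));      // 00 03 e2 82 ac
    }
}
```

The second line of output is why changing "Hello" to "H\u0065llo" made no difference: Unicode escapes are translated before the compiler ever sees the string literal, so both sources yield the same constant.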
 