writeUTF() and writeBytes()

 
Ranch Hand
Posts: 128
In the DataOutput interface, there are writeUTF(String str) and writeBytes(String s) methods.

There is a sentence in instructions.html,

All text values, and all fields (which are text only), contain only 8 bit characters, null terminated if less than the maximum length for the field. The character encoding is 8 bit US ASCII.


According to the instructions, which one is more appropriate for writing a String to the file?

It looks simple, but I have been confused for a long time. Please comment and clarify!
 
Ranch Hand
Posts: 531

Originally posted by Richard Jackson:
In the DataOutput interface, there are writeUTF(String str) and writeBytes(String s) methods.

There is a sentence in instructions.html,

According to the instructions, which one is more appropriate for writing a String to the file?

It looks simple, but I have been confused for a long time. Please comment and clarify!



It's writeBytes. Converting from an ASCII String is thus:

byte[] bytes = "".getBytes("ASCII");

To a String:

byte[] bytes = new byte[recordLength];
raf.readFully(bytes);

String string = new String(bytes, "ASCII");
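
For the write side, here is a minimal sketch of encoding a value into a fixed-length field. The class name FieldEncoder, the method encodeField, and the choice to fill the whole remainder of the field with zero bytes are assumptions for illustration; the instructions quoted above only say that a short value is null terminated.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FieldEncoder {

    // Hypothetical helper: encodes value as US-ASCII into exactly fieldLength bytes.
    // Assumption: unused trailing bytes are left as zero (null) bytes.
    static byte[] encodeField(String value, int fieldLength) {
        byte[] encoded = value.getBytes(StandardCharsets.US_ASCII);
        if (encoded.length > fieldLength) {
            throw new IllegalArgumentException("value longer than field: " + value);
        }
        // Arrays.copyOf pads the new array with zero bytes.
        return Arrays.copyOf(encoded, fieldLength);
    }

    public static void main(String[] args) {
        // prints [72, 105, 0, 0]
        System.out.println(Arrays.toString(encodeField("Hi", 4)));
    }
}
```

The resulting array could then be passed to RandomAccessFile.write(byte[]), which writes the bytes unchanged.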
 
Ranch Hand
Posts: 1392
Here is a test to compare writeBytes and writeUTF.

0 14 82 105 99 104 97 114 100 74 97 99 107 115 111 110
82 105 99 104 97 114 100 74 97 99 107 115 111 110

0 6 -50 -79 -50 -78 -50 -77
-79 -78 -77

writeUTF starts with two bytes for the length of the following data.
writeBytes discards the high-order 8 bits of a 16-bit Unicode character.
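
The test program itself did not survive in this thread. The following is a sketch that should reproduce the four output lines above, assuming the two inputs were "RichardJackson" and the Greek letters alpha, beta, gamma; the class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class WriteCompare {

    // Returns the bytes that DataOutputStream.writeUTF produces for s.
    static byte[] viaWriteUTF(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Returns the bytes that DataOutputStream.writeBytes produces for s.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeBytes(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // Assumed inputs: a plain ASCII name and three Greek letters.
        for (String s : new String[] {"RichardJackson", "\u03b1\u03b2\u03b3"}) {
            System.out.println(Arrays.toString(viaWriteUTF(s)));
            System.out.println(Arrays.toString(viaWriteBytes(s)));
        }
    }
}
```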

writeUTF also inserts extra bits.
Greek alpha is \u03b1
Display as binary 0000 0011 1011 0001
Another view 00000 01110 110001
UTF-8 drops some 0's and inserts 110 and 10: 110 01110 10 110001
Another view 11001110 10110001
Display as decimal is -50 -79
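
The bit shuffling above can be written out as a short sketch. It covers only the two-byte range U+0080 to U+07FF, which is where Greek alpha falls; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8TwoByte {

    // Encodes a character in the range U+0080..U+07FF as two UTF-8 bytes.
    static byte[] encode(char c) {
        if (c < 0x80 || c > 0x7FF) {
            throw new IllegalArgumentException("not a two-byte UTF-8 character");
        }
        byte first = (byte) (0xC0 | (c >> 6));    // 110xxxxx: the upper 5 of the 11 significant bits
        byte second = (byte) (0x80 | (c & 0x3F)); // 10xxxxxx: the lower 6 bits
        return new byte[] {first, second};
    }

    public static void main(String[] args) {
        // prints [-50, -79]
        System.out.println(Arrays.toString(encode('\u03b1')));
        // The library encoder gives the same two bytes.
        System.out.println(Arrays.toString("\u03b1".getBytes(StandardCharsets.UTF_8)));
    }
}
```

writeUTF actually uses a modified UTF-8, but for characters in this range it matches standard UTF-8.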

I guess we don't want to use writeUTF.
[ August 27, 2004: Message edited by: Marlene Miller ]
 
Marlene Miller
Ranch Hand
Posts: 1392
I've been using

byte[] b = ...
String s = new String(b);

String s = ...
byte[] b = s.getBytes();

The String class converts Unicode characters to bytes using the platform's default character set. One could use a different character set by adding another parameter.

I guess I like the idea of converting from one character set (Unicode) to another, rather than dropping the high-order 8 bits.
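
As a sketch of the "another parameter" variant: the StandardCharsets constants used here were added in Java 7, after this thread was written, and the class name is hypothetical. With an explicit Charset, a character that has no US-ASCII encoding is replaced by '?' (byte 63) instead of being truncated.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ExplicitCharset {

    // Encodes with a fixed charset instead of the platform default.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // Decodes with the same fixed charset.
    static String decode(byte[] b) {
        return new String(b, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // ASCII text survives the round trip unchanged.
        System.out.println(decode(encode("Richard"))); // prints Richard
        // Greek alpha has no US-ASCII encoding, so it becomes '?'.
        System.out.println(Arrays.toString(encode("\u03b1"))); // prints [63]
    }
}
```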
[ August 27, 2004: Message edited by: Marlene Miller ]
 
Richard Jackson
Ranch Hand
Posts: 128
Thanks all of you.

I just read the posts after I got through my two weekend days.


One constructor of the String class takes two arguments; the second is charsetName.
According to the Charset API, we can write it as "US-ASCII" or "UTF-8".

Which one is right in this code? Please keep the comments coming.
[ August 30, 2004: Message edited by: Richard Jackson ]
 
Anton Golovin
Ranch Hand
Posts: 531
ASCII works fine in my code; converts bytes into English admirably.
 
Richard Jackson
Ranch Hand
Posts: 128
I do the same thing as Anton.
According to the instructions file,

The character encoding is 8 bit US ASCII.



I modify the line of code as follows,


Am I right?
 
Ranch Hand
Posts: 86
Hi,

In the Java API documentation for Charset, US-ASCII is listed as seven-bit ASCII, not eight-bit. There is no 8-bit US-ASCII among the charsets guaranteed to be available on every Java implementation. Does this mean that the assignment spec actually means that the program is not guaranteed to run identically on every system, as it uses an encoding that might not be supported?

I suppose you could manually define your own Charset to be the 8-bit US ASCII, or some such thing, but I somehow really doubt that is what the assessors want, as it seems far beyond the scope of the assignment to me.

What about ISO-8859-1, which is listed. Could that be the 8-bit US ASCII they mean?

If anyone has any thoughts I would be grateful.
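
To make the difference concrete, here is a small sketch (hypothetical class name) that decodes the same two bytes with both charsets. For bytes 0 to 127 the two charsets agree; a byte above 127 is invalid in US-ASCII and decodes to the replacement character U+FFFD, while ISO-8859-1 maps it to the character with the same numeric value.

```java
import java.nio.charset.StandardCharsets;

public class CharsetCompare {

    static String asAscii(byte[] b) {
        return new String(b, StandardCharsets.US_ASCII);
    }

    static String asLatin1(byte[] b) {
        return new String(b, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x41, (byte) 0xE9}; // 'A' followed by a byte above 127
        System.out.println((int) asAscii(raw).charAt(1));  // prints 65533 (U+FFFD)
        System.out.println((int) asLatin1(raw).charAt(1)); // prints 233 (U+00E9)
    }
}
```

If the data file really contains only characters below 128, either charset reads it identically.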


Michal
 
Greenhorn
Posts: 11
Hi All,
I'm currently not specifying a char set on the way in or out, and don't seem to be experiencing any problems..

tnx
 
Ranch Hand
Posts: 1033

Originally posted by Ed Green:
Hi All,
I'm currently not specifying a char set on the way in or out, and don't seem to be experiencing any problems..

tnx



Not specifying an encoding is incorrect; you will likely lose marks, because that uses your platform's default encoding, which depends on the locale. Using a specific encoding gives you another exception to handle. Here's what I do:
 
Michal Charemza
Ranch Hand
Posts: 86
Hi Peter,

Firstly, have you thought about converting to a String and then using trim() and indexOf() to remove spaces and find the zero delimiter? This may result in cleaner code, although I'm not sure about the efficiency. I think I will use the String methods in mine as it is cleaner, and it is not "re-inventing the wheel".
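
A minimal sketch of that String-based approach, assuming a field is a byte array holding US-ASCII text that may be cut short by a zero byte; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class FieldDecoder {

    // Hypothetical helper: decodes a fixed-length field, stops at the first
    // zero byte (if there is one), and trims surrounding whitespace.
    static String decodeField(byte[] raw) {
        String s = new String(raw, StandardCharsets.US_ASCII);
        int nul = s.indexOf('\0');
        if (nul >= 0) {
            s = s.substring(0, nul);
        }
        return s.trim();
    }

    public static void main(String[] args) {
        byte[] raw = {'H', 'i', ' ', 0, 0};
        System.out.println("[" + decodeField(raw) + "]"); // prints [Hi]
    }
}
```

Passing a Charset object rather than a charset name also means there is no UnsupportedEncodingException to catch.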

Also, according to the Charset api, the ISO-8859-1 charset must be supported by every implementation of the Java platform. Does that mean that the UnsupportedEncodingException should never be thrown?

Perhaps an assert(false) is good here instead. I'm not really sure about assertions though, beyond what was required for the programmer exam.

Does anyone think that putting an "assert(false)" is a bad idea in a catch clause? I know I just suggested it, but the assertion and exception together do seem a bit pointless somehow: the exception is supposed to do things in case things go wrong, but the assert(false) in there means that things should never go wrong.

Michal
[ September 02, 2004: Message edited by: Michal Charemza ]
 
peter wooster
Ranch Hand
Posts: 1033

Originally posted by Michal Charemza:
Hi Peter,

Firstly, have you thought about converting to a String and then using trim() and indexOf() to remove spaces and find the zero delimiter? This may result in cleaner code, although I'm not sure about the efficiency. I think I will use the String methods in mine as it is cleaner, and it is not "re-inventing the wheel".

Also, according to the Charset api, the ISO-8859-1 charset must be supported by every implementation of the Java platform. Does that mean that the UnsupportedEncodingException should never be thrown?

Perhaps an assert(false) is good here instead. I'm not really sure about assertions though, beyond what was required for the programmer exam.

Does anyone think that putting an "assert(false)" is a bad idea in a catch clause? I know I just suggested it, but the assertion and exception together do seem a bit pointless somehow: the exception is supposed to do things in case things go wrong, but the assert(false) in there means that things should never go wrong.

Michal

[ September 02, 2004: Message edited by: Michal Charemza ]



Michal
I agree, the exception should never be thrown, and you could convert to the String first and then use indexOf and trim. The simple while loop is likely to be faster unless indexOf is implemented using native code, which it might be if you use a character argument. The trim would also remove leading blanks, probably a good thing in this application.

Exceptions that should never occur should probably be chained into a runtime exception. Assertions should not be used for anything that isn't a program error meant to be caught during testing, since they are not enabled in production use.
/peter
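
One way to read the chaining advice above, as a sketch only: catch the checked exception that should never occur and rethrow it wrapped in an unchecked one, so the original cause is preserved if the impossible does happen. The class name and the choice of IllegalStateException are assumptions, not something taken from this thread.

```java
import java.io.UnsupportedEncodingException;

public class NeverThrown {

    // Decodes bytes as ISO-8859-1 using the charset-name constructor.
    static String decode(byte[] raw) {
        try {
            return new String(raw, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // ISO-8859-1 must be supported by every Java platform, so this
            // branch indicates a broken runtime; chain the cause and fail loudly.
            throw new IllegalStateException("ISO-8859-1 is not supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {72, 105})); // prints Hi
    }
}
```

Unlike an assert, the rethrow still fires when assertions are disabled.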
 
Michal Charemza
Ranch Hand
Posts: 86

Originally posted by peter wooster:

Assertions ... meant to be caught during testing, since they are not enabled in production use.
/peter



Ah yes, I forgot about this point. So soon after my programmers exam... oh well.

Michal
 