Can HTTP headers be encoded in UTF-8?

 
Ranch Hand
Posts: 214
Hi,

Is it OK to encode an HTTP header in UTF-8?

(I don't mean JUST specifying IN THE HEADER that the message body is encoded in UTF-8... I mean ACTUALLY encoding the whole header IN UTF-8.)

I ask, because if I send the header and the body as ASCII, my web page renders fine and dandy.

If I encode the header as ASCII and the body as UTF-8, my web page renders fine and dandy, and in UTF-8 to boot!

If I encode the header as UTF-8 and the body as UTF-8, I get a script header error instead of my web page.

All thoughts are warmly welcomed!

Cheers,

Simon.
 
author
Posts: 11962
It would seem to make sense that the headers themselves would be US-ASCII and that the content's encoding could be defined in one of the headers. Otherwise, implementations would need to "sniff" the encoding used for the headers. I couldn't find anything quickly in the spec, though.
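One way to act on that assumption is to check a header line before sending it. The following is only a minimal sketch: the class name, method name and sample header lines are made up for illustration and do not come from the thread.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any framework mentioned in this thread.
class HeaderAsciiCheck {

    // Returns true when every character of the header line can be encoded as US-ASCII.
    static boolean isUsAscii(String headerLine) {
        CharsetEncoder encoder = StandardCharsets.US_ASCII.newEncoder();
        return encoder.canEncode(headerLine);
    }

    public static void main(String[] args) {
        // Sample header lines are assumptions chosen for the demonstration.
        System.out.println(isUsAscii("Content-Type: text/html; charset=UTF-8")); // true
        System.out.println(isUsAscii("X-Greeting: caf" + (char) 0x00E9));        // false
    }
}
```

A header line that passes this check produces the same bytes whether it is written out as US-ASCII or as UTF-8.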
 
Simon Cockayne
Ranch Hand
Posts: 214
Hey Lasse,

I thought UTF-8 was backwardly compatible with ASCII.

See: http://en.wikipedia.org/wiki/Utf-8

Cheers,

Simon.
 
Rancher
Posts: 43081

I thought UTF-8 was backwardly compatible with ASCII.


Only for the bottommost 128 characters, code points 0 through 127 (which is all ASCII specifies anyway). Maybe you inadvertently included some characters above 127 in the headers?
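A quick way to see this compatibility is to encode the same text both ways and compare the bytes. This is a rough sketch, assuming a plain Java program; the class name and the sample strings are invented for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the names here are illustrative only.
class AsciiUtf8Compat {

    // True when the text encodes to identical bytes in US-ASCII and in UTF-8.
    static boolean sameBytesInBothEncodings(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(ascii, utf8);
    }

    public static void main(String[] args) {
        // Only characters in the ASCII range: both encodings agree.
        System.out.println(sameBytesInBothEncodings("Content-Type: text/html\r\n")); // true
        // U+00E9 is outside ASCII: UTF-8 needs two bytes, US-ASCII substitutes '?'.
        System.out.println(sameBytesInBothEncodings("caf" + (char) 0x00E9));         // false
    }
}
```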
 
Simon Cockayne
Ranch Hand
Posts: 214
Hi Ulf,

This is my header in US EBCDIC (IBM CCSID 37), which works fine:

(Note the header terminates with two newline characters, represented in EBCDIC as x'15'.)



This is my header converted to UTF-8 (IBM CCSID 1208)...



As you can see the NewLine is represented in the UTF-8 conversion as C285 C285, which is not compatible with ASCII.
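The C285 bytes are what you would expect if the conversion mapped the EBCDIC newline to the Unicode "next line" control character U+0085. That mapping is an assumption inferred from the bytes above, not something confirmed in this thread. A minimal Java sketch (class and method names are made up) comparing that character with CR LF:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for inspecting UTF-8 byte sequences.
class NewlineBytes {

    // Formats bytes as upper-case hex, two digits per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes the text as UTF-8 and returns the bytes as hex.
    static String utf8Hex(String text) {
        return toHex(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String nextLine = String.valueOf((char) 0x0085); // Unicode NEL (assumed conversion target)
        System.out.println(utf8Hex(nextLine)); // C285
        System.out.println(utf8Hex("\r\n"));   // 0D0A
    }
}
```

U+0085 lies above 127, so it takes two bytes in UTF-8 and is not an ASCII line ending, whereas CR and LF are single ASCII bytes.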

What should the newline look like in UTF-8 (ASCII)? x'0A' x'0D'?

Regards,

Simon.
[ June 08, 2006: Message edited by: Simon Cockayne ]
 
Simon Cockayne
Ranch Hand
Posts: 214
Hi all,

So...I found the solution!

My UTF-8 header was sent as below (UTF-8 represented in Hex):

Notice the header is terminated with CR, LF, CR, LF - x'0D' x'0A' x'0D' x'0A'.



Also, in my Apache server configuration I set CGIConvMode to BINARY.

Hey Presto...data (not shown here) renders in the browser and the encoding is UTF-8.
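For anyone doing the same from Java, here is a rough sketch of building a header block that ends with CR, LF, CR, LF. The Content-Type line is an assumed example (the actual header is not reproduced here), and the class and method names are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper that assembles a CGI-style header block.
class CgiHeaderBlock {

    static final String CRLF = "\r\n";

    // Each header line ends in CRLF; one extra CRLF (an empty line) ends the block.
    static byte[] build(String... headerLines) {
        StringBuilder sb = new StringBuilder();
        for (String line : headerLines) {
            sb.append(line).append(CRLF);
        }
        sb.append(CRLF);
        // ASCII-only header text yields the same bytes as UTF-8 would.
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    // True when the block's last four bytes are x'0D' x'0A' x'0D' x'0A'.
    static boolean endsWithBlankLine(byte[] block) {
        byte[] terminator = {0x0D, 0x0A, 0x0D, 0x0A};
        if (block.length < terminator.length) {
            return false;
        }
        byte[] tail = Arrays.copyOfRange(block, block.length - terminator.length, block.length);
        return Arrays.equals(tail, terminator);
    }

    public static void main(String[] args) {
        // Assumed sample header, chosen only for illustration.
        byte[] block = build("Content-Type: text/html; charset=UTF-8");
        System.out.println(endsWithBlankLine(block)); // true
    }
}
```

The body would then follow the block, encoded in whatever charset the Content-Type header declares.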

Cheers,

Simon.
 