
ASCII - Space characters

 
Ranch Hand
Posts: 209
Hi,

There are space characters 160 (non-breaking space) and 32 (space). Both represent a space, but they are distinct characters.

Is there any library that can be used to convert 160 (non-breaking space) into 32 (space)? This is for string comparison purposes.

Thanks in Advance,

Niall
 
Saloon Keeper
Posts: 27763
Probably the easiest way to do this is to use String's replace() method to replace all nbsp characters with space characters before comparing.
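As a minimal sketch of that suggestion (the class and method names here are made up for illustration, not from any library), `String.replace(char, char)` can swap every non-breaking space for an ordinary space before the comparison:

```java
public class NbspReplace {
    // U+00A0 is the non-breaking space (decimal 160); the ordinary space is decimal 32.
    static final char NBSP = '\u00A0';

    // Hypothetical helper: returns a copy of s with every NBSP replaced by a plain space.
    static String nbspToSpace(String s) {
        return s.replace(NBSP, ' ');
    }

    // Hypothetical helper: compares two strings, treating NBSP and space as the same character.
    static boolean equalsIgnoringNbsp(String a, String b) {
        return nbspToSpace(a).equals(nbspToSpace(b));
    }

    public static void main(String[] args) {
        String fromWeb = "hello\u00A0world"; // contains a non-breaking space
        String typed = "hello world";        // contains an ordinary space
        System.out.println(fromWeb.equals(typed));              // false
        System.out.println(equalsIgnoringNbsp(fromWeb, typed)); // true
    }
}
```

Strings are immutable, so `replace()` returns a new string and the originals are left untouched.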
 
Bartender
Posts: 10780

Niall Loughnane wrote:is there any library that can be used to convert 160 (non breaking space character) into 32 (space) characters - this is for string comparison purposes,


Erm?
Don't look for complexity where none exists.

Winston
 
Marshal
Posts: 28193

Niall Loughnane wrote:there are space characters 160 (non breaking space character) and 32 (space) - these represent a space characters but are distinct different individual characters,



That isn't correct. ASCII only defines characters in the range from 0 to 127. Now, Unicode does declare 160 as a non-breaking space character, but then it declares a whole lot of other characters as space characters as well. Here is a document which lists twenty of them but there could be others. As far as I can see, the canonical Unicode normalization forms (NFC and NFD) leave the non-breaking space alone; only the compatibility forms (NFKC and NFKD) map it to a plain space. Speaking of normalization, have you built that into your specialized string comparison?
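If the comparison should treat all of those Unicode space characters alike, one possible approach is to fold everything in the Unicode "space separator" category (Zs) into a plain space first. This is only a sketch: the class and method names are invented for illustration, and whether folding every Zs character is appropriate depends on your data.

```java
import java.util.regex.Pattern;

public class SpaceFold {
    // \p{Zs} matches any code point in the Unicode Space_Separator category,
    // which includes U+0020 (space), U+00A0 (non-breaking space) and U+3000 (ideographic space).
    // Control characters such as tab and newline are NOT in Zs and are left alone.
    private static final Pattern SPACE_SEPARATORS = Pattern.compile("\\p{Zs}");

    // Hypothetical helper: replaces every space-separator character with U+0020.
    static String foldSpaces(String s) {
        return SPACE_SEPARATORS.matcher(s).replaceAll(" ");
    }

    // Hypothetical helper: equality that ignores which kind of space separator was used.
    static boolean equalsIgnoringSpaceKind(String a, String b) {
        return foldSpaces(a).equals(foldSpaces(b));
    }

    public static void main(String[] args) {
        String a = "one\u00A0two\u2003three"; // non-breaking space, then em space
        String b = "one two three";
        System.out.println(equalsIgnoringSpaceKind(a, b)); // true
    }
}
```

Note that `Character.isWhitespace()` deliberately excludes the non-breaking spaces, so it is not a substitute for the Zs check here.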
 
Tim Holloway
Saloon Keeper
Posts: 27763
Paul is correct. The original American Standard Code for Information Interchange is a 7-bit code. The 8th bit was reserved as a parity bit for devices such as Teletype™ machines. The classic old modem settings reflect that: "8N1" indicates 8 data bits, no parity bit, 1 stop bit (2 stop bits were needed for some slower devices), while "7E2" would be 7 data bits with even parity, 2 stop bits.
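To make the parity idea concrete, here is a small sketch (names are hypothetical) that takes a 7-bit ASCII character and sets the 8th bit so that the total number of 1 bits is even, which is what the "E" in "7E2" asks for:

```java
public class EvenParity {
    // Hypothetical helper: returns the 8-bit value sent on the wire for a 7-bit
    // ASCII character under even parity. Bit 7 is the parity bit.
    static int withEvenParity(char c) {
        int sevenBits = c & 0x7F;                        // keep only the 7 ASCII data bits
        int parityBit = Integer.bitCount(sevenBits) & 1; // 1 when the count of 1 bits is odd
        return (parityBit << 7) | sevenBits;             // total count of 1 bits is now even
    }

    public static void main(String[] args) {
        // 'A' is 0x41 (two 1 bits), so the parity bit stays 0.
        System.out.println(Integer.toHexString(withEvenParity('A'))); // 41
        // 'C' is 0x43 (three 1 bits), so the parity bit is set.
        System.out.println(Integer.toHexString(withEvenParity('C'))); // c3
    }
}
```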

When the IBM PC became popular, a new de facto standard emerged: 8-bit "extended ASCII" (code page 437 on the original PC), which assigned meanings to characters with the 8th bit set: graphics, accented text (including umlauts, etc.), and the non-breaking space for typesetting.
 