
Apply Unicode escapes?

 
Peter Chase
Ranch Hand
Posts: 1970
Hi,

Is there an easy way to apply Unicode escapes to a text string, so that each \u#### is replaced by the equivalent Unicode character?

Basically, I need to replicate the behaviour of Properties files without actually using Properties.load().
 
Peter Chase
Ranch Hand
Posts: 1970
No one replied, and my own further research suggests there probably isn't such an API method. So I wrote my own. I don't think my employer will mind my posting it here, for the edification/scrutiny of Ranchers...



Something like that, anyway. Testing may reveal shortcomings.
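
For readers following along, here is a minimal sketch of one regex-based way to do this. It is not the code from this post; the class name NaiveUnicodeUnescaper and the method name unescape are hypothetical, and it assumes every backslash-u followed by four hex digits is meant as an escape.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the original poster's code.
final class NaiveUnicodeUnescaper {
    // A backslash, one 'u', then exactly four hex digits (captured in group 1).
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    static String unescape(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps a decoded '$' or backslash from being
            // interpreted as replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("\\u0048\\u0069")); // prints "Hi"
    }
}
```

This sketch decodes every match unconditionally, so it makes no attempt to tell an escaped backslash from the start of an escape.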
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
One thing to beware of here is that a backslash can itself be escaped with a double backslash. So

\u####

is a Unicode escape, but

\\u####

is not. Or even

\\\\\\\\\u####

is a Unicode escape, but

\\\\\\\\\\u####

is not.
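
Put another way, the examples above follow a parity rule: the sequence is an escape only when the run of backslashes before the u has odd length. A minimal sketch of a scanner that applies that rule follows. The names ParityAwareUnescaper, unescape and isEscapeAt are hypothetical, and the sketch assumes that all other backslashes should be copied through unchanged, which is not what Properties.load() does with them.

```java
// Hypothetical helper illustrating the odd/even backslash rule.
final class ParityAwareUnescaper {
    static String unescape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int backslashRun = 0; // contiguous backslashes copied just before position i
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            // A backslash can start an escape only if an even number of
            // backslashes immediately precede it.
            if (c == '\\' && backslashRun % 2 == 0 && isEscapeAt(text, i)) {
                out.append((char) Integer.parseInt(text.substring(i + 2, i + 6), 16));
                i += 6;
                backslashRun = 0;
                continue;
            }
            backslashRun = (c == '\\') ? backslashRun + 1 : 0;
            out.append(c);
            i++;
        }
        return out.toString();
    }

    // True if position i holds a backslash, one 'u', and four hex digits.
    private static boolean isEscapeAt(String text, int i) {
        if (i + 6 > text.length() || text.charAt(i + 1) != 'u') {
            return false;
        }
        for (int k = i + 2; k < i + 6; k++) {
            if (Character.digit(text.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(unescape("\\u0041"));   // one backslash: prints "A"
        System.out.println(unescape("\\\\u0041")); // two backslashes: printed unchanged
    }
}
```

With three backslashes before the u, the first two are copied as they are and the third starts an escape, which matches the odd-count examples above.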
 
Peter Chase
Ranch Hand
Posts: 1970
Yes, that's true, and if I were writing a method for a totally general application, I'd have to deal with that. In my situation, I am pretty sure that \u followed by 4 hex digits will only appear in the string if an escape code is intended.

The most common situation where problems occur with this is where the text being processed is actually an explanation of Unicode escapes! I can be sure my text won't be that.

As it is fairly easy to do, I could perhaps beef up my regex so that it does not match when the escape is preceded by another backslash. That's still not perfect, as your loads-of-backslashes examples showed, but it would be a step in the right direction.
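
A rough sketch of what that beefed-up regex might look like, using a negative lookbehind. The class name LookbehindUnescaper and the method name unescape are made up for illustration, and the pattern is only an assumption about how such a regex could be written.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: skips an escape that directly follows another backslash.
final class LookbehindUnescaper {
    // (?<!...) is a negative lookbehind: the match fails if the character
    // just before the escape's backslash is itself a backslash.
    private static final Pattern ESCAPE =
            Pattern.compile("(?<!\\\\)\\\\u([0-9a-fA-F]{4})");

    static String unescape(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char decoded = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(decoded)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("\\u0041"));   // one backslash: prints "A"
        System.out.println(unescape("\\\\u0041")); // two backslashes: printed unchanged
    }
}
```

As expected, this handles the double-backslash case but not the longer runs: with three backslashes the text is also left untouched, even though the odd count means an escape was intended.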
 