
Cleaning up String data to convert to Float

 
Chuck Beauregard
Greenhorn
Posts: 13
Hi.

I've been trying all kinds of things to clean up a String so I can convert it to a Float. A little background: this is a handicap field for golf. For some reason the great minds at the USGA decided to have things like:

0.5
10.7
14.3R
5.6L
NH
etc.

Ideally I'd like to convert these to:
0.5
10.7
14.3
5.6
null

Any help would be appreciated. Have a great 4th and be safe.
 
Paul Clapham
Sheriff
Posts: 22828
Well, you haven't really described the full scope of the inputs you might be given. So it's easy to say something like "Just remove all the alphabetic characters" but it would be better to start designing from a place where we actually understood the requirements.
 
Chuck Beauregard
Greenhorn
Posts: 13
Paul Clapham wrote:Well, you haven't really described the full scope of the inputs you might be given. So it's easy to say something like "Just remove all the alphabetic characters" but it would be better to start designing from a place where we actually understood the requirements.


Thanks, removing the alphabetical character is exactly what I wanted to do.

Your reply caused me to look for a solution differently, and I came up with:


pnstring = pnstring.replaceAll("\\pL+","");

From Alan Moore in 2004.

It works fine for all items except the NH.

I understand it is Unicode but I don't understand how "\\pL+" represents all of the non-numeric characters?
 
Paul Clapham
Sheriff
Posts: 22828
Chuck Beauregard wrote:I understand it is Unicode but I don't understand how "\\pL+" represents all of the non-numeric characters?


It doesn't represent all of the non-numeric characters. From the API documentation:

Both \p{L} and \p{IsL} denote the category of Unicode letters.


There are plenty of non-numeric characters which aren't letters -- the punctuation marks are one obvious category.
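
A minimal sketch of that distinction, in case it helps. The class name LetterStripDemo and the sample input "n/a" are made up for illustration; the other inputs come from the first post.

```java
class LetterStripDemo {
    // Removes every run of Unicode letters (category L) and nothing else.
    static String stripLetters(String s) {
        return s.replaceAll("\\p{L}+", "");
    }

    public static void main(String[] args) {
        // "n/a" is a hypothetical input, added only to show that punctuation survives.
        String[] samples = { "0.5", "14.3R", "5.6L", "NH", "n/a" };
        for (String raw : samples) {
            System.out.println("\"" + raw + "\" -> \"" + stripLetters(raw) + "\"");
        }
    }
}
```

The letters go, but the slash in "n/a" stays, and "NH" is reduced to an empty string. Float.parseFloat("") throws a NumberFormatException, which is presumably why NH is the one item that fails.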
 
Mike Simmons
Ranch Hand
Posts: 3090
I guess you're talking about the post from "Alan Moore" here. He's relying on an apparently undocumented feature of Java regex. I assume it is documented somewhere else, perhaps for Perl regex. But apparently "\\pL+" is equivalent to "\\p{L}+" or "\\p{Letter}+", where "\\p{L}" or "\\p{Letter}" means any Unicode letter, and the "+" means repeated one or more times.

(If anyone sees where the equivalence of "\\pL" and "\\p{L}" is documented for Java regex, please let me know.)
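
Short of finding the documentation, the equivalence can at least be checked empirically. This is only a sketch: the class name PropertySyntaxCheck and the accented sample are made up, and agreeing on a few inputs is evidence, not proof.

```java
class PropertySyntaxCheck {
    // Applies the brace-less and the braced form to the same input.
    static boolean sameResult(String input) {
        String shortForm = input.replaceAll("\\pL+", "");
        String bracedForm = input.replaceAll("\\p{L}+", "");
        return shortForm.equals(bracedForm);
    }

    public static void main(String[] args) {
        // "\u00e9" is a non-ASCII letter, included to check that both forms are Unicode-aware.
        String[] samples = { "14.3R", "5.6L", "NH", "10.7\u00e9" };
        for (String raw : samples) {
            System.out.println(raw + " -> same result: " + sameResult(raw));
        }
    }
}
```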
 
Winston Gutkowski
Bartender
Posts: 10575
Chuck Beauregard wrote:Thanks, removing the alphabetical character is exactly what I wanted to do.

Are you sure?

Sounds to me like you might be losing information if you do that - are "L" and "R" a reference to left- or right-handed players perhaps?

Personally, I'd set up a Handicap class that contains ALL the information that the USGA says it should; and I would definitely NOT return null for "NH".

Although whether "NH" means "unknown" or "scratch", I have no idea.

Winston
 
Chuck Beauregard
Greenhorn
Posts: 13
Winston Gutkowski wrote:
Chuck Beauregard wrote:Thanks, removing the alphabetical character is exactly what I wanted to do.

Are you sure?

Sounds to me like you might be losing information if you do that - are "L" and "R" a reference to left- or right-handed players perhaps?

Personally, I'd set up a Handicap class that contains ALL the information that the USGA says it should; and I would definitely NOT return null for "NH".

Although whether "NH" means "unknown" or "scratch", I have no idea.

Winston


Thanks to all for the information. This is currently an exercise to figure out how to do this; in the actual application I think you're right, and I'll have a "handicap" class or method return a float. BTW, the "L", "R" and "NH" are informational items meaning "lowered", "revised" and "no handicap". Therefore "L" and "R" are superfluous, and the NH is a special case which I'm now thinking of handling via the NumberFormatException. We'll see about that later down the road. Thanks again.
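
For reference, a rough sketch of what such a handicap class might look like if it kept the suffix instead of discarding it. Everything here is an assumption for illustration: the names Handicap, Status and parse are hypothetical, and the accepted format (digits, an optional decimal part, an optional trailing L or R, or the literal NH) is inferred only from the samples in the first post, not from any USGA specification.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Handicap {
    enum Status { PLAIN, LOWERED, REVISED, NONE }

    // Assumed format: a number such as 14.3, optionally followed by L or R.
    private static final Pattern FORMAT = Pattern.compile("(\\d+(?:\\.\\d+)?)([LR]?)");

    private final Float index;   // null only when status is NONE
    private final Status status;

    private Handicap(Float index, Status status) {
        this.index = index;
        this.status = status;
    }

    static Handicap parse(String raw) {
        String text = raw.trim();
        if (text.equals("NH")) {
            return new Handicap(null, Status.NONE);
        }
        Matcher m = FORMAT.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized handicap: " + raw);
        }
        String suffix = m.group(2);
        Status status = suffix.equals("L") ? Status.LOWERED
                      : suffix.equals("R") ? Status.REVISED
                      : Status.PLAIN;
        return new Handicap(Float.valueOf(m.group(1)), status);
    }

    boolean hasIndex() { return index != null; }
    Float index() { return index; }
    Status status() { return status; }
}
```

With this shape, Handicap.parse("14.3R") would carry both the value 14.3 and the REVISED status, and callers can ask hasIndex() before using the number.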


 
fred rosenberger
lowercase baba
Bartender
Posts: 12563
Chuck Beauregard wrote:the NH is a special case which I'm thinking now of handling via the NumberFormatException.

FWIW... exceptions should be used for things that don't normally happen or are unexpected, not for something that is perfectly valid. I would think that "No Handicap" is perfectly valid and, in my opinion, should not be an 'exception'. An exception would be if you got something like "Fred is the worst golf player in the world", which is probably true, but certainly not expected.
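
One possible way to express that split, sketched with hypothetical names (HandicapParser, toFloat): "NH" is checked up front and reported as an empty Optional, while anything that still isn't a number after dropping a trailing L or R is left to fail in Float.parseFloat.

```java
import java.util.Optional;

class HandicapParser {
    // Returns the numeric handicap, or an empty Optional for "NH" (no handicap).
    static Optional<Float> toFloat(String raw) {
        String text = raw.trim();
        if (text.equals("NH")) {
            return Optional.empty();   // an expected value, so no exception
        }
        // Drop a single trailing L or R; everything else is left untouched.
        String numeric = text.replaceAll("[LR]$", "");
        // Float.parseFloat throws NumberFormatException for input that is not a number.
        return Optional.of(Float.parseFloat(numeric));
    }

    public static void main(String[] args) {
        for (String raw : new String[] { "0.5", "10.7", "14.3R", "5.6L", "NH" }) {
            System.out.println(raw + " -> " + toFloat(raw));
        }
    }
}
```

Here the NumberFormatException is reserved for the genuinely unexpected case, and the caller decides what an empty result means.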
 
Winston Gutkowski
Bartender
Posts: 10575
fred rosenberger wrote:An exception would be if you got something like "Fred is the worst golf player in the world" - which is probably true...

I can guarantee you it isn't.

Winston
 