Gotta watch those pesky backslashes, eh Max?
I suspect efficiency here also depends on the input - are successes or failures going to be more common? Are single-character inputs common, or rare?
There's also a difference in behavior between the two solutions. Max's original solution will not allow 123-45-67, while David's will. It's not 100% clear which of these is intended according to the instructions (which require that we allow "a dash", but say nothing about multiple dashes, except that there's an example showing multiple consecutive dashes). Maybe it doesn't matter. But my guess is David's solution is correct. I'd probably formulate it as
"[a-zA-Z0-9]+(\\-+[a-zA-Z0-9]+)*"
or
"[a-zA-Z0-9]++(?:\\-++[a-zA-Z0-9]++)*+"
or
"\\p{Alnum}++(?:\\-++\\p{Alnum}++)*+"
I think the first is probably most understandable to people now, but the latter forms offer improvements I'd like to see more commonly used. The possessive forms ++ and *+ aren't strictly necessary, and may be changed to + and * respectively, but I think in this case they lead to the fastest solution possible, eliminating unnecessary backtracking. That also helps readability, IMO, assuming the reader is familiar with possessive forms.
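For anyone who wants to try these out, here's a rough sketch that runs all three formulations against a few inputs. The class name, helper methods and sample strings are made up for illustration; only the three pattern strings come from this post. Since a dash can never match the alphanumeric class (and vice versa), I'd expect the greedy and possessive versions to accept exactly the same strings, differing only in how much backtracking happens on a failed match.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only the three pattern strings are from the post.
final class DashPatternDemo {
    // Plain greedy quantifiers with a capturing group.
    static final Pattern GREEDY =
        Pattern.compile("[a-zA-Z0-9]+(\\-+[a-zA-Z0-9]+)*");
    // Possessive quantifiers with a non-capturing group.
    static final Pattern POSSESSIVE =
        Pattern.compile("[a-zA-Z0-9]++(?:\\-++[a-zA-Z0-9]++)*+");
    // Same as POSSESSIVE, using the POSIX class \p{Alnum} (ASCII letters and digits by default).
    static final Pattern POSIX =
        Pattern.compile("\\p{Alnum}++(?:\\-++\\p{Alnum}++)*+");

    // True only if every pattern matches the entire input.
    static boolean allAccept(String s) {
        return GREEDY.matcher(s).matches()
            && POSSESSIVE.matcher(s).matches()
            && POSIX.matcher(s).matches();
    }

    // True only if every pattern rejects the input.
    static boolean allReject(String s) {
        return !GREEDY.matcher(s).matches()
            && !POSSESSIVE.matcher(s).matches()
            && !POSIX.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Sample inputs chosen for illustration, not taken from the original exercise.
        String[] samples = {"123-45-67", "a--b", "abc", "-abc", "abc-", "a-_b", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\": greedy=" + GREEDY.matcher(s).matches()
                + " possessive=" + POSSESSIVE.matcher(s).matches()
                + " posix=" + POSIX.matcher(s).matches());
        }
    }
}
```

Note that matches() requires the whole input to match, so a leading or trailing dash is rejected by all three forms, while runs of consecutive dashes between alphanumeric groups are allowed.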
[ June 08, 2004: Message edited by: Jim Yingst ]