
regex for nameFields: first & last names tested separately

 
Unnsse Khan
Ranch Hand
Posts: 511
Hello again,

I am looking for a good regex pattern for first and last names...

Have two TextFields (one for a person's first name and one for the person's last name).

The rules I want to specify are:

1. First letter is always capital and all subsequent letters are lowercase.

2. No symbols (e.g. !#@$%^&*()_+=) except for a hyphen (-) are allowed. Only want letters (A-Z, a-z).

3. No numbers are allowed.

4. I want it to only test for potential first and last names but with no spaces...

Found this on the Internet:

^[a-zA-Z]+(([\'\,\.\-][a-zA-Z])?[a-zA-Z]*)*$

But Eclipse doesn't like the escape sequences...

When I tried:



Eclipse's problems view spat out:



Many, many thanks!
[ March 20, 2007: Message edited by: Unnsse Khan ]
 
Jeanne Boyarsky
author & internet detective
Marshal
Posts: 34839
Unnsse,
In Java, the backslash is a special character. So you need to replace each \ with \\ to make Java happy. The actual regular expression will then contain a single backslash; it will just be visually represented as two.

This gives you:
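Applied to the regex quoted in the first post, that doubling would look roughly like the sketch below. The class, constant, and method names are invented for illustration; only the regex itself comes from the thread.

```java
import java.util.regex.Pattern;

class EscapedNameRegex {
    // The regex from the first post as a Java String literal:
    // each backslash of the regex is written twice in source code.
    static final String NAME_REGEX =
            "^[a-zA-Z]+(([\\'\\,\\.\\-][a-zA-Z])?[a-zA-Z]*)*$";

    static final Pattern NAME_PATTERN = Pattern.compile(NAME_REGEX);

    // Hypothetical helper: true if the whole input matches the pattern.
    static boolean isValidName(String input) {
        return NAME_PATTERN.matcher(input).matches();
    }

    public static void main(String[] args) {
        // "\\." in source is two characters at runtime: a backslash and a dot.
        System.out.println("\\.".length());           // 2
        System.out.println(isValidName("John-Paul")); // true
        System.out.println(isValidName("John3"));     // false
    }
}
```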
 
Alan Moore
Ranch Hand
Posts: 262
A good thing to keep in mind is that regex strings are always "compiled" into Pattern objects at runtime. When you get a compile-time error about escape sequences, it isn't talking about your regex syntax, it's talking about your String literal syntax. As Jeanne said, it's telling you that you failed to escape your backslashes.
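To make the two kinds of error concrete, here is a small sketch. The class and the `isBadRegex` helper are made up for the example; `Pattern.compile` and `PatternSyntaxException` are the standard `java.util.regex` API.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RegexErrorTiming {
    // Hypothetical helper: reports whether the regex itself is malformed.
    // This can only be discovered at runtime, when the Pattern is built.
    static boolean isBadRegex(String regex) {
        try {
            Pattern.compile(regex);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // String s = "[\.]";  // would not compile: \. is not a valid String escape

        // Legal String literal and legal regex.
        System.out.println(isBadRegex("[\\.]")); // false

        // Legal String literal, but the regex has an unclosed character class.
        System.out.println(isBadRegex("[a-z"));  // true
    }
}
```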

In this case, however, you could just as easily fix the error by removing the backslashes. Of the four escaped characters in that regex, only the hyphen has a special meaning inside a character class, but not if it's the first or last character listed. The period loses its usual special meaning, and the apostrophe and the comma never had them. And we should never pass up a chance to reduce the number of backslashes in our regexes. ^_^
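A quick way to convince yourself is to run both spellings side by side, as in this sketch. The class and method names are invented for the example, and the sample inputs are arbitrary.

```java
import java.util.regex.Pattern;

class UnescapedCharClass {
    // The regex from the first post with doubled backslashes...
    static final Pattern ESCAPED =
            Pattern.compile("^[a-zA-Z]+(([\\'\\,\\.\\-][a-zA-Z])?[a-zA-Z]*)*$");
    // ...and the same regex with the backslashes dropped.
    // The hyphen stays last in the class so it is taken literally.
    static final Pattern UNESCAPED =
            Pattern.compile("^[a-zA-Z]+(([',.-][a-zA-Z])?[a-zA-Z]*)*$");

    // Hypothetical helper: do both patterns agree on this input?
    static boolean sameVerdict(String input) {
        return ESCAPED.matcher(input).matches()
                == UNESCAPED.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Inside a character class the period is a plain dot, not "any character".
        System.out.println(".".matches("[',.-]")); // true
        System.out.println("x".matches("[',.-]")); // false

        for (String s : new String[] {"John", "O'Meara", "John-Paul", "John3", "St.John"}) {
            System.out.println(s + " -> same verdict: " + sameVerdict(s));
        }
    }
}
```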

It's not really a big deal in a situation like this, where the regex is applied in response to user input and the target strings are relatively short, but that regex is much more complicated and inefficient than it needs to be. The intention is obviously to make sure every punctuation character is followed by at least one letter; here's a clearer, quicker way to express that:

I have a much bigger problem with the notion of applying arbitrary, simplistic validation rules to name-entry fields. No computer has yet told me that I'm misspelling my own name, but if one ever does, the owner of that computer will not be doing any business with me if I have any choice in the matter. ^_^
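As for the "clearer, quicker" expression: going by the `\p{L}` version quoted further down the thread, it was presumably close to the pattern in this sketch, with `[a-zA-Z]` in place of `\p{L}`. Treat that as an assumption; the class and method names are invented too.

```java
import java.util.regex.Pattern;

class SimplerNameRegex {
    // One or more letters, then any number of groups made of a single
    // punctuation character followed by at least one letter.
    // The ++ and *+ quantifiers are possessive: they never give back
    // what they matched, so a failing input is rejected without backtracking.
    static final Pattern NAME =
            Pattern.compile("^[a-zA-Z]++(?:[',.-][a-zA-Z]++)*+$");

    // Hypothetical helper: true if the whole input matches.
    static boolean isValidName(String input) {
        return NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("John-Paul"));  // true
        System.out.println(isValidName("O'Meara"));    // true
        System.out.println(isValidName("John-"));      // false: punctuation needs a letter after it
        System.out.println(isValidName("John--Paul")); // false
    }
}
```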
 
Unnsse Khan
Ranch Hand
Posts: 511
Jeanne & Alan,

Thanks for all of the great advice...

I just tried this regex in my application, and I must say that I do have a complaint.

If you type in:

John D.

and hit enter...

the validation breaks!

I want the ability to have dashes (-) and dots (.) inside my regex fields...

Sincerely,
 
Ulf Dittmer
Rancher
Posts: 42968
The regex is very ASCII-ish. Since Java supports Unicode, it would be just as easy to use "[\\p{L}]" instead of "[a-zA-Z]", and thus not annoy people with umlauts in their names. As Alan said, the validation had better not tell me that I'm misspelling my name if I'm not.
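A small sketch of the difference, assuming the possessive pattern shape used elsewhere in this thread. The class name and sample name are just for illustration, and the umlaut is written as a Unicode escape so the source file encoding doesn't matter.

```java
import java.util.regex.Pattern;

class UnicodeLetters {
    // ASCII-only letters.
    static final Pattern ASCII_NAME =
            Pattern.compile("^[a-zA-Z]++(?:[',.-][a-zA-Z]++)*+$");
    // \p{L} matches any character in the Unicode "letter" category.
    static final Pattern UNICODE_NAME =
            Pattern.compile("^\\p{L}++(?:[',.-]\\p{L}++)*+$");

    public static void main(String[] args) {
        String name = "J\u00fcrgen"; // "Jürgen"
        System.out.println(ASCII_NAME.matcher(name).matches());   // false
        System.out.println(UNICODE_NAME.matcher(name).matches()); // true
    }
}
```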
 
Unnsse Khan
Ranch Hand
Posts: 511
Ulf,

Thanks for the advice...

My regex is now this:

String nameRegex = "^[\\p{L}]++(?:[',.-][\\p{L}]++)*+$";

The problem with this regex is that it rejects input such as:

Joe A.

or even

Joe A

What I want is the ability to put a space (for a middle name) and a dash (-) for names such as:

John-Paul

Many, many thanks!

Sincerely yours,
 
Alan Moore
Ranch Hand
Posts: 262
Okay, so add a space to the character class, and move the period to the end.
 
Purushoth Thambu
Ranch Hand
Posts: 425
Isn't it your requirement that the first letter should be in upper case? I was (and still am) confounded by how the regular expression "^[\\p{L}]++" will ensure that the first letter is in uppercase. If your four rules still hold, I believe the one below will be the right one.
 
Alan Moore
Ranch Hand
Posts: 262
I was just fine-tuning the earlier suggestions, but you're right, I left out the initial capital requirement. Also, I put the space in the wrong place; the way I wrote it, it means "an apostrophe or anything in the range of comma through space". That wouldn't even compile, so let's try it again:
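For anyone following along, here is a sketch of both points. The broken character class is reconstructed from the description above, and the corrected pattern is only one possible reading of the requirements (initial capital, optional trailing period, space allowed as a separator); all names are invented for the example.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class NameRegexWithSpace {
    // Hypothetical helper: does the regex compile into a Pattern?
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    // \p{Lu} demands an uppercase first letter; the hyphen is last in the
    // class so it is literal; the period is allowed only at the very end.
    static final Pattern NAME =
            Pattern.compile("^\\p{Lu}\\p{L}*+(?:[', -]\\p{L}++)*+\\.?$");

    static boolean isValidName(String input) {
        return NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        // ",- " is read as a range from comma down to space, which is illegal.
        System.out.println(compiles("[',- ]")); // false
        System.out.println(compiles("[', -]")); // true

        System.out.println(isValidName("Joe A."));    // true
        System.out.println(isValidName("John-Paul")); // true
        System.out.println(isValidName("joe"));       // false: no initial capital
    }
}
```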
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Those initial requirements would also disallow names like "O'Meara", "McNamara" and "de Queiroz". I would think they need to be loosened up anyway. Overly restrictive validation rules create more problems than they solve, in my opinion.
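To illustrate, here is a sketch that checks those three names against one fairly literal reading of the four original rules. The pattern is my own guess at what such a rule set would look like (a capital is tolerated right after a hyphen so that "John-Paul" passes); it is not anyone's posted regex.

```java
import java.util.regex.Pattern;

class StrictRulesDemo {
    // Hypothetical strict pattern: ASCII capital first, lowercase after,
    // hyphen-separated parts allowed, no spaces, digits or other symbols.
    static final Pattern STRICT =
            Pattern.compile("^[A-Z][a-z]*(?:-[A-Z]?[a-z]+)*$");

    static boolean passes(String name) {
        return STRICT.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(passes("John-Paul"));  // true
        System.out.println(passes("O'Meara"));    // false: apostrophe
        System.out.println(passes("McNamara"));   // false: capital in the middle
        System.out.println(passes("de Queiroz")); // false: lowercase start and a space
    }
}
```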
 
Paul Clapham
Sheriff
Posts: 21319
And there are people who insist on not having capital letters at the beginning of their names. And there are people who have first names like "Billy Joe". And... the list goes on. I agree with Alan Moore and the others who said that the regex is solving a problem that does not exist.
 