
Word / sentence regex pattern?

 
jay vas
Ranch Hand
Posts: 407
Hi guys: I am trying to parse words and sentences in a tokenizer.

I'm using a hand-coded system:



Any suggestions for a more comprehensive regular expression?
I assume this problem has been solved before.


 
Amit ChaudhariC
Ranch Hand
Posts: 33
You can try out something like:


Regards,
Amit
 
Campbell Ritchie
Sheriff
Posts: 50175
Two of those characters are metacharacters, but they appear to work in this context.
 
Rob Spoor
Sheriff
Posts: 20661
Most metacharacters lose their special meaning inside character classes. Others change meaning (^ matches the start of input outside a class, but negates the class when it is the first character inside), and others are introduced (- has no special meaning outside a class, but inside it denotes a range unless it is the first or last character).
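
A small sketch of those rules using java.util.regex, in case it helps to see them run. The class name CharClassDemo and the matches helper are made up for this illustration; the patterns are my own examples, not the ones posted earlier in the thread.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: true if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // '.' and '?' are metacharacters outside a class, but literal inside one.
        System.out.println(matches("[.?]", "."));   // true
        System.out.println(matches("[.?]", "a"));   // false

        // '^' as the first character inside a class negates it.
        System.out.println(matches("[^abc]", "d")); // true
        System.out.println(matches("[^abc]", "a")); // false

        // '-' between two characters denotes a range...
        System.out.println(matches("[a-c]", "b"));  // true

        // ...but as the first character it is a literal hyphen.
        System.out.println(matches("[-ac]", "-"));  // true
        System.out.println(matches("[-ac]", "b"));  // false
    }
}
```

When in doubt, escaping a character inside the class (for example `[\\-\\^]` in a Java string literal) avoids having to remember the positional rules.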
 
Campbell Ritchie
Sheriff
Posts: 50175
. . . but I can never remember which is which.

You do realise you can use methods of the Character class like isWhitespace, jay vas?
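
For instance, a hand-coded word splitter built on Character.isWhitespace might look roughly like this. It is only a sketch: the class WhitespaceTokenizer and the method words are hypothetical names, and it splits on whitespace alone, so punctuation stays attached to the words.

```java
import java.util.ArrayList;
import java.util.List;

class WhitespaceTokenizer {
    // Hypothetical helper: splits text into tokens at runs of whitespace.
    static List<String> words(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the current token, if there is one.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        // Flush the last token when the text does not end in whitespace.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(words("Hi guys.  How\tare you?"));
        // prints [Hi, guys., How, are, you?]
    }
}
```

Swapping the test for Character.isLetterOrDigit would drop the punctuation as well, at the cost of also splitting words such as "don't" at the apostrophe.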
 