
Simple task with regular expression

 
Ruslan Salimovich
Greenhorn
Posts: 25
Java
Could you please tell me how to solve this task:

Write and test a regular expression that checks a sentence to see that it begins with a capital letter and ends with a period.
 
Naziru Gelajo
Ranch Hand
Posts: 175
1
Java Netbeans IDE Ubuntu
Let's assume that we have a String literal stored in the variable s. You would want to use a conditional statement to test the first character of that string, and check whether it is uppercase. What would you do personally?
 
Ruslan Salimovich
Greenhorn
Posts: 25
Java
I tried to write code that checks whether the first letter is a capital, as follows:

String str = "Some string";
String regex = "[A-Z]\\w+";

str.matches(regex);

but this does not work correctly.
 
Stephan van Hulst
Saloon Keeper
Posts: 7808
142
Because "\\w+" matches word characters, and your String consists of more than just word characters. Note that it's also not in your requirements to check for word characters. Here are the requirements:

The string must be a sentence.
It must begin with a capital letter.
It must end with a period.

I would interpret a sentence as "Any character multiple times up to and including the first period encountered".
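
To see why the first attempt fails, here is a minimal sketch. The class name WordCharDemo and the helper check are hypothetical, chosen only for illustration. String.matches succeeds only when the entire string matches the regex, and the space in "Some string" is not a word character, so "\\w+" cannot reach the end of the input.

```java
class WordCharDemo {
    // Hypothetical helper: true only if the WHOLE input matches the regex.
    static boolean check(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        String regex = "[A-Z]\\w+";
        // The space is not a word character, so the whole-string match fails.
        System.out.println(check("Some string", regex)); // false
        // A single capitalized word consists only of word characters.
        System.out.println(check("Some", regex));        // true
    }
}
```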
 
Ruslan Salimovich
Greenhorn
Posts: 25
Java
Thank you! I have found the solution:
regex = "^[A-Z].*\\.$"
 
Ruslan Salimovich
Greenhorn
Posts: 25
Java
Sorry for the off-topic question, but how can I make my signature italic?
 
Stephan van Hulst
Saloon Keeper
Posts: 7808
142
Ruslan Abylkhozhin wrote:regex = "^[A-Z].*\\.$"

Almost! This also matches the following String: "Hello there. I hope you have a nice day." Your regex would match two sentences, while the requirement is that it matches one sentence.

Sorry for the off-topic question, but how can I make my signature italic?

Try BB-code instead of HTML: [i]signature[/i]
 
Knute Snortum
Sheriff
Posts: 4073
112
Chrome Eclipse IDE Java Postgres Database VI Editor
Stephan van Hulst wrote:
Ruslan Abylkhozhin wrote:regex = "^[A-Z].*\\.$"

Almost! This also matches the following String: "Hello there. I hope you have a nice day." Your regex would match two sentences, while the requirement is that it matches one sentence.

The "*" metacharacter is "greedy", that is, it will match as much as it can. You have two alternatives: one, make "*" "lazy", that is, match as little as possible, or two, match everything except a period, then match the period.
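
As a rough illustration of the greedy behaviour versus the "everything except a period" alternative, here is a small sketch. The class name, constants, and sample strings are made up for this example; only the two regexes come from the thread.

```java
class GreedyVersusNegatedClass {
    // Greedy ".*" may run across periods in the middle of the input.
    static final String GREEDY = "^[A-Z].*\\.$";
    // "[^.]*" can never consume a period, so only the final one is allowed.
    static final String NEGATED = "^[A-Z][^.]*\\.$";

    // Hypothetical helper: whole-string match against the given regex.
    static boolean isSentence(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        String one = "Hello there.";
        String two = "Hello there. I hope you have a nice day.";
        System.out.println(isSentence(GREEDY, one));  // true
        System.out.println(isSentence(GREEDY, two));  // true: ".*" spans the first period
        System.out.println(isSentence(NEGATED, one)); // true
        System.out.println(isSentence(NEGATED, two)); // false: a period appears before the end
    }
}
```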
 
Stephan van Hulst
Saloon Keeper
Posts: 7808
142
In this case, I would say making the quantifier reluctant is the easiest option.
 
Ruslan Salimovich
Greenhorn
Posts: 25
Java
Seems like this one works fine: "^[A-Z][^.]*\\.$"
So if I check "Some sentence." it is ok, and if I check "Some sentence. Some sentence." it is false.

Are there any other regexes that solve this task?

P.S. Stephan van Hulst, thanks for the hint about the signature.
 
Knute Snortum
Sheriff
Posts: 4073
112
Chrome Eclipse IDE Java Postgres Database VI Editor
Your regex looks good to me. What task do you need to solve? I would think that "Some sentence. Some sentence." should be false. Do you want a regex that will test that all sentences in the string start with a capital letter and end with a period? If so, I would try grouping part of the regex.
 
Ruslan Salimovich
Greenhorn
Posts: 25
Java
Knute Snortum wrote:Your regex looks good to me. What task do you need to solve? I would think that "Some sentence. Some sentence." should be false. Do you want a regex that will test that all sentences in the string start with a capital letter and end with a period? If so, I would try grouping part of the regex.


The task is:
Write and test a regular expression that checks a sentence to see that it begins with a capital letter and ends with a period. But if there are two sentences, it should return false.

So, as far as I can see, I have solved this task with the regex "^[A-Z][^.]*\\.$"
I just want to know if there are other regexes that solve this task.
 
Stephan van Hulst
Saloon Keeper
Posts: 7808
142
Ruslan Abylkhozhin wrote:Seems like this one works fine: "^[A-Z][^.]*\\.$"

Yes, this regex is just fine. Another solution is "^[A-Z].*?\\.$".

For completeness, if you want to match on *any* capital letter, and not just ASCII, you can use the following regex: "^\\p{javaUpperCase}.*?\\.$"
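
A short sketch of the difference between the ASCII range and the Unicode-aware class follows. Note the assumption: it pairs \p{javaUpperCase} with the "[^.]*" body from earlier in the thread rather than with ".*?", and the class name and sample words are hypothetical.

```java
class UnicodeCapitalDemo {
    // Accepts only the ASCII capitals A to Z as the first character.
    static final String ASCII_ONLY = "^[A-Z][^.]*\\.$";
    // Accepts any character for which Character.isUpperCase returns true.
    static final String ANY_UPPER = "^\\p{javaUpperCase}[^.]*\\.$";

    // Hypothetical helper: whole-string match against the given regex.
    static boolean isSentence(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        String accented = "\u00C9cole."; // capital E with acute accent, then "cole."
        System.out.println(isSentence(ASCII_ONLY, accented)); // false
        System.out.println(isSentence(ANY_UPPER, accented));  // true
        System.out.println(isSentence(ANY_UPPER, "Ecole."));  // true
    }
}
```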
 
Ruslan Salimovich
Greenhorn
Posts: 25
Java
Stephan van Hulst wrote:
Ruslan Abylkhozhin wrote:Seems like this one works fine: "^[A-Z][^.]*\\.$"

Yes, this regex is just fine. Another solution is "^[A-Z].*?\\.$".

For completeness, if you want to match on *any* capital letter, and not just ASCII, you can use the following regex: "^\\p{javaUpperCase}.*?\\.$"


If I apply regex "^[A-Z].*?\\.$" to "Hello World. I love you." it returns true, but it has to return false
 
Dave Tolls
Rancher
Posts: 2914
36
Is there any particular reason you are looking for another solution?

When it comes to regexes I find that getting an answer then walking away without looking back is the best bet...
;)
 
Campbell Ritchie
Marshal
Posts: 55717
163
Ruslan Salimych wrote:. . . If I apply regex "^[A-Z].*?\\.$" to "Hello World. I love you." it returns true, but it has to return false
That sounds different from what we thought earlier. You mean the text must be a single sentence? You can probably try "not a period, many times", which might be [^.]* (or [^.]+), but I am not sure about regexes.
 
Stephan van Hulst
Saloon Keeper
Posts: 7808
142
Ruslan Salimych wrote:If I apply regex "^[A-Z].*?\\.$" to "Hello World. I love you." it returns true, but it has to return false


Ah, never mind. I forgot that this was a "match" operation, and not a "find" operation. Yes, the regex I proposed won't work for a match.
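
The difference between the two operations can be sketched as follows. This is an illustration under an assumption: the trailing "$" is dropped from the reluctant pattern so that find() is free to stop at the first period (with "$" kept, find() would also be pushed to the end of the input). The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchVersusFind {
    // Reluctant pattern WITHOUT the trailing "$" (an assumption for this demo).
    static final Pattern FIRST_SENTENCE = Pattern.compile("^[A-Z].*?\\.");

    // find() looks for the first region that fits and may stop early.
    static String firstSentence(String input) {
        Matcher m = FIRST_SENTENCE.matcher(input);
        return m.find() ? m.group() : null;
    }

    // matches() must consume the whole input, so ".*?" is forced to expand.
    static boolean wholeInputMatches(String input) {
        return FIRST_SENTENCE.matcher(input).matches();
    }

    public static void main(String[] args) {
        String text = "Hello World. I love you.";
        System.out.println(firstSentence(text));     // Hello World.
        System.out.println(wholeInputMatches(text)); // true
    }
}
```

So the reluctant quantifier only prefers the shortest expansion; a whole-string match still lets it grow past the first period when that is the only way to succeed.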
 
Winston Gutkowski
Bartender
Posts: 10573
65
Eclipse IDE Hibernate Ubuntu
Ruslan Salimych wrote:Seems like this one works fine: "^[A-Z][^.]*\\.$"...

A couple of tips for you:

Personally, I hate all those "\\"s you have to put in regexes to "escape" metacharacters (of which there are many). You can achieve the same thing in most cases by putting them in "character classes", viz:

  "^[A-Z][^.]*[.]$"

Also: While it doesn't work in this case, the "reluctant" quantifier (?) is good to know about, because it can sometimes speed up a regex considerably.

HIH

Winston
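
A quick way to convince yourself that the two spellings of the literal period behave the same is a sketch like the one below. The class name, constants, and sample inputs are hypothetical.

```java
class EscapeStyleDemo {
    // Literal period written with a backslash escape.
    static final String BACKSLASH = "^[A-Z][^.]*\\.$";
    // Literal period written as a one-character class; no backslashes needed.
    static final String CHAR_CLASS = "^[A-Z][^.]*[.]$";

    // True when both spellings give the same answer for the input.
    static boolean agree(String input) {
        return input.matches(BACKSLASH) == input.matches(CHAR_CLASS);
    }

    public static void main(String[] args) {
        String[] samples = {
            "Some sentence.",
            "Some sentence. Some sentence.",
            "some sentence.",
            "Some sentenceX"
        };
        for (String s : samples) {
            // Print the input, the character-class result, and whether both agree.
            System.out.println(s + " -> " + s.matches(CHAR_CLASS) + ", agree: " + agree(s));
        }
    }
}
```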
 