Regex pattern

 
Jacob Sonia
Ranch Hand
Posts: 185
Hi,

I have these example URLs:
http://twitter.com/*
http://twitter.com/*/rs

Here * can be anything, such as user_name or user.name.

I could only come up with one pattern for extracting it, but the match also includes the / when one is present. Please help me with a more correct one.

This is my Java program:


 
Rob Spoor
Sheriff
Posts: 21999
Let's break down your regex:
- (?<=http[s]?://twitter.com/) - a positive lookbehind for http://twitter.com/ and https://twitter.com/. Looks fine to me
- ($|(.*)/|(.*)|\\?=)
--- $ - end of string
--- (.*)/ - anything followed by /
--- (.*) - anything
--- \\?= - a ? followed by =

You explicitly allow / inside your match, both in (.*) and in (.*)/
An easy fix: change both occurrences of .* into [^/]*, in other words anything but a /. That still leaves an alternative that matches anything but a / followed by a /, so remove that alternative. What remains: "(?<=http[s]?://twitter.com/)($|([^/]*)|\\?=)"

By the way, your while loop is effectively an if statement because of the break, so just change it into one.
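
The corrected pattern can be tried with a minimal sketch like the one below. The class name TwitterNameDemo and the method extractName are hypothetical and not taken from the original program; the pattern is copied as posted, with the dots in the host name left unescaped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwitterNameDemo {
    // Pattern as suggested in the post: a lookbehind for the host, then
    // end of string, a run of non-slash characters, or "?=".
    private static final Pattern NAME = Pattern.compile(
            "(?<=http[s]?://twitter.com/)($|([^/]*)|\\?=)");

    // Hypothetical helper: returns the first path segment after the host,
    // or null when the URL does not contain the Twitter prefix at all.
    static String extractName(String url) {
        Matcher m = NAME.matcher(url);
        // Only the first match is needed, so a plain if replaces while + break.
        if (m.find()) {
            return m.group();
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(extractName("http://twitter.com/user_name"));     // user_name
        System.out.println(extractName("https://twitter.com/user.name/rs")); // user.name
    }
}
```

Because [^/]* stops at the first slash, the trailing /rs is never part of the match.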
 
Jacob Sonia
Ranch Hand
Posts: 185
Hey, thanks a lot for the reply, it really helped me. Could you recommend a book for understanding the basics of regex patterns?

I also have this problem: I want to accept everything matching http://abc.com* except http://abc.com/xyz*. That is, any URL that starts with http://abc.com should be accepted, but any URL that starts with http://abc.com/xyz should be rejected. I tried this, but I don't think it is quite right; it doesn't match the last one.



 
Raymond Tong
Ranch Hand
Posts: 258
Here are some URLs about regular expressions:
http://www.regular-expressions.info/
http://download.oracle.com/javase/tutorial/essential/regex/


This will fail for


You don't have to escape "/" as "\\/"; a plain "/" is fine.
If the sub-domain (www) is optional, you may want to use "?".
You may want to require a slash "/" after your (ae|com).

It may be easier to write the pattern down with pen and paper
before turning it into a regular expression.
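
As a rough illustration of those three suggestions, here is a minimal sketch. The pattern under discussion is not shown in the thread, so the host abc, the optional www group, and the names HostPatternDemo and isSiteUrl are assumptions made only for this example.

```java
import java.util.regex.Pattern;

class HostPatternDemo {
    // Assumed pattern, not the poster's original:
    //  - "/" is written plainly, with no "\\/" escape
    //  - "(www\\.)?" makes the sub-domain optional
    //  - after (ae|com) the URL must either end or continue with "/"
    private static final Pattern SITE = Pattern.compile(
            "^http://(www\\.)?abc\\.(ae|com)(/.*)?$");

    // Hypothetical helper: true when the whole URL fits the pattern.
    static boolean isSiteUrl(String url) {
        return SITE.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSiteUrl("http://www.abc.com/"));
        System.out.println(isSiteUrl("http://abc.ae/page"));
        System.out.println(isSiteUrl("http://abc.community"));
    }
}
```

Requiring the slash (or the end of the string) after the top-level domain keeps a host such as abc.community from being accepted as abc.com.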
 
Rob Spoor
Sheriff
Posts: 21999

Jacob Sonia wrote:Also i have this problem - Here i want everything after http://abc.com* except http://abc.com/xyz* - means all would be accepted which starts with http://abc.com but the one which starts with http://abc.com/xyz will not be accepted.


Check out java.util.regex.Pattern for negative lookahead. What you basically need:
- http://abc.com
- a negative lookahead for /xyz
- anything else
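
Those three parts can be put together as in the minimal sketch below. The class name LookaheadDemo and the method isAccepted are hypothetical; the pattern is one possible reading of the outline, not the only way to write it.

```java
import java.util.regex.Pattern;

class LookaheadDemo {
    // http://abc.com, then a negative lookahead (?!/xyz) that rejects
    // the URL when "/xyz" comes next, then anything else.
    private static final Pattern ALLOWED = Pattern.compile(
            "http://abc\\.com(?!/xyz).*");

    // Hypothetical helper: true when the whole URL is accepted.
    static boolean isAccepted(String url) {
        return ALLOWED.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAccepted("http://abc.com/foo"));      // true
        System.out.println(isAccepted("http://abc.com/xyz/page")); // false
    }
}
```

The lookahead consumes no characters; it only checks what follows http://abc.com before .* is allowed to match the rest.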
 
Jacob Sonia
Ranch Hand
Posts: 185
Hi, I tried this after looking at java.util.pattern

String regex ="^http:\\/\\/[\\w-]+\\.abc\\.(com)($|[.* && ?![xyz]*])" ;

Doesn't work either:(
 
Raymond Tong
Ranch Hand
Posts: 258

Jacob Sonia wrote:Hi, I tried this after looking at java.util.pattern

String regex ="^http:\\/\\/[\\w-]+\\.abc\\.(com)($|[.* && ?![xyz]*])" ;

Doesn't work either:(


Here is a more detailed description of lookaround in regular expressions:
http://www.regular-expressions.info/lookaround.html
 
Jacob Sonia
Ranch Hand
Posts: 185
Another try:
String regex ="^http:\\/\\/[\\w-]+\\.abc\\.(ae|com)($|(?!(/xyz).*).*)" ;
 
Rob Spoor
Sheriff
Posts: 21999
You should always check the Javadocs of java.util.regex.Pattern for the syntax. I see you're using a !, but that's not supported in Java. I already told you how to do this, using the negative lookahead.
 
Jacob Sonia
Ranch Hand
Posts: 185
Hi,
But what I wrote is supported. Why do you think that ! is not supported? For me the pattern works as expected.
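
For reference, the (?!...) construct in the "another try" pattern posted earlier in this thread is the negative lookahead listed in the java.util.regex.Pattern Javadocs, so that pattern does compile in Java. A minimal sketch for checking it follows; the class name LastAttemptCheck and the method isAccepted are hypothetical, and the pattern string is copied from that post unchanged.

```java
import java.util.regex.Pattern;

class LastAttemptCheck {
    // Pattern copied verbatim from the "another try" post.
    private static final Pattern ATTEMPT = Pattern.compile(
            "^http:\\/\\/[\\w-]+\\.abc\\.(ae|com)($|(?!(/xyz).*).*)");

    // Hypothetical helper: true when the whole URL matches the pattern.
    static boolean isAccepted(String url) {
        return ATTEMPT.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAccepted("http://www.abc.com/home")); // true
        System.out.println(isAccepted("http://www.abc.com/xyz"));  // false
        System.out.println(isAccepted("http://abc.com"));          // false
    }
}
```

Note that [\\w-]+\\. makes a sub-domain such as www mandatory, which is why the bare http://abc.com is rejected by this pattern.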
 