
Is this regular expression good enough to check http or https hyperlinks?


Please refer to the regular expression below:

message = message.replaceAll("(?:https?|http?)://[\\w/%.\\-?&=!#]+(?!.*\\[/)",
"$0");

I also notice that certain links, like http://www.google.com.sg/#hl=en&output=search&sclient=psy-ab&q=test&oq=&aq=&aqi=&aql=&gs_sm=&gs_upl=&gs_l=&psj=1&bav=on.2,or.r_gc.r_pw.r_qf.,cf.osb&fp=37e992000c3eb140&biw=1366&bih=638, are not converted to HTML hyperlinks successfully.

Can anyone advise how to improve the regular expression?
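One likely reason the long Google link is only partly converted: its query string contains commas ("on.2,or.r_gc..."), and "," is not in the character class, so the match stops just before the first comma. A minimal Java sketch using a shortened version of that URL; the class name CommaDemo, the helper firstMatch, and the WITH_COMMA variant are illustrative assumptions, not part of the original code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommaDemo {
    // The pattern from the post, unchanged.
    static final Pattern ORIGINAL =
        Pattern.compile("(?:https?|http?)://[\\w/%.\\-?&=!#]+(?!.*\\[/)");

    // Same pattern with a comma added to the character class (assumed fix).
    static final Pattern WITH_COMMA =
        Pattern.compile("(?:https?|http?)://[\\w/%.\\-?&=!#,]+(?!.*\\[/)");

    // Returns the first match of the pattern in the text, or null if there is none.
    static String firstMatch(Pattern p, String text) {
        Matcher m = p.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String url = "http://www.google.com.sg/#hl=en&q=test&bav=on.2,or.r_gc.r_pw&fp=37e9";
        // prints http://www.google.com.sg/#hl=en&q=test&bav=on.2
        System.out.println(firstMatch(ORIGINAL, url));
        // prints the whole URL
        System.out.println(firstMatch(WITH_COMMA, url));
    }
}
```

Adding one character at a time only patches one case; any other character that can legally appear in a URL but is missing from the class will cut the match short in the same way.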


 

Ronald Mee wrote: Can anyone advise how to improve the regular expression?


Well the start looks overdone to me:
"https?://"
should be sufficient if you just want http-based URLs; although there are other protocols.

As far as the rest is concerned, I don't know enough about URL rules to comment.

Winston

Thanks for the input.

Can anyone explain exactly how this regular expression works?

message = message.replaceAll("(?:https?|http?)://[\\w/%.\\-?&=!#]+(?!.*\\[/)",
"$0");

Also, how can I improve it to match the URLs?

For example, how can I make it match a URL like

http://www.google.com.sg/#hl=en&output=search&sclient=psy-ab&q=test&oq=&aq=&aqi=&aql=&gs_sm=&gs_upl=&gs_l=&psj=1&bav=on.2,or.r_gc.r_pw.r_qf.,cf.osb&fp=37e992000c3eb140&biw=1366&bih=638

Or a URL with special characters like !#&'()*+,-./:;=?@[]_~$
 

Also, how can I improve it to match the URLs?


What do you mean?

Writing a program is as much about deciding what you want to do as it is about writing the code. Providing an example does not make a spec.

If I said

"I want to write a program that generates a series of numbers. For example, it should print numbers like 1, 2..."

That's really not enough information to write code. Do I mean positive integers? Powers of 2 greater than or equal to 1? The numbers on a clock face?

Why does Winston's suggestion of "https?://" not work? If you want to match the entire string up to a space, change it to something like "https?://[^ ]+"
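The idea of matching from the scheme up to the next space could be sketched roughly as below. This is only a sketch under assumptions: it uses "\\S+" (one or more non-whitespace characters, so it also stops at tabs and newlines), the class name Linkify and method linkify are made up for illustration, and the anchor-tag replacement is a guess at the intended output, since the replacement string shown in the thread is just "$0".

```java
class Linkify {
    // Wraps every http/https run of non-whitespace characters in an anchor tag.
    // "$0" in the replacement refers to the whole matched text.
    static String linkify(String message) {
        return message.replaceAll("https?://\\S+", "<a href=\"$0\">$0</a>");
    }

    public static void main(String[] args) {
        System.out.println(linkify("see http://a.com/x?y=1,2 now"));
    }
}
```

This is deliberately permissive: punctuation that directly follows a link (for example a full stop ending the sentence) is swallowed into the match, and nothing is HTML-escaped, so it would need tightening before real use.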

Hi, sorry if my reply was not very clear.

Basically, I have content that is entered into a textarea, like this post I am writing.

I want the Java regular expression to automatically detect the links and convert them to clickable HTML hyperlinks.

|http://naishe.blogspot.com|
|http://tw.com/#!/someTEXTs|
|http://ts123t1.rapi.com/#!download|13321|1313|fairy_tale.mp4|
|http://www.google.com|
|https://www.google.com|
|google.com|
|google.com|
|google.com/test|
|123.com/test|
|ex-ample.com|
|http://ex-ample.com/test-url_chars?param1=val1&par2=val+with%20spaces|

As you can see, a lot of forum posts are able to do that, but I have been having trouble finding a regular expression that works for all of these URLs.

Can you advise further?
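Several entries in the list above have no scheme at all (google.com, 123.com/test, ex-ample.com), so a pattern that requires "http://" can never match them. One possible approach, sketched here with assumptions: the scheme is made optional, a host is taken to be dot-separated labels ending in at least two letters, and "http://" is prepended to the href when the scheme is missing. The class name BareDomainLinkify, the pattern, and the anchor-tag output are all illustrative choices, not something from the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BareDomainLinkify {
    // Optional scheme (group 1), dotted host ending in 2+ letters,
    // then an optional path/query/fragment running up to the next whitespace.
    static final Pattern LINK = Pattern.compile(
        "(https?://)?(?:[\\w-]+\\.)+[a-zA-Z]{2,}(?:/\\S*)?");

    static String linkify(String message) {
        Matcher m = LINK.matcher(message);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String text = m.group();
            // Bare domains get a default scheme so the link is absolute.
            String href = (m.group(1) == null) ? "http://" + text : text;
            String anchor = "<a href=\"" + href + "\">" + text + "</a>";
            // quoteReplacement keeps any '$' or '\' in the URL literal.
            m.appendReplacement(out, Matcher.quoteReplacement(anchor));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(linkify("visit google.com/test and https://www.google.com now"));
    }
}
```

The trade-off is false positives: anything shaped like "name.ext" (a file name such as notes.txt, for instance) also looks like a bare domain to this pattern, which is why a written-down requirement for what counts as a link matters more than the regex itself.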

Can anyone share some advice?
 

Ronald Mee wrote: Can anyone share some advice?



Well, let's take the regex, "(?:https?|http?)://[\\w/%.\\-?&=!#]+(?!.*\\[/)", and take a look at the components, shall we ???



(?:https?|http?) -- as already mentioned, this part will also match "htt", which isn't a valid protocol type -- see previous posts.

:// -- matches a colon followed by two slashes

[\\w/%.\\-?&=!#]+ -- matches one or more of the characters in that list. IMO, I doubt that this is right, as it does not check whether the URL is well formed, only that certain characters are used.

(?!.*\\[/) -- a negative lookahead past the URL (zero or more characters away) for an open square bracket followed by a forward slash. What is the purpose of this?
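The behaviour of these components can be checked with a small Java sketch. The class name ComponentDemo, the helper found, and the sample strings are assumptions made for illustration. The last two checks suggest one possible purpose of the lookahead (skipping links that are already wrapped in a bracketed closing tag), along with a side effect: any "[/" later on the same line suppresses the match, even when it has nothing to do with the URL.

```java
import java.util.regex.Pattern;

class ComponentDemo {
    static final String PREFIX = "(?:https?|http?)://";
    static final String ORIGINAL = PREFIX + "[\\w/%.\\-?&=!#]+(?!.*\\[/)";

    // True if the regex matches somewhere inside the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // true: the "http?" branch makes the final "p" optional, so "htt" is accepted
        System.out.println(found(PREFIX, "htt://example.com"));
        // false: the simpler prefix requires "http"
        System.out.println(found("https?://", "htt://example.com"));
        // false: "[/" follows the URL, so the negative lookahead rejects it
        System.out.println(found(ORIGINAL, "[url]http://a.com/x[/url]"));
        // false: an unrelated "[/b]" later on the line has the same effect
        System.out.println(found(ORIGINAL, "http://b.com then [b]bold[/b]"));
    }
}
```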

Henry
 

Ronald Mee wrote: Can anyone share some advice?


Yes. There is no substitute for research.

Unless you can find a precise definition of URL rules (and it seems to me that here might be a good place to start), you will never be able to create a regex for parsing a URL (or, indeed, verify that someone else's suggestion for one is correct).
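As an alternative to encoding the URL rules in a regex by hand, a candidate string can be handed to the JDK's own parser. A minimal sketch, assuming that "valid" means "parses as a URI, has an http or https scheme, and has a host"; the class name UrlCheck and method isHttpUrl are hypothetical names chosen for this example:

```java
import java.net.URI;
import java.net.URISyntaxException;

class UrlCheck {
    // True if the candidate parses as a URI with an http/https scheme and a host.
    static boolean isHttpUrl(String candidate) {
        try {
            URI uri = new URI(candidate);
            String scheme = uri.getScheme();
            boolean http = "http".equalsIgnoreCase(scheme)
                        || "https".equalsIgnoreCase(scheme);
            return http && uri.getHost() != null;
        } catch (URISyntaxException e) {
            return false; // illegal characters or malformed structure
        }
    }

    public static void main(String[] args) {
        System.out.println(isHttpUrl("http://www.google.com")); // true
        System.out.println(isHttpUrl("htt://www.google.com"));  // false: scheme is not http(s)
        System.out.println(isHttpUrl("http://exa mple.com"));   // false: space is not allowed
    }
}
```

A loose regex could then be used only to find candidates in the text, with a check like this deciding which of them actually become links.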

As I said before, I don't know enough to comment on anything but the prefix.

Winston

Is there a Java regular expression forum where I can post my query for more relevant answers?
 

Ronald Mee wrote: Is there a Java regular expression forum where I can post my query for more relevant answers?



Not sure what you are asking... but (1) did you understand the issues raised in this topic about your regular expression, and have you addressed them? If so, one option is to post your changes, and we can give you more hints. And (2) did you read the link provided by Winston, and do you now have a better understanding of what you want? If you do, then post those requirements here, and maybe we can give you hints towards what you want.

To be honest, your last five or six posts don't seem to add anything to the conversation. If you want to advance your solution, you have to show some effort; the ranch is NotACodeMill.

Henry