Is this regular expression good enough to check http or https hyperlinks?

 
Greenhorn

Please refer to the regular expression below:

message = message.replaceAll("(?:https?|http?)://[\\w/%.\\-?&=!#]+(?!.*\\[/)",
"$0");

I also noticed that certain links, such as http://www.google.com.sg/#hl=en&output=search&sclient=psy-ab&q=test&oq=&aq=&aqi=&aql=&gs_sm=&gs_upl=&gs_l=&psj=1&bav=on.2,or.r_gc.r_pw.r_qf.,cf.osb&fp=37e992000c3eb140&biw=1366&bih=638, are not converted to HTML hyperlinks successfully.

Can anyone advise how to improve the regular expression?
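One likely reason the long Google link fails: the character class in the pattern has no comma, so the match stops at the first comma in the URL and only the leading part would be linked. The sketch below shows this with a shortened version of that URL; the class name and helper method are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommaTruncationDemo {
    // The pattern from the post, unchanged.
    static final Pattern ORIGINAL =
        Pattern.compile("(?:https?|http?)://[\\w/%.\\-?&=!#]+(?!.*\\[/)");

    // Returns the first substring the pattern matches, or null if there is none.
    static String firstMatch(String text) {
        Matcher m = ORIGINAL.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Shortened form of the URL from the post; note the comma after "on.2".
        String url = "http://www.google.com.sg/#hl=en&bav=on.2,or.r_gc&fp=37e9";
        // prints http://www.google.com.sg/#hl=en&bav=on.2
        System.out.println(firstMatch(url));
    }
}
```

Any other character missing from the class (for example ~ : ; + @) would cut a match short in the same way.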


 
Bartender

Ronald Mee wrote: Can anyone advise how to improve the regular expression?


Well, the start looks overdone to me:
"https?://"
should be sufficient if you just want HTTP-based URLs, although there are other protocols.

As far as the rest is concerned, I don't know enough about URL rules to comment.

Winston
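To make the point about the prefix concrete, here is a small sketch comparing the two prefixes on their own. The class and method names are hypothetical; only the two patterns come from the thread.

```java
import java.util.regex.Pattern;

class SchemePrefixDemo {
    // Prefix as written in the original expression.
    static final Pattern ORIGINAL_PREFIX = Pattern.compile("(?:https?|http?)://");
    // Simplified prefix suggested above.
    static final Pattern SIMPLE_PREFIX = Pattern.compile("https?://");

    // True if the text begins with something the prefix pattern accepts.
    static boolean startsWith(Pattern prefix, String text) {
        return prefix.matcher(text).lookingAt();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"http://a.com", "https://a.com", "htt://a.com"}) {
            System.out.println(s + " original=" + startsWith(ORIGINAL_PREFIX, s)
                + " simple=" + startsWith(SIMPLE_PREFIX, s));
        }
    }
}
```

Both prefixes accept http:// and https://. The only difference is that the original also accepts "htt://", because its second alternative "http?" makes the final "p" optional.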
 
Ronald Mee
Greenhorn
Thanks for the input.

Can anyone explain exactly how this regular expression works?

message = message.replaceAll("(?:https?|http?)://[\\w/%.\\-?&=!#]+(?!.*\\[/)",
"$0");

Also, how can I improve it to match the URLs?

For example, how can I make it match a URL like

http://www.google.com.sg/#hl=en&output=search&sclient=psy-ab&q=test&oq=&aq=&aqi=&aql=&gs_sm=&gs_upl=&gs_l=&psj=1&bav=on.2,or.r_gc.r_pw.r_qf.,cf.osb&fp=37e992000c3eb140&biw=1366&bih=638

Or URLs with special characters such as !#&'()*+,-./:;=?@[]_~$
 
lowercase baba

Also, how can I improve it to match the URLs?


What do you mean?

Writing a program is as much about deciding what you want to do as it is about writing the code. Providing an example does not make a spec.

If I said

"I want to write a program that generates a series of numbers. For example, it should print numbers like 1, 2..."

That's really not enough information to write code. Do I mean positive integers? Powers of 2 greater than or equal to 1? The numbers on a clock face?

Why does Winston's suggestion of "https?://" not work? If you want to match the entire string up to a space, change it to something like "https?://[^ ]+"
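As a rough sketch of "match everything up to the next space": a greedy ".+" would run on past spaces to the end of the line, so this version uses \S+ (one or more non-whitespace characters) instead. The class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhitespaceDelimitedUrls {
    // \S+ consumes non-whitespace characters only, so each match
    // ends at the next space, tab, or line break.
    static final Pattern URL = Pattern.compile("https?://\\S+");

    // Collects every whitespace-delimited http/https URL in the text.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }
}
```

This accepts commas and other punctuation inside the URL, but it also swallows trailing punctuation: in "see http://a.com." the final full stop becomes part of the match. Whether that is acceptable depends on the spec.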
 
Ronald Mee
Greenhorn
Hi, sorry if my reply was not very clear.

Basically, I have content that is entered into a textarea, like this post I am writing.

I want the Java regular expression to automatically detect links and convert them to clickable HTML hyperlinks, for example:

|http://naishe.blogspot.com|
|http://tw.com/#!/someTEXTs|
|http://ts123t1.rapi.com/#!download|13321|1313|fairy_tale.mp4|
|http://www.google.com|
|https://www.google.com|
|google.com|
|google.com|
|google.com/test|
|123.com/test|
|ex-ample.com|
|http://ex-ample.com/test-url_chars?param1=val1&par2=val+with%20spaces|

As you can see, a lot of forums are able to do that, but I have been having problems finding a regular expression that works for all of these URLs.

Can anyone advise further?
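For what it is worth, here is a minimal sketch of one way to turn such text into links, covering both URLs with an http/https scheme and bare domains like google.com/test. The pattern, the Linkifier class, and the choice to prepend "http://" to bare domains are all assumptions made for illustration, not a complete URL grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Linkifier {
    // Optional scheme (group 1), dot-separated host labels ending in a label
    // of two or more letters, then an optional path/query/fragment that runs
    // to the next whitespace character.
    static final Pattern LINK = Pattern.compile(
        "(https?://)?(?:[\\w-]+\\.)+[A-Za-z]{2,}(?:[/?#]\\S*)?");

    // Escapes plain text and wraps each detected link in an anchor tag.
    static String linkify(String text) {
        Matcher m = LINK.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(escape(text.substring(last, m.start())));
            String shown = m.group();
            // Assumption: bare domains are treated as http links.
            String href = (m.group(1) == null) ? "http://" + shown : shown;
            out.append("<a href=\"").append(escape(href)).append("\">")
               .append(escape(shown)).append("</a>");
            last = m.end();
        }
        out.append(escape(text.substring(last)));
        return out.toString();
    }

    // Minimal HTML escaping for text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("\"", "&quot;")
                .replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Known gaps in this sketch: it also links things that merely look like domains (file.txt, or the domain part of an email address), it ignores hosts without a dot such as localhost, and it keeps trailing punctuation that follows a path. Deciding which of those matter is part of writing the spec.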

 
Ronald Mee
Greenhorn
Can anyone share some advice?
 
author

Ronald Mee wrote: Can anyone share some advice?



Well, let's take the regex, "(?:https?|http?)://[\\w/%.\\-?&=!#]+(?!.*\\[/)", and take a look at the components, shall we?



(?:https?|http?) -- as already mentioned, this part will also match "htt", which isn't a valid protocol type -- see previous posts.

:// -- matches a colon followed by two slashes

[\\w/%.\\-?&=!#]+ -- matches one or more of the characters in that list. I doubt that this is right, as it does not check whether the URL is well formed, only whether certain characters are used.

(?!.*\\[/) -- a negative lookahead, past the URL (zero or more characters away), for an opening square bracket followed by a forward slash. What is the purpose of this?

Henry
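The effect of the trailing lookahead can be checked with a short sketch. One plausible reading (an assumption; the thread never confirms it) is that it is meant to skip URLs already wrapped in BBCode-style tags such as [url]...[/url]. The class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

class LookaheadDemo {
    // The pattern from the thread, unchanged.
    static final Pattern ORIGINAL =
        Pattern.compile("(?:https?|http?)://[\\w/%.\\-?&=!#]+(?!.*\\[/)");

    // True if the pattern finds a URL anywhere in the given text.
    static boolean findsUrl(String text) {
        return ORIGINAL.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(findsUrl("visit http://a.com today"));      // true
        System.out.println(findsUrl("[url]http://a.com[/url]"));       // false
        System.out.println(findsUrl("http://a.com and later [/b]"));   // false
        System.out.println(findsUrl("http://a.com\n[/b]"));            // true
    }
}
```

Because "." does not match line terminators by default, the lookahead only inspects the rest of the current line. Any "[/" later on the same line suppresses the match entirely, even when it belongs to an unrelated tag, as the third case shows.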
 
Winston Gutkowski
Bartender

Ronald Mee wrote: Can anyone share some advice?


Yes. There is no substitute for research.

Unless you can find a precise definition of URL rules (and it seems to me that here might be a good place to start), you will never be able to create a regex for parsing a URL (or, indeed, verify that someone else's suggestion for one is correct).

As I said before, I don't know enough to comment on anything but the prefix.

Winston
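As one possible way to lean on an existing definition rather than a hand-written regex, a candidate string can be handed to java.net.URI, which parses according to the older RFC 2396 grammar (the current generic URI syntax is RFC 3986). This is only a sketch under that assumption; the class name, the method name, and the rule "must have an http or https scheme and a host" are choices made for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;

class UrlCandidateCheck {
    // True if the candidate parses as a URI with an http/https scheme and a host.
    static boolean isHttpUrl(String candidate) {
        try {
            URI uri = new URI(candidate);
            String scheme = uri.getScheme();
            return uri.getHost() != null
                && ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme));
        } catch (URISyntaxException e) {
            // Illegal characters (a space, for example) end up here.
            return false;
        }
    }
}
```

A loose regex could then be used only to find candidates in the text, with a check like this deciding which candidates actually become links. Note that a bare "google.com" is rejected here because it has no scheme.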
 
Ronald Mee
Greenhorn
Is there a Java regular expression forum where I can post my query for more relevant answers?
 
Henry Wong
author

Ronald Mee wrote: Is there a Java regular expression forum where I can post my query for more relevant answers?



Not sure what you are asking, but (1) did you understand the issues raised in this topic about your regular expression, and have you addressed them? If so, one option is to post your changes, and we can give you more hints. And (2) did you read the link provided by Winston, and do you now have a better understanding of what you want? If so, post those requirements here, and maybe we can give you hints towards what you want.

To be honest, your last five or six posts do not seem to add anything to the conversation. If you want to advance your solution, you have to show some effort; the ranch is NotACodeMill.

Henry
 