
Regular Expression help

 
Ranch Hand
Posts: 331
I have a string that may contain a combination of URLs and other text. Every URL begins with http://, and when the string contains multiple items they are separated by a blank. For example:

String re = "test http://x.ca/abc/abcsurvey.studytaker.login this is test " +
"http://a.ca/abc/abcsurvey.adminteamweb.login my name is michael " +
"http://b.ca/abc/abcsurvey.localadmin.login ";

I would like to obtain the pieces in this order:
1/test
2/http://x.ca/abc/abcsurvey.studytaker.login
3/this is test
4/http://a.ca/abc/abcsurvey.adminteamweb.login
5/my name is michael
6/http://b.ca/abc/abcsurvey.localadmin.login

Pattern pattern = Pattern.compile("http://[^ ]+|.*? (?=http)");
Matcher matcher = pattern.matcher(re);
while (matcher.find()) {
    System.out.println(matcher.group());
}

This works for some cases; however, I cannot get it to work for

String a = "test";

or

String b = "http://b.ca test";

or

String c = "test http://c.ca test";

Thank you all for your time. Sorry if I didn't post this in the right forum.
 
Greenhorn
Posts: 3
Hmmm, maybe that's too slick for a simple regex. If you're feeling academic, check out JavaCC. You'll have your parser going in an hour.

If you must do it yourself, implement your own recursive left-to-right parser.

A = "^http://[^ ]+"
B = !A   (any text up to the next A)
while rest of string != ""
    find A | B at the start of rest
    chop the match off the front of rest
    iterate

If the entries have to be of type AB(AB)*, use semaphores accordingly.
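
The loop above could be sketched in Java roughly as follows. This is only one possible reading of the pseudocode, not a tested solution from this thread: the class name UrlScanner and the method scan are made up for illustration, and the sketch assumes that a URL ends at the next blank (or at the end of the string) and that surrounding blanks should be trimmed from each piece.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: walks the string left to right, chopping off
// either a URL (A) or a run of non-URL text (B) on each iteration.
final class UrlScanner {

    static List<String> scan(String input) {
        List<String> pieces = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int end;
            if (input.startsWith("http://", pos)) {
                // A: the URL runs up to the next blank
                end = input.indexOf(' ', pos);
            } else {
                // B: plain text runs up to the start of the next URL
                end = input.indexOf("http://", pos);
            }
            if (end < 0) {
                end = input.length(); // nothing further found: take the rest
            }
            String piece = input.substring(pos, end).trim();
            if (!piece.isEmpty()) {
                pieces.add(piece);
            }
            pos = end; // always moves forward, so the loop terminates
        }
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(scan("test http://c.ca test"));
    }
}
```

With the input shown in main, this prints [test, http://c.ca, test].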
 
Greenhorn
Posts: 5
I think you should try using one pattern first; once you obtain a partial result, use another pattern to catch the other strings.
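
One way to read this suggestion, sketched under my own assumptions: match only the URLs with a single pattern, and treat whatever lies between two matches as the "other strings". The class name GapSplitter and its methods are hypothetical, and the sketch assumes a URL is "http://" followed by non-whitespace characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds URLs with a pattern and collects the
// text in the gaps between matches, preserving the original order.
final class GapSplitter {

    // Assumption: a URL never contains whitespace.
    private static final Pattern URL = Pattern.compile("http://\\S+");

    static List<String> split(String input) {
        List<String> pieces = new ArrayList<>();
        Matcher m = URL.matcher(input);
        int lastEnd = 0;
        while (m.find()) {
            // text between the previous URL (or the start) and this URL
            addIfNotBlank(pieces, input.substring(lastEnd, m.start()));
            pieces.add(m.group());
            lastEnd = m.end();
        }
        // text after the last URL, or the whole string if there was no URL
        addIfNotBlank(pieces, input.substring(lastEnd));
        return pieces;
    }

    private static void addIfNotBlank(List<String> pieces, String text) {
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {
            pieces.add(trimmed);
        }
    }

    public static void main(String[] args) {
        System.out.println(split("http://b.ca test"));
    }
}
```

Using Matcher.start() and Matcher.end() keeps the pieces in their original order, which splitting the string on the URL pattern alone would not give you.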
 
author
Posts: 23907

I think that you need a third case: a minimal match of at least one character, bounded by the end of the line.

Henry
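
A minimal sketch of what such a third case might look like when added to the original pattern. The exact regex is my assumption, not something posted in this thread: it uses \S and \s instead of a literal blank, looks ahead for "http://" rather than just "http", and trims each match. The class name ThreeCaseSplitter is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing a pattern with three alternatives.
final class ThreeCaseSplitter {

    private static final Pattern PIECE = Pattern.compile(
            "http://\\S+"            // case 1: a URL
          + "|.*?\\s(?=http://)"     // case 2: text up to the blank before a URL
          + "|.+$");                 // case 3: at least one character up to the end

    static List<String> split(String input) {
        List<String> pieces = new ArrayList<>();
        Matcher m = PIECE.matcher(input);
        while (m.find()) {
            String piece = m.group().trim();
            if (!piece.isEmpty()) { // skip matches that were only blanks
                pieces.add(piece);
            }
        }
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(split("test"));
        System.out.println(split("http://b.ca test"));
        System.out.println(split("test http://c.ca test"));
    }
}
```

For the three problem inputs this prints [test], [http://b.ca, test] and [test, http://c.ca, test]. Note that "." does not match line terminators by default, so the sketch assumes single-line input.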
 