Win a copy of The Java Performance Companion this week in the Performance forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Best way to figure out overlapping of strings

 
Binu K Idicula
Ranch Hand
Posts: 99
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,

I have 2 Strings. Is there a regex way of finding out the overlapping part?


String str1 = "AB-CD";
String str2 = "CH-EF"

Now C is the overlapping part. I did it with tokenizer and with a custom algorithm, going one by one. But Is there a simple way out?

regards
Binu K Idicula
 
Eric Pascarello
author
Rancher
Posts: 15385
6
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Moving to another forum since this really more of a question that a Diversion. The Click Here link will take you to its new home.

Eric
 
Campbell Ritchie
Sheriff
Pie
Posts: 49447
62
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
That isn't really an overlap; if you had ABCD and CDEF then the overlap would be CD.

You can try putting one String into a regular expression inside [] and the seeing whether you can match the other. But then you would get ABCD matching FEDC on the DC. In the case you mention you would also have problems with the - because it is a special character.

There is lots and lots of information about regular expressions; the Java Tutorials is a good place to start.
 
Garrett Rowe
Ranch Hand
Posts: 1296
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Another important question would be what is considered overlap? If you have:

"abcde" and "cdfg" does the "cd" overlap?
"abcde" and "fbcg" does the "bc" overlap?
 
Bill Shirley
Ranch Hand
Posts: 457
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
are you asking for "greatest common substring" rather than an "overlap"?
 
Binu K Idicula
Ranch Hand
Posts: 99
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
"abcde" and "cdfg" does the "cd" overlap? - YES
"abcde" and "fbcg" does the "bc" overlap? - YES

are you asking for "greatest common substring" rather than an "overlap"? - YES, precisely.

Any thoughts?
 
Piet Verdriet
Ranch Hand
Posts: 266
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Wikipedia has a nice article about this:
http://en.wikipedia.org/wiki/Longest_common_substring_problem
 
Garrett Rowe
Ranch Hand
Posts: 1296
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Here's a link to the Wikipedia article on the subject. It contains a pseudocode algorithm that you might be able to adapt. It's a starting point if nothing else.


edit: 2 minutes too slow!
[ December 03, 2008: Message edited by: Garrett Rowe ]
 
Campbell Ritchie
Sheriff
Pie
Posts: 49447
62
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Originally posted by Garrett Rowe:
edit: 2 minutes too slow!


That happens to me all the time whenever Rob Prime is online!
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic