• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Jeanne Boyarsky
  • Ron McLeod
  • Paul Clapham
  • Liutauras Vilda
Sheriffs:
  • paul wheaton
  • Rob Spoor
  • Devaka Cooray
Saloon Keepers:
  • Stephan van Hulst
  • Tim Holloway
  • Carey Brown
  • Frits Walraven
  • Tim Moores
Bartenders:
  • Mikalai Zaikin

longest repeated substring - overlaping allowed

 
Greenhorn
Posts: 7
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hello,
I have 1MB file with characters representing part of DNA. I need to find longest repeated substring (LRS) but these substrings can be over each other.

Example: BANANAS
1) LRS without overlapping: AN or NA
2) LRS with overlapping: ANA

I need to find the second one.

I wanted to use suffix tree. But if I understood at least a little to them, they allow me to find only the first mentioned LRS. Aren't they? Could anyone give me some direction, how to find the second one?
 
Greenhorn
Posts: 1
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Try using the KMP algorithm from 0 to len/2 for each substring. There might be a better way. I dunno.


Link to the algorithm's wikipedia entry: KMP Algorithm

Cheers,
[ December 05, 2008: Message edited by: Eduardo Burgos ]
 
Marshal
Posts: 79180
377
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
That looks like a useful link, Eduardo.

And welcome to JavaRanch, both of you
 
Antonin Danek
Greenhorn
Posts: 7
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Thank you, maybe I will try it later.

I found out, that Suffix tree actually IS solution for this problem so because I already spent a day trying to construct Suffix tree in Java (unsuccessfully) I will try to finish it this way.
 
reply
    Bookmark Topic Watch Topic
  • New Topic