Hello, I have a 1 MB file of characters representing part of a DNA sequence. I need to find the longest repeated substring (LRS), where the repeated occurrences are allowed to overlap each other.
Example: BANANAS
1) LRS without overlapping: AN or NA
2) LRS with overlapping: ANA
I need to find the second one.
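To make the two definitions concrete, here is a minimal brute-force sketch in Java. The class name RepeatDemo and the method longestRepeat are made up for illustration; it is only meant for short strings like the example, not for a 1 MB file.

```java
final class RepeatDemo {
    // Returns a longest substring that occurs at least twice in s.
    // If allowOverlap is false, the two occurrences must not share any position.
    // Brute force: tries every length from longest to shortest.
    static String longestRepeat(String s, boolean allowOverlap) {
        int n = s.length();
        for (int len = n - 1; len >= 1; len--) {
            for (int i = 0; i + len <= n; i++) {
                String candidate = s.substring(i, i + len);
                // With overlap the second occurrence may start at i + 1;
                // without overlap it must start at i + len or later.
                int from = allowOverlap ? i + 1 : i + len;
                if (s.indexOf(candidate, from) >= 0) {
                    return candidate;
                }
            }
        }
        return ""; // no substring occurs twice
    }
}
```

For BANANAS, longestRepeat("BANANAS", false) gives AN (the first of the two length-2 answers it meets) and longestRepeat("BANANAS", true) gives ANA.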
I wanted to use a suffix tree, but as far as I understand them, they only let me find the first kind of LRS. Is that right? Could anyone point me in the right direction for finding the second kind?
I found out that a suffix tree actually IS a solution for this problem. Since I have already spent a day trying (unsuccessfully) to construct a suffix tree in Java, I will try to finish it this way.
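In case a full suffix tree turns out to be too painful, here is a sketch of a simpler alternative. It is not a suffix tree but a sorted list of suffix start positions (a naive suffix array). The idea is the same: any repeated substring is a common prefix of two suffixes, and after sorting, the longest such prefix is shared by two neighbouring suffixes. Since the two suffixes only have to start at different positions, overlapping occurrences are found automatically. The class and method names are made up for illustration. Sorting by direct comparison is assumed to be fast enough for typical DNA text, but it can get very slow on highly repetitive input (for example a long run of the same letter).

```java
import java.util.Arrays;

final class LongestRepeatedSubstring {
    // Returns a longest substring occurring at least twice (overlap allowed).
    static String find(String text) {
        final char[] s = text.toCharArray();
        final int n = s.length;

        // Suffixes are represented by their start index to avoid copying text.
        Integer[] suffixes = new Integer[n];
        for (int i = 0; i < n; i++) {
            suffixes[i] = i;
        }
        Arrays.sort(suffixes, (a, b) -> compareSuffixes(s, a, b));

        // The longest repeat is the longest common prefix of two neighbours.
        int bestStart = 0;
        int bestLen = 0;
        for (int k = 1; k < n; k++) {
            int len = commonPrefixLength(s, suffixes[k - 1], suffixes[k]);
            if (len > bestLen) {
                bestLen = len;
                bestStart = suffixes[k];
            }
        }
        return new String(s, bestStart, bestLen);
    }

    // Lexicographic comparison of the suffixes starting at a and b.
    static int compareSuffixes(char[] s, int a, int b) {
        int n = s.length;
        while (a < n && b < n) {
            if (s[a] != s[b]) {
                return s[a] - s[b];
            }
            a++;
            b++;
        }
        // One suffix is a prefix of the other; the shorter one sorts first.
        return b - a;
    }

    // Number of leading characters the suffixes at a and b have in common.
    static int commonPrefixLength(char[] s, int a, int b) {
        int len = 0;
        while (a + len < s.length && b + len < s.length
                && s[a + len] == s[b + len]) {
            len++;
        }
        return len;
    }
}
```

LongestRepeatedSubstring.find("BANANAS") returns ANA, which is the overlapping answer from the example.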