Win a copy of Murach's MySQL this week in the JDBC and Relational Databases forum!
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
• Campbell Ritchie
• Jeanne Boyarsky
• Ron McLeod
• Paul Clapham
• Liutauras Vilda
Sheriffs:
• paul wheaton
• Rob Spoor
• Devaka Cooray
Saloon Keepers:
• Stephan van Hulst
• Tim Holloway
• Carey Brown
• Frits Walraven
• Tim Moores
Bartenders:
• Mikalai Zaikin

# Need help to remove duplicate from a string using recursion

Bartender
Posts: 10780
71
• Number of slices to send:
Optional 'thank-you' note:

Junilu Lacar wrote:I don't know about this one. I doubt I'd get to this code after I got the code that lined up with the unit tests. It takes me a little more effort to decipher what this is doing. The other way makes me think of finding a way to use tail recursion. This way makes me think of refactoring for clarity.

OK, well given that it's a somewhat artificial problem, my approach is slightly different. Since one of the requirements is that the solution is recursive, and recursion doesn't come naturally to me, I prefer to avoid it if I can; so I tend to look for tasks that will work for either a normal or recursive solution, and two spring to mind:
1. Find a duplicate - if this function fails then the String is fine as it is.
2. Having found a duplicate, find the end of its "group".

Between the two of them, they allow you to remove all duplicates except the last (or first) of a group, leaving the pesudo-code:
clear(str) = hasNoDuplicates(str) ? str
: beforeDuplicates(str) + {first/last}Duplicate + clear( afterDuplicates(str) )

which is (basically) what my code does.

Don't know if it makes things any clearer, but it's the way my mind works.

Winston

Marshal
Posts: 8835
631
• 1
• Number of slices to send:
Optional 'thank-you' note:

Ruslan Salimych wrote:I've solved it recursively this way

That is partially recursive solution, one loop still is used.

However, that slightly violates the form recursion is being taught in general.

A recursive function normally has two parts:
1. Stopping condition. Processing the case or cases where the recursive function does not call itself.
2. Recursive step. Processing the case or cases where the recursive function calls itself.

Yours solution does not mimic that scenario. I'm not saying your solution is wrong, but worth to clarify such cases with instructor to be sure it is what expected.

Sheriff
Posts: 17644
300
• Number of slices to send:
Optional 'thank-you' note:

Winston Gutkowski wrote:
Don't know if it makes things any clearer, but it's the way my mind works.

It's clears up how your mind works ;)

It's always good to see different ways people think. Makes things more interesting. Besides, you tend to learn a lot more from seeing different points of view and different solutions to the same problem.

Greenhorn
Posts: 25
• Number of slices to send:
Optional 'thank-you' note:

Liutauras Vilda wrote:

Ruslan Salimych wrote:I've solved it recursively this way

That is partially recursive solution, one loop still is used.

Sorry, but what loop do you mean? "for" loop?

Junilu Lacar
Sheriff
Posts: 17644
300
• Number of slices to send:
Optional 'thank-you' note:
BTW, I refactored to a tail-recursive version:

Liutauras Vilda
Marshal
Posts: 8835
631
• Number of slices to send:
Optional 'thank-you' note:

Ruslan Salimych wrote:Sorry, but what loop do you mean? "for" loop?

You got only one there, so it must be "for"

Being said, that recursive solution can simplify and make solution more elegant, while it is probably true, there is an other medal side - not that easy to get confident with recursive solutions, many people struggle with it, when I had to solve my first problem in recursive way there were a task to rewrite few most popular sorting algorithms in recursive way without using loops at all, I was struggling for a while, but after I got it solved and I really liked it that way.
Anyway, mixing iterative way and recursive way it just adds extra difficulty and confusion to solution (at least for me). Either you think and write in recursive way or iterative, but thinking half/half doesn't give any profit (in terms of code elegance or possible dangerous mentioned by Winston earlier in this thread).

In Java probably you always stick with iterative solutions implementing algorithms. BUT it is not something to avoid completely. There are recursive data structures which you might use everyday.

Winston Gutkowski
Bartender
Posts: 10780
71
• Number of slices to send:
Optional 'thank-you' note:

Junilu Lacar wrote:BTW, I refactored to a tail-recursive version:

I rather like that one. And if 'acc' were a StringBuilder instead of a String, I suspect it would be quite quick too.

Winston

Junilu Lacar
Sheriff
Posts: 17644
300
• Number of slices to send:
Optional 'thank-you' note:
The name acc stands for accumulator, and yes, I suppose it could be a StringBuilder but that makes it mutable which is not really in line with the more functional nature it's supposed to have although I'm not even sure if the Java compiler makes optimizations for tail recursion or not. Will have to look that one up.

I guess it doesn't: http://www.drdobbs.com/jvm/tail-call-optimization-and-java/240167044

Greenhorn
Posts: 12
1
• Number of slices to send:
Optional 'thank-you' note:
Something like this:

Junilu Lacar
Sheriff
Posts: 17644
300
• Number of slices to send:
Optional 'thank-you' note:
@Chandler, I haven't tested it but I guess that works. It's not how a typical recursive solution is patterned though. Your solution keeps the original string intact and just keeps passing it along on the stack. The second parameter is an index that gets incremented with each recursive call and the third parameter c looks like it's something that you could actually calculate based on the second parameter. Bottom line: it works but it could be tighter.

Recursive algorithms usually break down the original input progressively into smaller pieces until it gets to some trivial or degenerate case at which point the recursion ends and "unwinds". See the solution and the other examples for tail call recursion in the Dr Dobb's article that I cited in my last two replies.

Chandler Deng
Greenhorn
Posts: 12
1
• Number of slices to send:
Optional 'thank-you' note:

Junilu Lacar wrote:@Chandler, I haven't tested it but I guess that works. It's not how a typical recursive solution is patterned though. Your solution keeps the original string intact and just keeps passing it along on the stack. The second parameter is an index that gets incremented with each recursive call and the third parameter c looks like it's something that you could actually calculate based on the second parameter. Bottom line: it works but it could be tighter.

Recursive algorithms usually break down the original input progressively into smaller pieces until it gets to some trivial or degenerate case at which point the recursion ends and "unwinds". See the solution and the other examples for tail call recursion in the Dr Dobb's article that I cited in my last two replies.

Ya I agree, can just use s.charAt(p - 1) to get the previous character. The reason that I didn't break the String is simply because of the efficiency (substring() is an O(n) operation).

Junilu Lacar
Sheriff
Posts: 17644
300
• Number of slices to send:
Optional 'thank-you' note:

Chandler Deng wrote:Ya I agree, can just use s.charAt(p - 1) to get the previous character. The reason that I didn't break the String is simply because of the efficiency (substring() is an O(n) operation).

What makes you think it's O(n)? If you study the implementation at http://www.docjar.com/html/api/java/lang/String.java.html carefully, you'll see that substring() is not O(n).

This is why it's this quote is often cited: "Premature optimization is the root of all evil" --Donald Knuth

Optimization is not something you should be guessing at nor should you be thinking about it too much until you have written clear, clean code.

Edit: I stand corrected. I guess you did base your optimization on known fact. Thanks and thanks to Pawel, too.

Bartender
Posts: 2236
63
• Number of slices to send:
Optional 'thank-you' note:
This implementation has changed in JDK 1.7.0_06.
Now it is O(n).

Chandler Deng
Greenhorn
Posts: 12
1
• Number of slices to send:
Optional 'thank-you' note:

Junilu Lacar wrote:

Chandler Deng wrote:Ya I agree, can just use s.charAt(p - 1) to get the previous character. The reason that I didn't break the String is simply because of the efficiency (substring() is an O(n) operation).

What makes you think it's O(n)? If you study the implementation at http://www.docjar.com/html/api/java/lang/String.java.html carefully, you'll see that substring() is not O(n).

This is why it's this quote is often cited: "Premature optimization is the root of all evil" --Donald Knuth

Optimization is not something you should be guessing at nor should you be thinking about it too much until you have written clear, clean code.

First, this String implementation is under package java.lang while the String implementation we usually use is the java.util one. Second, even though in the java.lang.String implementation, it's still an O(n) operation -- substring actually initialize a new string and call Arrays.copyOfRange() (this is an O(n) operation) in the initialization.

Clearly you don't need to break the string in each stack so why do it when it's not free?

Edit: I think you are right, I was looking in the wrong constructor. In that implementation substring is an constant time operation since it only change the start and end pointer but share the same char array.

Winston Gutkowski
Bartender
Posts: 10780
71
• Number of slices to send:
Optional 'thank-you' note:

Chandler Deng wrote:The reason that I didn't break the String is simply because of the efficiency (substring() is an O(n) operation)...

Erm ... actually, it isn't. Or rather, it needn't be if String itself contains a start and end index - ie, all Strings are actually "substrings".

TBH, I have no idea how String is implemented these days; but I seem to remember it contained something like it in the past.

Just one reason for not optimizing too soon.

 Dang! 32 minutes too late.

Winston

Chandler Deng
Greenhorn
Posts: 12
1
• Number of slices to send:
Optional 'thank-you' note:

Winston Gutkowski wrote:

Chandler Deng wrote:The reason that I didn't break the String is simply because of the efficiency (substring() is an O(n) operation)...

Erm ... actually, it isn't. Or rather, it needn't be if String itself contains a start and end index - ie, all Strings are actually "substrings".

TBH, I have no idea how String is implemented these days; but I seem to remember it contained something like it in the past.

Just one reason for not optimizing too soon.

 Dang! 32 minutes too late.

Winston

Aha, in contrast, I have no idea how String implemented at past, I begin to learn Java when it's Java 6 and at the time I paid attention to the implementation of those libraries it's already Java 7. Still, it's super cool to know the history and thanks for the information .

Junilu Lacar
Sheriff
Posts: 17644
300
• Number of slices to send:
Optional 'thank-you' note:

I haven't tested this yet but it incorporates the ideas that Winston and Chandler gave. It's based on the previous tail-recursive solution I gave only this doesn't use the String.substring() method. I suppose you could also use a char[] for the first parameter to the recursive method.

Bartender
Posts: 5464
212
• 1
• Number of slices to send:
Optional 'thank-you' note:
The solutions are getting uglier and uglier

So here's my attempt:

Paweł Baczyński
Bartender
Posts: 2236
63
• Number of slices to send:
Optional 'thank-you' note:

Piet Souris wrote:So here's my attempt

The method will fail for some inputs:Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 3
[\]+

Piet Souris
Bartender
Posts: 5464
212
• Number of slices to send:
Optional 'thank-you' note:

It will be repaired in version 2.0

Paweł Baczyński
Bartender
Posts: 2236
63
• Number of slices to send:
Optional 'thank-you' note:
Streams way:

Winston Gutkowski
Bartender
Posts: 10780
71
• Number of slices to send:
Optional 'thank-you' note:

Piet Souris wrote:The solutions are getting uglier and uglier
So here's my attempt:

Ie, my solution with regexes.

Gave you a tick anyway.

Winston

Winston Gutkowski
Bartender
Posts: 10780
71
• 1
• Number of slices to send:
Optional 'thank-you' note:

Paweł Baczyński wrote:Streams way

Erm, doesn't distinct() remove ALL duplicate characters? We're only trying to remove consecutive duplicates (actually, all but one).

Winston

Paweł Baczyński
Bartender
Posts: 2236
63
• Number of slices to send:
Optional 'thank-you' note:

Winston Gutkowski wrote: Erm, doesn't distinct() remove ALL duplicate characters? We're only trying to remove consecutive duplicates (actually, all but one).

Yes, it does. I feel stupid

Piet Souris
Bartender
Posts: 5464
212
• Number of slices to send:
Optional 'thank-you' note:

Winston Gutkowski wrote:

Piet Souris wrote:The solutions are getting uglier and uglier
So here's my attempt:

Ie, my solution with regexes.

Gave you a tick anyway.

Winston

hi Winston,

I honestly edited my reply so that it contained ("Winstons solution") but that somehow did not make it into the reply. Sorry! But it was indeed your solution that reminded me of a regex!

Piet Souris
Bartender
Posts: 5464
212
• 1
• Number of slices to send:
Optional 'thank-you' note:

Paweł Baczyński wrote:

Winston Gutkowski wrote: Erm, doesn't distinct() remove ALL duplicate characters? We're only trying to remove consecutive duplicates (actually, all but one).

Yes, it does. I feel stupid

Put it on the buglist, for version 2.0

Winston Gutkowski
Bartender
Posts: 10780
71
• Number of slices to send:
Optional 'thank-you' note:

Paweł Baczyński wrote:Streams way:

I'm pretty sure that:will create an IntStream of non-repeating "characters"; but how best to reconstitute that back to a String, I'm not sure.

Also not sure if it needs a sequential() in there or not. I suspect it might.

Hopefully Stephen is somewhere around...

Winston

Winston Gutkowski
Bartender
Posts: 10780
71
• Number of slices to send:
Optional 'thank-you' note:

Piet Souris wrote:I honestly edited my reply so that it contained ("Winstons solution") but that somehow did not make it into the reply. Sorry! But it was indeed your solution that reminded me of a regex!

I believe you; thousands wouldn't.

Winston

Paweł Baczyński
Bartender
Posts: 2236
63
• Number of slices to send:
Optional 'thank-you' note:

Winston Gutkowski wrote:will create an IntStream of non-repeating "characters"; but how best to reconstitute that back to a String, I'm not sure.

The IntStream returned is already sequential so no need to make this explicit.

Also, you'll get an error with prev variable. A variable used in lambda must be final or effectively final.

This will work:Or this:
The array looks ugly. But it works.

Teamwork!

EDIT:
This collect looks better as it uses StringBuilder:

EDIT 2:
Even better:Wow! This is evolving!

Winston Gutkowski
Bartender
Posts: 10780
71
• Number of slices to send:
Optional 'thank-you' note:

Paweł Baczyński wrote:Also, you'll get an error with prev variable. A variable used in lambda must be final or effectively final.

Ahh. Cheers for that. Makes perfect sense.

The String "re-creation" however seems quite convoluted and potentially very expensive, since it appears to create a new String object for each character and then join them together. Is there not a more generic way of doing it?

I have to admit, THIS is one of the things I don't like about version 8: it gives you tons of ways to generate streams, but far fewer for turning them back into regular POJOs - and the ones they do seem quite "clunky".

Strings are about as basic as it gets, so it stands to reason that if you create an IntStream from one, you might well want to convert it back, so why not just supply a toString() or asString() method?

Winston

Winston Gutkowski
Bartender
Posts: 10780
71
• Number of slices to send:
Optional 'thank-you' note:

Paweł Baczyński wrote:EDIT 2:
Even better:Wow! This is evolving!

Yeah, that looks much more like what I was thinking about.

I still find that collect() bit an eyesore though, but there's probably not much you can do.

Well done!

Winston

Greenhorn
Posts: 1
• 1
• Number of slices to send:
Optional 'thank-you' note:
I found this solution:

Edit: simplified condional statement in recursiveStringClean()

Paweł Baczyński
Bartender
Posts: 2236
63
• Number of slices to send:
Optional 'thank-you' note:

Winston Gutkowski wrote:Strings are about as basic as it gets, so it stands to reason that if you create an IntStream from one, you might well want to convert it back, so why not just supply a toString() or asString() method?

What would you have IntStream.of(65, 98).asString() return? 6598? Ab?
String#chars returns IntStream but those can be used for thing unrelated to strings at all.

Maybe CharacterStream (or CharStream) wouldn't be a bad idea?

Winston Gutkowski
Bartender
Posts: 10780
71
• 1
• Number of slices to send:
Optional 'thank-you' note:

Winston Gutkowski wrote:Yeah, that looks much more like what I was thinking about....

Can you do:?

That would ensure that the StringBuilder has sufficient capacity at the start. but I suspect it's too "procedural" for functional bods.

Winston

Paweł Baczyński
Bartender
Posts: 2236
63
• Number of slices to send:
Optional 'thank-you' note:
This won't workIf you want to pass a parameter to the constructor you need to use

Winston Gutkowski
Bartender
Posts: 10780
71
• 1
• Number of slices to send:
Optional 'thank-you' note:

Paweł Baczyński wrote:What would you have IntStream.of(65, 98).asString() return? 6598? Ab?

The latter. And what it returns is a simple documentation issue.

String#chars returns IntStream but those can be used for thing unrelated to strings at all.

Not an IntStream returned by String.chars(), it can't.

Maybe CharacterStream (or CharStream) wouldn't be a bad idea?

Precisely. But now that chars() returns an IntStream, it can't be changed.

Too much too fast, IMO. They didn't think that one through - and now we're stuck with it for all eternity.

Winston

Winston Gutkowski
Bartender
Posts: 10780
71
• Number of slices to send:
Optional 'thank-you' note:

Paweł Baczyński wrote:If you want to pass a parameter to the constructor you need to use

Ah! Sweet. Have another cow.

Winston

Winston Gutkowski
Bartender
Posts: 10780
71
• 1
• Number of slices to send:
Optional 'thank-you' note:

Steffen Bauer wrote:I found this solution:]

Hi Steffen, and welcome to JavaRanch.

And thanks for getting us back on track.

Winston

Marshal
Posts: 79019
375
• Number of slices to send:
Optional 'thank-you' note:
Can you have a sensible result from toString() on a Stream? A Stream is an Object, so you can use the un‑overridden version of toString, but how are you going to print its contents? A Stream does not “have” any contents; it implements lazy execution and will not contain any contents until they are required “downstream”.
I agree that a CharStream might be useful; I suspect the three kinds of primitive Stream were chosen because longs, ints and doubles are the primitive types used for arithmetic. Maybe they thought that a CharStream would be used rarely. You can probably create a Stream<Character> from an IntStream like this:-I am not convinced that mapToObj is the best name for that method; maybe mapToObject would have been better.
You can't readily get a character Stream from the String#toCharArray method or anything like that. In fact once I started trying out Streams I started cursing the existence of arrays of primitives. Even Arrays#stream() cannot create a character Stream from a char[]. You cannot pass that as a T[] of any sort, nor an Object[]; it would at best be considered by a varargs method as a single Object and that would produce a 1‑element array.

If you get your hands on a Stream<Character>, you can call its distinct method, but that will not convert xxyyxx to xyx. It will convert xxyyxx to xy. You can try a side‑effect with peek() and filter with i -> i != iOld, but side‑effects seem like cheating to me.

Campbell Ritchie
Marshal
Posts: 79019
375
• Number of slices to send:
Optional 'thank-you' note:
I seem to have come late to this discussion and have repeated a lot of what has been said before.

 With a little knowledge, a cast iron skillet is non-stick and lasts a lifetime.