Here's the question
Given a string, return recursively a "cleaned" string where adjacent chars that are the same have been reduced to a single char. So "yyzzza" yields "yza".
stringClean("yyzzza") → "yza"
stringClean("abbbcdd") → "abcd"
stringClean("Hello") → "Helo"
The iterative approach would be this.
The signature that I must stick to is public String stringClean(String str) and have to solve it without using replace() or replaceWith(). How should I approach writing my program now?
Ranajoy Saha wrote:Given a string, return recursively a "cleaned" string where adjacent chars that are the same have been reduced to a single char.
You need to think about more cases. Your iterative solution doesn't look right, while it was a nice try.
Ranajoy Saha wrote:The iterative approach would be this.
Liutauras would be cleaned to Liutars, not quite right, is it? While you actually want to remove same adjacent characters only. In the example I gave you, there were no adjacent same characters, but resulting string got shorter.
You need to rethink your current approach.
It doesn't seem you're returning something.
Ranajoy Saha wrote:Given a string, return ... a "cleaned" string...
However, I'd go with slightly different approach. I'm not saying you should discard yours - no, everyone's brain work differently (that is cool). Just keep in mind, that there could be more approaches to your problem and you need at least to try think about other possible ways of solving your problem, might one or another way can be better than other in some particular aspects or situations.
What you're doing here basically assembling string from scratch, character by character if that character is not the same as previous one. So, actually you're not cleaning string, but rather building from scratch with no adjacent duplicate characters.
The approach I just tried is looking for a duplicated adjacent characters and removing them, which in my opinion is closer to the string cleaning. Having said that, not necessarily my approach is better than yours, might even worse, but it is an alternative approach to yours. Think about it, which approach I could have in mind
That was one of the problems I saw.
Liutauras Vilda wrote:. . . It doesn't seem you're returning something. . . .
I would have used the method of copying single letters to something else myself. But I don't like the + operator or += on Strings in a loop, because of poor performance. There is a much faster way of building Strings dynamically than that.
(edit) Ah but I guess that won't help since you need to use recursion
Zachary Griggs wrote:Another slightly more unusual way you could do it is to create a Set of type character, then add every character from the string into the Set, then combine it back into a String. The point being, Sets do not allow duplicates. Perhaps not best performance-wise though.
Unfortunately, it also doesn't solve the problem, which is to remove consecutive duplicate characters; not all of them.
@Ranajoy: Sets certainly can be a good way to remove duplicates in general, so it might be worth remembeing Zachary's advice for the future.
Ranajoy Saha wrote:But how to reduce f("aab...") = f("ab..."), is my problem. I have one algorithm in mind, as we are using recursion, we need to start from back, so I shall check if someString.lenth()-1 == someString.length()-2 and if so, then f(someString(0,someString.length()-1)).
I think you'll find that that approach is simply the mirror-image of working from the front.
One of the main problems with recursion is that the problem needs to be defined in terms of itself, which is why many people (myself included) find it more difficult than normal logic. And for that reason it's even more important to follow Campbell's advice, and resist the urge to code.
Another tip: Even if your solution is recursive, it can still use methods, or "functions", that are not, And one useful function for solving your problem is to find the first duplicate consecutive character in a String. For example:And if you decide to tackle this problem from the back end, you can just rewrite it to work from the opposite end.
This code handles these cases well.
stringClean("yyzzza") → "yza"
stringClean("abbbcdd") → "abcd"
stringClean("Hello") → "Helo"
stringClean("XXabcYY") → "XabcY"
stringClean("112ab445") → "12ab45"
stringClean("Hello Bookkeeper") → "Helo Bokeper"
But, I'm still curious. I want to write this code without using a temp helper variable and also without using a method that will return me the first index of the duplicate character in the string. Is it possible?
Ranajoy Saha wrote:Solved it without using temp variable or a separate method...
Hmmm. Have you tried Junlu's tests on it?
I think you're on the right lines though, particularly with the "+" operator, because I think the overall problem can be described like this (if working from the "front"):
A "cleaned" String is either:
1. The String itself if it contains no consecutive duplicate characters.
2. The portion up to (but not including) the first duplicate character, plus a "cleaned" version of the suffix from (and including) the last consecutive duplicate.
So for that reason, you might want another "utility" method, viz:
Zachary Griggs Set, then I have not tried it yet because I'm not aware what Sets are. I need to do some research before I use them. I'm curious about your latest utility method, though. Will definitely use it and make another approach at this program. For now, thank you all, Campbell Ritchie, Zachary Griggs, Liutauras Vilda, Piet Souris (Thanks a lot for that cryptic hint), Jeanne Boyarsky, Paweł Baczyński, and last but not the least Winston Gutkowski, for helping me complete my assignment. Learned a lot about recursion, from you guys. Need to halt computer studies for the time being to focus more on Maths, Chemistry and Physics.
Ranajoy Saha wrote:I don't recall what Junlu's test's were. So, what are they?
The ones he posted in your other thread, which appears to be on the same subject (or something similar).
Basically, you want to come up with a set of tests which confirm that your code works in ALL cases.
Piet Souris wrote:these "utility" methods are ugly!
I agree about the second one - there's probably a more elegant solution - but not the first.
This problem is only actually a problem IF the String we're looking at contains "duplicates", so it makes sense to have a method that checks that for you.
And while I understand that your solution is theoretical, it's one of the cases where theory and practise diverge; because it could easily fail (with a StackOverflowError) on a relatively small String.
@Ranajoy: Which is why nobody in their right mind would ever solve this problem with recursion.
Which is not to say it isn't an enjoyable exercise.
Liutauras Vilda wrote:However, I'd go with slightly different approach.
Piet Souris wrote:yes, this is what I had in mind (actually, the slightly easier "front" variant:
And it is basically exactly the same logic what I got in my iterative solution yesterday, rather than assembling character by character. Can't share it as I don't have my machine with me.
Well done Ranajoy Saha.
Paweł Baczyński wrote:Which is why one can assume the problem is clearly theoretical which is why the theoretical solution is prefect as it is clear and short.
Fair enough. I still say it's too "theoretical" because it treats every duplicate (in fact, every character) the same.
That says to me that it's simply a recursive iteration that happens to check for two (and only two) duplicate characters, rather than a recursive algorithm for removing groups of them, which is the problem as I understood it from the original explanation.
As I said earlier, if the String - or remainder - doesn't contain any duplicates, there is no problem, yet Piet's code will recurse until it's exhausted.
Classic case of the "theoretical/real-life" divide; and there is no "right" answer.
The next obvious case as you build up to larger strings is where the length is 2. That means either the first character is a duplicate of the second character or it's a clean string. So, if you followed the test-first approach, you'd write tests in that sequence and implement code in the solution in that sequence.
The next test would be if the input string didn't have a duplicate at the start of the string, then had a duplicate at the next part of the string. That is, use "abb" as your test data.
All these tests would have eventually led you to something that was very similar to what has already been posted by others:
As with the other problem, I also have the condition, s.length() < 2), early on in the solution. This is more straightforward than the check ("".equals(s) || s.length() == 1). The problem with the latter expression is that it's not symmetrical. It would have been more symmetrical if you had written (s.length() == 0 || s.length() == 1) instead in which case it becomes obvious that it can be simplified to what I used, (s.length() < 2), the obvious case where the string clearly cannot have any duplicate characters.
The last two statements are typical of recursive solutions in that it takes the result of a trivial case and combines it with the result of applying the recursive function to the rest of the input.
On line 7, if the first two characters are duplicates, then you simply "drop" the first character and return the cleaned string starting with the second character.
On line 10, you are already guaranteed that the first character and second character are not the same (because that's the complement to the condition on line 7), so you return the first character (which is already guaranteed to be a non-duplicate) and add the result of recursively cleaning up the rest of the string.
Recapping: the usual approach to solving recursively is to start with the simplest, degenerate cases then build up to more general cases. The simple and degenerate cases will give you the conditions on which to stop the recursion. This allows you to build up slowly to the general cases. And again, you should have written unit tests along the way.
Winston Gutkowski wrote:Classic case of the "theoretical/real-life" divide; and there is no "right" answer.
And if anyone's interested, (and since Ranajoy seems to have a solution now), my "programmer's" solution, using my utility methods, is:Still think they're so terrible?
BTW, ("".equals(s) || ...) is a safer way to check than what you wrote. The expression (str.equals("") || ...) will still fail with a NullPointerException if str is null.
Edit: Even if you decided to have a design that returned null for null inputs, the check would be more straightforward if it were written as
Winston Gutkowski wrote:
I don't know about this one. I doubt I'd get to this code after I got the code that lined up with the unit tests. It takes me a little more effort to decipher what this is doing. The other way makes me think of finding a way to use tail recursion. This way makes me think of refactoring for clarity.
Ranajoy Saha wrote:BTW, ("".equals(s) || ...) is a safer way to check than what you wrote. The expression (str.equals("") || ...) will still fail with a NullPointerException if str is null.
I did not understand this part. Why is it so?
If the parameter, str, is null, then str.equals("") will try to invoke the equals() method on a null reference, resulting in a NullPointerException at runtime. On the other hand, "".equals(str) is safe because the empty String literal, "", is a String object and as such you can invoke equals() on it. ("".equals(null)) will always return false. If str is an empty string, then "".equals(str) will be true.