Win a copy of liveProject: Protecting User Data with Spring Security and OAuth2 this week in the Spring forum!
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
• Campbell Ritchie
• Paul Clapham
• Ron McLeod
• paul wheaton
• Devaka Cooray
Sheriffs:
• Jeanne Boyarsky
• Tim Cooke
• Liutauras Vilda
Saloon Keepers:
• Tim Moores
• Tim Holloway
• Stephan van Hulst
• Carey Brown
• Piet Souris
Bartenders:
• salvin francis
• Mikalai Zaikin
• Himai Minh

# Find longest substring without repeating characters: logic fails in some cases

Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:
Solving another leetcode problem:

Given a string, find the length of the longest substring without repeating characters.

Example 1:

Input: "abcabcbb"
Output: 3
Explanation: The answer is "abc", with the length of 3.
Example 2:

Input: "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.
Example 3:

Input: "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3.
Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

Following is my code:

The logic fails when the string is "pwwkew". Looks like the issue is at line# 32. I reset the value of i to 0, but it gets increased by 1 before the next iteration because of 'i++'. Any suggestions?

Marshal
Posts: 72595
317
• Number of slices to send:
Optional 'thank-you' note:
Please start by explaining the logic of the method. Remember that maps can usually not distinguish repeats and the order they occur in. Would you want a map in the first place?
Why did you say something about arrays in your documentation comment when the parameter is typed String?

Campbell Ritchie
Marshal
Posts: 72595
317
• Number of slices to send:
Optional 'thank-you' note:
Have a possible solution, but I am keeping it hidden.

My JShell wrote:SubstringFinder.main( "abcabcbb",  "bbbbb", "pwwkew");
Longest substring = “abc”; length =    3
Longest substring = “b”; length =      1
Longest substring = “wke”; length =    3

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:

Campbell Ritchie wrote:Please start by explaining the logic of the method.

The way I approached this is - iterate through each character of the string. Keep appending the character to the map as map keys. As soon as I reach to a character that's already added to the map, stop and get the length of the map and keep it into a temp variable.

Them remove the already visited characters fro the string and start the process again. If the length (before a repeating character arrives again) is greater than the earlier temp variable length, then replace the temp variable with this length.

Campbell Ritchie wrote:Why did you say something about arrays in your documentation comment when the parameter is typed String?

That's my mistake. Should be a String.

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:

Campbell Ritchie wrote:Have a possible solution, but I am keeping it hidden.

My JShell wrote:SubstringFinder.main( "abcabcbb",  "bbbbb", "pwwkew");
Longest substring = “abc”; length =    3
Longest substring = “b”; length =      1
Longest substring = “wke”; length =    3

Yup, want to improve my logic building.

Campbell Ritchie
Marshal
Posts: 72595
317
• Number of slices to send:
Optional 'thank-you' note:

Punit Jain wrote:. . . iterate through each character of the string.

So far, so good

Keep appending the character to the map as map keys.

Why did you say “map”? Anyway, that is an implementation detail, which shouldn't be included in the logic.

As soon as I reach to a character that's already added to the map . . .

You mean when you encounter a character already recorded.

a temp variable.

Another implementation detail.

. . . remove the already visited characters . . .

Please explain what you mean about removal? Why don't you simply start from the next character? What does line 21 do?

If the length . . . is greater than the earlier . . . length . . ..

That sounds all right.

I think you are shortening the text from its beginning wrongly.

Sheriff
Posts: 7959
560
• Number of slices to send:
Optional 'thank-you' note:

Punit Jain wrote:Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

Apart from somewhere incorrect logic what Campbell already mentioned to you, your solution doesn't seem to adhere to the requirements you posted. I've bolded that part.

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:

Campbell Ritchie wrote:

Punit Jain wrote:Keep appending the character to the map as map keys.

Why did you say “map”? Anyway, that is an implementation detail, which shouldn't be included in the logic.

I thought to use a map because the keys in map should be unique. When the map encounters a repeating key, it replaces the value of the same existing key. Based on the problem, this way I will know that I need to break/pause once I encounter the same key and store the length to a variable. Another reason I chose a map so that I get o(n).

Campbell Ritchie wrote:

Punit Jain wrote:As soon as I reach to a character that's already added to the map . . .

You mean when you encounter a character already recorded.

Yes, thanks for the correction.

Campbell Ritchie wrote:

Punit Jain wrote:a temp variable.

Another implementation detail.

You mean, I am mixing up logic with the implementation details? I should first build the logic without selecting the specific data structures and then the implementation details (i.e. selecting the data structures)?

Campbell Ritchie wrote:

Punit Jain wrote:. . . remove the already visited characters . . .

Please explain what you mean about removal? Why don't you simply start from the next character? What does line 21 do?

By removal - I meant shortening the string by taking out the already visited string from the original string and start again from index 0. Yes, I can also start with the next character and that would be better as I save the call to substring(). I assume you meant line# 31 then it removes (substring) the already visited string from the actual string and then the next line is to reset the index to 0. Like the following:

Campbell Ritchie wrote:I think you are shortening the text from its beginning wrongly.

Did you mean this? s = s.substring(currentSize, s.length() - 1);

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:

Liutauras Vilda wrote:

Punit Jain wrote:Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

Apart from somewhere incorrect logic what Campbell already mentioned to you, your solution doesn't seem to adhere to the requirements you posted. I've bolded that part.

As per the first line "Given a string, find the length of the longest substring without repeating characters." Also, the return type in the method signature on leetcode is int. So I think that they are expecting the length as I can't change the method signatures.

By this -  "pwke" is a subsequence and not a substring - I think they meant that it should be the length of substring and not a subsequence. But yeah, they should have added the length word in the last line to avoid confusion.

Liutauras Vilda
Sheriff
Posts: 7959
560
• Number of slices to send:
Optional 'thank-you' note:

Punit Jain wrote:

Campbell Ritchie wrote:I think you are shortening the text from its beginning wrongly.

Did you mean this? s = s.substring(currentSize, s.length() - 1);

What Campbell is saying, that all you want is to start from further character every other iteration. i.e.:

1st iteration
pwwkew

2nd iteration
pwwkew

3rd iteration
pwwkew

...

Is that what your string shortening achieves?

expected:
pwwkew
wwkew
wkew
kew
...

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:

Liutauras Vilda wrote:
Is that what your string shortening achieves?

expected:
pwwkew
wwkew
wkew
kew
...

Got it. Here is the updated code:

Campbell Ritchie
Marshal
Posts: 72595
317
• Number of slices to send:
Optional 'thank-you' note:

Punit Jain wrote:. . . the keys in map should be unique. . . . .

But what about the values? What are you doing with the values? Hint: nothing. So why are you using a map? Why don't you use other data structures/types?

Another reason I chose a map so that I get o(n).

You mean O(1) time complexity? There are other data structures giving O(1) time complexity too. And some maps run in O(logn) time complexity. That is for the put() method, at least.

. . . You mean, I am mixing up logic with the implementation details? I should first build the logic without selecting the specific data structures and then the implementation details . . .

. . . I meant shortening the string by taking out the already visited string from the original string and start again from index 0. . . .

That is what I thought you were doing wrong, but you aren't even doing that. You are doing something different, which is still wrong, and which Liutauras has told you more about already.

My JShell wrote:jshell> LongestSubstringWithoutRepeating.main("abcabcbb",  "bbbbb", "pwwkew");
3
1
3

My JShell wrote:jshell> LongestSubstringWithoutRepeating.main("abcabcbb",  "bbbbb", "pwwkew", "Campbell Ritchie");
3
1
3
12

That ain't right, I am afraid. "Campbell Ritchie" shouldn't go more than 7 without finding the duplicated “l”.

Now you know what I meant about logic and implementation details, I suggest you go back to the drawing board and get it right. In one stage, not like this (longer and maybe less witty version here). Try some Rubber Duck Programming and explain to the rubber duck how that loop works. I recommend several pints beforehand

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:

Campbell Ritchie wrote:But what about the values? What are you doing with the values? Hint: nothing. So why are you using a map? Why don't you use other data structures/types?

Yes, I should use a different data structure that takes unique values.

Campbell Ritchie wrote:You mean O(1) time complexity? There are other data structures giving O(1) time complexity too. And some maps run in O(logn) time complexity. That is for the put() method, at least.

Actually, I meant that I could do it in just one loop.

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:

Campbell Ritchie wrote:But what about the values? What are you doing with the values? Hint: nothing. So why are you using a map? Why don't you use other data structures/types?

Yes, I should use a different data structure that takes unique values.

Campbell Ritchie wrote:You mean O(1) time complexity? There are other data structures giving O(1) time complexity too. And some maps run in O(logn) time complexity. That is for the put() method, at least.

Actually, I meant that I could do it in just one loop.

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:
So here is how I am doing it now:

Approach

1. Iterate through the characters in the string.
2. Keep adding the characters to a data structure that accepts unique values.
3. Store the current length of the data structure in memory.
4. Start removing the values from the string as soon as it encounters a duplicate character in the string.
5. Perform step#1 to 4 until it reaches the end of the string.

Implementation

Iterate through the characters in the string and add them to a HashSet until a repetitive character is found. Keep storing the current length in an integer variable. As it encounters a unique character, start removing from the HashSet while maintaining a remove counter variable.

Code

Campbell Ritchie
Marshal
Posts: 72595
317
• 1
• Number of slices to send:
Optional 'thank-you' note:

Punit Jain wrote:. . .
2. Keep adding the characters to a data structure that accepts unique values.

Don't say “data structure”; you are still at the design stage where you are better off saying, “Keep adding the characters to something....” Rather than, “unique values,” say, “only unique values,” or, ꀜdoesn't accept duplicates.”

3. Store the current length of the data structure in memory.

You have again let an implementation detail slip in. Say, “record th count of individual characters.”

4. Start removing the values from the string as soon as it encounters a duplicate character in the string.

That sounds like the bit you got wrong before. You need to correct that bit.

5. Perform step#1 to 4 until it reaches the end of the string.

“String” is an implementation word. Say, “text,” or similar instead.

. . . . add them to a HashSet until a repetitive character is found.

I used a Set too.

Keep storing the current length in an integer variable. . . . remove counter variable.

Why do you need these counters when you can use Set#size() instead? Please explain how your loop works; I don't understand the logic. I cannot however break your code. Why are you accepting null or empty Strings?

Saloon Keeper
Posts: 4377
163
• Number of slices to send:
Optional 'thank-you' note:
@Punit
well done, and have a cow!

An extra challenge for you: can you extend your method such that it also gives that longest substring, and its starting index?

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:

Piet Souris wrote:@Punit
well done, and have a cow!

thanks so much.

Piet Souris wrote:
An extra challenge for you: can you extend your method such that it also gives that longest substring, and its starting index?

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:

Campbell Ritchie wrote:Why do you need these counters when you can use Set#size() instead? Please explain how your loop works; I don't understand the logic. I cannot however break your code. Why are you accepting null or empty Strings?

Let's say the input is: "abcabcbb"

It will go into the if condition till the 2nd index. The character at the third index is already present in the set hence it will go into the else block. It will start removing the characters from the set until it encounters that repetitive character (the current value of index 'i'). It will then increment the removevaluecounter and break the while loop. At this point, the set will have only that repetitive character (the current value of index I). It will resume the for loop.

I think the empty string should be size 0, but I can skip the null part.

Piet Souris
Saloon Keeper
Posts: 4377
163
• Number of slices to send:
Optional 'thank-you' note:

Punit Jain wrote:

Hmm, not good, I'm afraid. For the inputstring "dcbaa" I get this output: {[a]=4}.

But since your previous code does the required job, I would call it a day. Thanks, it was entertaining!

Liutauras Vilda
Sheriff
Posts: 7959
560
• 1
• Number of slices to send:
Optional 'thank-you' note:
Cowgratulations, your post has been published in CodeRanch's August 2020 journal.
Staff note (Liutauras Vilda) :

@OP

Punit Jain
Ranch Hand
Posts: 1143
5
• Number of slices to send:
Optional 'thank-you' note:

Liutauras Vilda wrote:Cowgratulations, your post has been published in CodeRanch's August 2020 journal.

Thank you.

Marshal
Posts: 26519
81
• Number of slices to send:
Optional 'thank-you' note:
I just finished writing a program which goes through a GPS track, which is basically a list of locations and timestamps, recorded once per second. The purpose of the program was to look at subsequences of track points which cover 1 km or more, and find one with the shortest elapsed time. In other words, in this run how fast was the fastest kilometer that I ran?

So I solved that by having a window travel from start to finish through the track, making the window be at least 1 km wide but not more than one track point past that, and then observing the minimum time that the time-frame the window covers. And that technique works for this problem, without needing a Map to keep track of stuff. There's a lot of substringing going on instead but I find it easier to read. This may be because I'm the one who wrote it, but anyway here it is: