Win a copy of Zero to AI - A non-technical, hype-free guide to prospering in the AI era this week in the Artificial Intelligence and Machine Learning forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Campbell Ritchie
  • Liutauras Vilda
  • Paul Clapham
  • Bear Bibeault
  • Jeanne Boyarsky
Sheriffs:
  • Ron McLeod
  • Tim Cooke
  • Devaka Cooray
Saloon Keepers:
  • Tim Moores
  • Tim Holloway
  • Jj Roberts
  • Stephan van Hulst
  • Carey Brown
Bartenders:
  • salvin francis
  • Scott Selikoff
  • fred rosenberger

Better code for interview question list duplicate characters in a String

 
Rancher
Posts: 212
16
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Here is the solution as I would write it for a coding interview:


Quick, easy, and clean (at least in my opinion)

Time complexity is O(n).
Justification: We loop through each letter in the String exactly once (and only perform O(1) operations on it, which is the HashSet's add). Therefore the algorithm is T(n) where n = length of String, in the best and worst case, therefore it is O(n).

Space complexity is O(n).
Justification is easiest through examples.

Take the worst case. String "aabbcc". This would result in duplicates holding "A", "B", "C", and allCharactersSet holding "A", "B", "C", therefore worst case is T(n * 2)
Take the best case. String "aaaaa". This would result in duplicates holding "A", and allCharactersSet holding "A", therefore best case is T(1)
Average case would be T(n), you can work out example "abc" to see why.

Because the worst case is n times a constant, this is linear, and therefore the space complexity is O(n).
 
Marshal
Posts: 70598
287
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Nice
I might change “letter” to “character” or “keystroke” throughout in that documentation comment, and I would also say that triplicates, quadruplicates, etc., are counted together with duplicates and not otherwise distinguished from them.
OP: Do you think ZG was correct to make that method static? Please justify your opinion.
OP: There are at least three differences between my suggestion and ZG's. I used a silly name for a Set. I didn't use toUpperCase(), so MS and ms wouldn't appear as duplicates. There is one more difference between our solutions. Did you notice what it is, and what difference that will make to any output?
 
Bartender
Posts: 2697
130
Google Web Toolkit Eclipse IDE Java
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I wanted to share.. this would have been my submission:
 
Master Rancher
Posts: 3684
43
  • Likes 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Nice use of distinct() there to simplify, Salvin.  Internally it ends up doing basically the same thing as a two-set solution, but more clearly and concisely.

My own 2-set solution was similar but without the distinct():

                   
There are many different ways we can organize this, trading off readability, reusability, concision, flexibility to requirement change, and other factors.  And a lot of "readability" depends on what you are used to.  I like the potenial reusability and flexibility of the map of character counts:
       
By the way I am a bit disappointed to see everyone else ignoring safe code point handling.    
 
You guys wanna see my fabulous new place? Or do you wanna look at this tiny ad?
Thread Boost feature
https://coderanch.com/t/674455/Thread-Boost-feature
reply
    Bookmark Topic Watch Topic
  • New Topic