• Post Reply Bookmark Topic Watch Topic
  • New Topic

Remove duplicates from char array with O(n) complexity  RSS feed

 
Raj Kumar Bindal
Ranch Hand
Posts: 418
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,

I need to write a program to remove duplicates from a char array. The program should be with time complexity O(n).

Example, if word is independence ( as a character array), output should be = indepc

Please let me know how to do this.
 
Rob Spoor
Sheriff
Posts: 21131
87
Chrome Eclipse IDE Java Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Check out LinkedHashSet; use java.lang.Character as the generic type.
 
Raj Kumar Bindal
Ranch Hand
Posts: 418
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Does LinkedHashSet ensures that the duplicates are removed with O(n) complexity?
 
Rob Spoor
Sheriff
Posts: 21131
87
Chrome Eclipse IDE Java Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
LinkedHashSet has a (near) constant-time lookup, O(1). The looping and adding takes O(n). As a result, the total is O(n*1) == O(n).
 
Campbell Ritchie
Marshal
Posts: 56521
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Have you read its documentation?
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Raj Kumar Bindal wrote:I need to write a program to remove duplicates from a char array. The program should be with time complexity O(n).
Example, if word is independence ( as a character array), output should be = indepc
Please let me know how to do this.

Well a character is a numeric value, so you could create an array of booleans for all possible values; or indeed use a BitSet.

Winston
 
Rob Spoor
Sheriff
Posts: 21131
87
Chrome Eclipse IDE Java Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
The biggest problem with that is the number of possible values. While for bytes there are only 256 possible values, for chars that's 65536. With the current implementation of BitSet that requires just 1024 longs, using a boolean[] is probably going to be a lot more memory intensive.

But yeah, the BitSet solution would work. You need two loops (one for setting, one for retrieving) but since these loops will not be nested that's still O(n).
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Rob Spoor wrote:The biggest problem with that is the number of possible values. While for bytes there are only 256 possible values, for chars that's 65536. With the current implementation of BitSet that requires just 1024 longs, using a boolean[] is probably going to be a lot more memory intensive.

True, but if memory serves, a boolean is the same size as a byte; and what's 64K between friends these days?

But I agree, BitSet is probably the best for a combo of size and speed.

Winston
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:
Rob Spoor wrote:The biggest problem with that is the number of possible values. While for bytes there are only 256 possible values, for chars that's 65536. With the current implementation of BitSet that requires just 1024 longs, using a boolean[] is probably going to be a lot more memory intensive.

True, but if memory serves, a boolean is the same size as a byte; and what's 64K between friends these days?


Actually, I think it's the size of an int, so 256k. But then again, I thought that that was just for individual booleans, and that boolean arrays were bit-packed.

Or maybe I'm just confused. :-)
 
Rob Spoor
Sheriff
Posts: 21131
87
Chrome Eclipse IDE Java Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I just thought of a major flaw of using (just) a BitSet or boolean[] - you don't preserve the ordering of chars. That's why I initially suggested a LinkedHashSet instead of a HashSet or TreeSet.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Jeff Verdegan wrote:Actually, I think it's the size of an int, so 256k...

Actually, according to this blog (admittedly old), it's size isn't defined (and I couldn't see any reference to it in the JLS). Further down, someone seems to reckon it is the same size as a byte. This one suggests (as you said) that it actually uses different sizes for single values and arrays.

So, who knows? Anyway, 64/256/1024, it's all peanuts these days. God, I feel old.

Winston
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Rob Spoor wrote:I just thought of a major flaw of using (just) a BitSet or boolean[] - you don't preserve the ordering of chars.

Not really necessary surely? The ordering is implicit in the original array. My assumption is that the method would be something like:
public static final char[] removeDups(char[] chars) {...
and you're only usiing the array/BitSet to work out whether you've got a duplicate or not.

Winston
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:
Rob Spoor wrote:I just thought of a major flaw of using (just) a BitSet or boolean[] - you don't preserve the ordering of chars.

Not really necessary surely?


Depends on the OP's requirements, which I don't believe he made clear in this regard. For a real-world case I wouldn't expect ordering to matter, but for homework, I wouldn't be at all surprised if that constraint were present.
 
dennis deems
Ranch Hand
Posts: 808
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Jeff Verdegan wrote:
Winston Gutkowski wrote:
Rob Spoor wrote:I just thought of a major flaw of using (just) a BitSet or boolean[] - you don't preserve the ordering of chars.

Not really necessary surely?


Depends on the OP's requirements, which I don't believe he made clear in this regard. For a real-world case I wouldn't expect ordering to matter, but for homework, I wouldn't be at all surprised if that constraint were present.

Not only that; it strikes me as highly likely that working out the logic which filters out duplicates is the point of the exercise, rather than finding the Java API that does it for us.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Dennis Deems wrote:Not only that; it strikes me as highly likely that working out the logic which filters out duplicates is the point of the exercise, rather than finding the Java API that does it for us.

Probably; which is why all we've offered is suggestions that might aid the process. Indeed, a boolean[] doesn't involve an API at all.

Winston
 
dennis deems
Ranch Hand
Posts: 808
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:
Dennis Deems wrote:Not only that; it strikes me as highly likely that working out the logic which filters out duplicates is the point of the exercise, rather than finding the Java API that does it for us.

Probably; which is why all we've offered is suggestions that might aid the process. Indeed, a boolean[] doesn't involve an API at all.

Winston

I'm not sure whom you mean when you say "we", but I really thought I saw the idea of using a LinkedHashSet discussed here.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Dennis Deems wrote:I'm not sure whom you mean when you say "we", but I really thought I saw the idea of using a LinkedHashSet discussed here...

I mean all of us who've offered advice. In this case, I think that a LinkedHashSet might be a bit OTT, but I say again that that all we have done is offer aids to the solution. How Raj decides to use them (or not) is up to him.

Winston

 
It is sorta covered in the JavaRanch Style Guide.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!