• Post Reply Bookmark Topic Watch Topic
  • New Topic

Need a data structure that has remove(int index) method that runs in constant time.  RSS feed

 
Ranch Hand
Posts: 76
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I have tried a LinkedList, ArrayList and also Vector. A Set will not work because I need to store duplicates. None of these have a method that can remove an element from the middle of a list in constant time. I also need this data structure to be able to change capacity. I also thought about using a Stack, Queue, or Deque but you can only remove/add from the front or back of these. Any ideas anyone?
 
Bartender
Posts: 1952
7
Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
ArrayList would be the closest thing to what you're looking for though. It can remove an element directly based on an index position, but the cost of shifting elements i.e. copying the backing array is incurred. That seems to be what you want though as you need the collection to change capacity dynamically.
 
lowercase baba
Bartender
Posts: 12565
49
Chrome Java Linux
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
is that constant time, or would it be linear? I assume it would take twice as long to copy twice as much data, but I can't say that for sure.
 
Jelle Klap
Bartender
Posts: 1952
7
Eclipse IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Well, the size of the backing array is always initially twice the size or some such as the number of element in the ArrayList, so until the ArrayList resizes the backing array as more element are added, copying the backing array would be sort of constant, I guess. It would be a matter of initializing the ArrayList to an appropriate large enough capacity to avoid resizing of the backing array.

Edit: err, never mind that's not really how the thing works.
 
Bartender
Posts: 4181
22
IntelliJ IDE Java Python
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
According to the API for ArrayList, remove() would be linear (probably because of the shifting the elements after the removed element down one index).
The API wrote:The size, isEmpty, get, set, iterator, and listIterator operations run in constant time. The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. All of the other operations run in linear time (roughly speaking).

The remove() method is one of the 'all of the other operations...'

LinkedList would be linear because navigating to the index to be removed would be linear, while actually removing the element would be constant.
 
Sheriff
Posts: 22846
43
Eclipse IDE Firefox Browser MySQL Database
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You considered a Set, so I conclude that ordering isn't a requirement for this collection. (Is that correct?)

And that means that before you can remove an element from the collection, you first have to find the element. And in an ordinary list, finding an element is an O(N) operation where N is the size of the list. In other words, lists aren't what you are looking for.

It's only in keyed collections (Set, Map) that finding an element is an O(1) operation, so that's where you should be looking. The technical term for a Set which allows duplicate elements is "multiset". The standard API doesn't have a multiset implementation that I'm aware of, but I'm pretty sure you should be able to find other people's implementations with a quick web search.
 
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
William Koch wrote:I have tried a LinkedList, ArrayList and also Vector. A Set will not work because I need to store duplicates. None of these have a method that can remove an element from the middle of a list in constant time. I also need this data structure to be able to change capacity. I also thought about using a Stack, Queue, or Deque but you can only remove/add from the front or back of these. Any ideas anyone?

Yes. A skiplist. It doesn't give you O(1) performance (and actually, I don't think ANY structure that allows duplicates will), but it does give you O(log n) performance for all major updates, and in general is rather less "weighty" than a TreeMap. Unfortunately, like other suggestions, there is no List or MultiMap implementation in the Java collection API (why not, after all this time, one might ask?), but I suspect there are perfectly reasonable implementations out there; otherwise, rolling your own is pretty easy.

Winston
 
William Koch
Ranch Hand
Posts: 76
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
HEY EVERYONE!

So essentially, I need to remove a random item from an ArrayList but it needs to be in constant time. Normally it is not constant because you need to shift all the elements in the list over to fill the gap after you remove from somewhere in the middle. BUT removing from the end of the list is constant time because no shift is required.

I found the solution. So essentially if I am going to have this work in constant time I can use an ArrayList BUT the secret is to replace the the item I am wanting to remove with the item at the end of the list. So I am calling remove() on the last index of the list every time but taking that value and replacing the value at the random index. I will just include an if statement to take into account the possibility the random index is the last one. Thanks for all your opinions and options.
 
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
That seems a simple and workable solution, as long as you don't care about the order. I've been assuming you do, but apparently not?
 
William Koch
Ranch Hand
Posts: 76
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
I apologize everyone. Jeff is correct, I do not care about order. I have created an interface called Pool<E> (i.e. a pool of people, NOT a swimming pool or billiards) so when I choose someone/thing from the pool it needs to be random therefore the order does not matter. I apologize to anyone who thought that it did. I should have been more specific. I will not make that mistake in the future.
 
Jeff Verdegan
Bartender
Posts: 6109
6
Android IntelliJ IDE Java
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Bad on you for not making your requirements more clear, and bad on us (or on me at least) for filling in the blanks with assumptions.



Glad you found a simple and elegant solution though!
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
William Koch wrote:I apologize everyone. Jeff is correct, I do not care about order. I have created an interface called Pool<E> (i.e. a pool of people, NOT a swimming pool or billiards) so when I choose someone/thing from the pool it needs to be random therefore the order does not matter.

I believe that in some languages, the structure you're describing is called a Bag, because it mimics a bag of contents that you can't see.

Winston
 
Marshal
Posts: 56600
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Are you sure you can’t use a Set? I believe HashSet#remove() runs in O(1) time.
 
Campbell Ritchie
Marshal
Posts: 56600
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
If you are using a List, consider a combination of get(i) and set(i). Don’t allow null as a “real” value. If you set the removed element to null you will not get confused about the last element appearing several times. You will however have problems repeating the removal if get(i) returns null, so your performance will gradually degrade. When you have ≥ 50% of your List containing nulls, then you will on average have to repeat the random number generation on the majority of occasions, maybe several repeats. Random number generation is fast, but slower than the assigning of values in those ArrayList methods. It is worth resizing the List by copying all its non-null elements into a smaller List when it contains ≥ 33⅓% nulls. At ⅓ nulls, your multiple repeats will average out to double the number of random calls for 0% nulls.
You should find out about System#arraycopy(), which ArrayList uses internally to copy the array, and its performance, if possible. This may however be implementation‑dependent. You may be able to test it with the System method which looks for nanoseconds.
I doubt whether you will find a way of removing those elements which doesn’t run in linear time. Read what it says in the ArrayList documentation about it speed compared with LinkedList.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Campbell Ritchie wrote:If you are using a List, consider a combination of get(i) and set(i). Don’t allow null as a “real” value. If you set the removed element to null you will not get confused about the last element appearing several times.

Actually, if you do that, you'll also have to document it because you'll be changing the way a List is meant to work. What you'll have is something more akin to a dynamic array. It also makes iteration a bit more complex (and slower), unless you're happy to return "non-existent" entries.

@William: If you're not worried about maintaining order after a remove(), I reckon your solution is close to optimal. Just remember to document the difference in behaviour, particularly if your structure is going to implement the List interface. And try not to give away any "trade secrets" when you do so; all you need to say is something like 'element removal affects data order', you don't need to explain what you're doing.

It should also be added that unless you want to do something similar for add(n), that op will still require a copy to make room for the new element.

Winston
 
Greenhorn
Posts: 2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Why not use a simple HashSet ? The elements are hashed so you can find the element to be removed in constant time. Removal is constant. No shifting needed.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Mk Chitti wrote:Why not use a simple HashSet ? The elements are hashed so you can find the element to be removed in constant time. Removal is constant. No shifting needed.

Because, as William already said, he needs to store duplicates.

Winston
 
Mk Chitti
Greenhorn
Posts: 2
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Ok. Then why not use a hash table. Key is the object that needs to be stored. Value is the number of times it occurred. Same argument that I gave earlier applies here also. Removal is constant time.
 
Campbell Ritchie
Marshal
Posts: 56600
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote: . . .
Actually, if you do that, you'll also have to document it because you'll be changing the way a List is meant to work. What you'll have is something more akin to a dynamic array. . . .
I didn’t realise we were adding to the List too.
Isn’t moving the last element also working like a dynamic array?

It all goes to show what sort of convoluted code you need if you insist on such fast performance.
 
Campbell Ritchie
Marshal
Posts: 56600
172
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote: . . . What you'll have is something more akin to a dynamic array. . . . It should also be added that unless you want to do something similar for add(n), that op will still require a copy to make room for the new element.

Winston
I didn’t realise we needed to add as well. Agree it is more like a dynamic array, but isn’t copying the last element like a dynamic array, too.

It all goes to show what convoluted things you have to do to maintain such performance. And I shall shut up now.
 
William Koch
Ranch Hand
Posts: 76
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
My interface only has 3 methods. add(), remove(), isEmpty(). All three of these methods need to run in constant time. I need to implement these in a class I created called RandomPool. (This pool is like a pool of candidates not a swimming pool or billiards) You randomly remove things from the pool but need to be able to hold duplicates and order does not matter. I can implement these methods in the RandomPool class using an ArrayList but the remove method for ArrayList on the Java API states that it takes linear time due to the shifts that need to take place if you remove anywhere in the middle or beginning. If you remove from the end of the list, no shifts are needed. So just replace the value you want to remove with the value on the end and delete the end. This takes constant time and can be done with the set() method. I have accomplished my tasks and it works and it is efficient. If anyone knows of any other ways to do this using stuff that already exist in Java API that takes constant time then please let me know so in the future I can use it.
 
author
Sheriff
Posts: 23295
125
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
William Koch wrote:My interface only has 3 methods. add(), remove(), isEmpty(). All three of these methods need to run in constant time. I need to implement these in a class I created called RandomPool. (This pool is like a pool of candidates not a swimming pool or billiards) You randomly remove things from the pool but need to be able to hold duplicates and order does not matter. I can implement these methods in the RandomPool class using an ArrayList but the remove method for ArrayList on the Java API states that it takes linear time due to the shifts that need to take place if you remove anywhere in the middle or beginning. If you remove from the end of the list, no shifts are needed. So just replace the value you want to remove with the value on the end and delete the end. This takes constant time and can be done with the set() method. I have accomplished my tasks and it works and it is efficient. If anyone knows of any other ways to do this using stuff that already exist in Java API that takes constant time then please let me know so in the future I can use it.



Just thought I should point this out. To remove the item, you need to actually find the item. Yes, copying the item from the end to the middle is constant time. Yes, truncating an item from the end can be constant time (not sure, how array list implemented this in this regard).... but how, exactly, are you locating this item in the middle to overwrite? Using an iterator is linear time. Using the contains() and indexof() methods of array list is linear time. I don't believe that it is possible to find the location of an item, in the middle of an unsorted list, in constant time.

Henry
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
William Koch wrote:My interface only has 3 methods. add(), remove(), isEmpty(). All three of these methods need to run in constant time. You randomly remove things from the pool but need to be able to hold duplicates and order does not matter. If anyone knows of any other ways to do this using stuff that already exist in Java API that takes constant time then please let me know so in the future I can use it.

Well, as I say, what you describe is generally called a Bag, and it can be easily implemented by wrapping an ArrayList (my implementation actually extends it, but it has more methods than you probably want).

And, since order isn't important, you only need to randomize at removal time. Elements can simply be added to the end of the List (the default for List.add()).

Winston
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Henry Wong wrote:To remove the item, you need to actually find the item.

Not required (I don't think). Everything William has described says that a call to remove() removes a random element, so it's equivalent to
myList.remove( Random.nextInt(myList.size()) ).

@William: Do shout if I've got this wrong.

Winston
 
William Koch
Ranch Hand
Posts: 76
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
@Winston,

You are correct. The remove(int index) is choosing a random index to remove the element from, not searching for a specific item. I am not choosing a random item to remove but rather choosing a random index and removing whatever object happens to be in that index. But really I am not removing it but overwriting it with the item in the last index and then removing the last index all together.
 
William Koch
Ranch Hand
Posts: 76
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
After researching though I think a hash table could accomplish this same task but I have not used one before as I am fairly new to Java. I am a computer science major in college and up until this year I have been using Ada95 and Ada2005. I just switched to Java for the first time and just began taking a Data Structures class back in august. I have picked up the basics of Java pretty easily but am not use to having all these classes and interfaces available to work with.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
William Koch wrote:You are correct. The remove(int index) is choosing a random index to remove the element from, not searching for a specific item. I am not choosing a random item to remove but rather choosing a random index and removing whatever object happens to be in that index. But really I am not removing it but overwriting it with the item in the last index and then removing the last index all together.

Understood. Just remember that you should probably make remove(int) a private helper method for your remove().

I'm presuming, after going to all that trouble to make things random, that you don't actually want clients deciding which element they're going to remove.

Winston
 
Henry Wong
author
Sheriff
Posts: 23295
125
C++ Chrome Eclipse IDE Firefox Browser Java jQuery Linux VI Editor Windows
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
William Koch wrote:@Winston,

You are correct. The remove(int index) is choosing a random index to remove the element from, not searching for a specific item. I am not choosing a random item to remove but rather choosing a random index and removing whatever object happens to be in that index. But really I am not removing it but overwriting it with the item in the last index and then removing the last index all together.



These requirement doesn't make sense to me... Only add(), remove(), and isempty(). With remove(), you can't controll what to remove. Furthermore, there doesn't seem to be a way to actually fetch the data. With these requirements, you can get away with not actually storing the data -- the add() increments a count variable, the remove() decrements the count, and isempy() reports whether the count is zero.

Henry
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Henry Wong wrote:These requirement doesn't make sense to me... Only add(), remove(), and isempty(). With remove(), you can't control what to remove....

That's the whole point. A Bag represents a bag of objects, kind of like what you see when they draw teams represented by coloured balls for playoffs, or for a knockout tournament - or indeed the lottery. It can also be used to represent a deck of cards for a game, or any other set of objects that needs to be drawn at random.

Furthermore, there doesn't seem to be a way to actually fetch the data. With these requirements, you can get away with not actually storing the data...

Not quite sure what you mean here. Sure you could code it separately each time you need this sort of logic, but why not create a collection that supports the requirement? There's nothing to stop a Bag having an iterator which returns elements in random sequence (although in my implementation it simply returns the elements in whatever order they happen to be, and I have a shake() method that specifically randomizes the contents).

Does that help?

Winston
 
Steve Luke
Bartender
Posts: 4181
22
IntelliJ IDE Java Python
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Winston Gutkowski wrote:
Henry Wong wrote:...Furthermore, there doesn't seem to be a way to actually fetch the data.

... There's nothing to stop a Bag having an iterator which returns elements in random sequence.

Also, I assume the remove() method actually returns the object being removed from the bag (rather than just decrementing a value and returning void.
 
Winston Gutkowski
Bartender
Posts: 10575
66
Eclipse IDE Hibernate Ubuntu
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Steve Luke wrote:Also, I assume the remove() method actually returns the object being removed from the bag (rather than just decrementing a value and returning void.

Well, mine does; and it kind of goes along with the way other remove() methods work. Probably more important in its case too, since you don't know where the removed value came from.

Winston
 
Don't get me started about those stupid light bulbs.
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
Boost this thread!