Amortized analysis of extendable arrays.

 
Chan Ag
Rancher
Posts: 1090
Hi,

This is about 'An amortized analysis of extendable arrays' from the book 'Data Structures and Algorithms in Java' by Michael Goodrich and Roberto Tamassia. Here is the website - http://as.wiley.com/WileyCDA/WileyTitle/productCd-EHEP001656.html

First, I'm not sure if I'm violating copyright by asking my question here, but the book does not have any copyright information, and it is used as a textbook in many computer science university courses, so I believe it should be fine. Correct me if it is otherwise.

Ok, here is my question.

The current context is this - we are using a fixed-capacity array to implement an ArrayList (like the one we have in java.util.ArrayList). When the capacity is reached and an insert/add is done, a new array of double the size of the previous one is created and the elements are copied into it. We are analyzing the performance of a sequence of such add operations on this extendable array.

The authors state the following (copying it as is).

"As a shorthand notation, let us refer to the insertion of an element to be the last element in an array list as a push operation. Using an algorithmic design pattern called amortization, we can show that performing a sequence of such push operations on an array list implemented with an extendable array is actually quite efficient. To perform an amortized analysis, we use an accounting technique where we view the computer as a coin-operated appliance that requires the payment of one cyber-dollar for a constant amount of computing time. When an operation is executed, we should have enough cyber-dollars available in our current "bank account" to pay for that operation's running time. Thus, the total amount of cyber-dollars spent for any computation will be proportional to the total time spent on that computation. The beauty of using this analysis method is that we can overcharge some operations in order to save up cyber-dollars to pay for others."

I have understood the above part. After this, the authors prove the proposition as follows.

"Proposition 6.2: Let S be an array list implemented by means of an extendable array with initial length one. The total time to perform a series of n push operations in S, starting from S being empty is O(n).

Justification: Let us assume that one cyber-dollar is enough to pay for the execution of each push operation in S, excluding the time spent for growing the array. Also, let us assume that growing the array from size k to size 2k requires k cyber-dollars for the time spent copying the elements. We shall charge each push operation three cyber-dollars. Thus, we overcharge each push operation that does not cause an overflow by two cyber-dollars. Think of the two cyber-dollars profited in an insertion that does not grow the array as being "stored" at the element inserted. An overflow occurs when the array list S has 2i elements, for some integer i ≥ 0, and the size of the array used by the array list representing S is 2i. Thus, doubling the size of the array will require 2i cyber-dollars. Fortunately, these cyber-dollars can be found at the elements stored in cells 2i−1 through 2i − 1. (See Figure 6.4.) Note that the previous overflow occurred when the number of elements became larger than 2i−1 for the first time, and thus the cyber-dollars stored in cells 2i−1 through 2i − 1 were not previously spent. Therefore, we have a valid amortization scheme in which each operation is charged three cyber-dollars and all the computing time is paid for. That is, we can pay for the execution of n push operations using 3n cyber-dollars. In other words, the amortized running time of each push operation is O(1); hence, the total running time of n push operations is O(n)".

This is what I have not understood, specifically the parts about the charging. Could you please help me understand what the authors are trying to say - we charge each push operation one cyber-dollar at first, and then three cyber-dollars - and how are we saving two cyber-dollars? I know I should ask the authors, but the publisher's site does not have an 'ask a question' sort of thing. Besides, I believe many of you here wouldn't mind helping me with this one.

Thanks so much in advance for reading it through and for spending time on it (I know it's quite long - sorry about that). I've been trying to understand this for the last hour - still no luck.

Chan.
 
Steve Luke
Bartender
Posts: 4179
What they are saying is that the cost of inserting into the array is not constant. Adding to an array that has room left in it costs very little (1 dollar), while adding to an array that is full costs a lot (1 dollar times the number of elements in the array). When we say that a sequence of push operations has a linear cost of O(n), we are saying that the cost per addition is roughly equal throughout. That isn't quite true - so we essentially say that adding to an array that isn't full 'costs more' than it really does, so that the additions which force growth can be charged the same amount. And this works out, because the high cost of growing the array is averaged over the cheap additions that came before it - each growth costs more, but there are more cheap additions to pay for the extra cost.

This is not to say that that is what actually happens at run-time. It isn't. Costs are paid at the time they are required - no sooner, no later, and not spread out. This is just a way of describing an algorithm whose cost is sporadic but averages out to be linear.
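
To make the accounting concrete, here is a quick simulation - my own sketch, not code from the book - that charges each push three cyber-dollars, pays one cyber-dollar for the push itself and k cyber-dollars for each resize from size k, and checks that the bank balance never goes negative:

public class AmortizedPushDemo {
    public static void main(String[] args) {
        int capacity = 1;   // initial array length, as in Proposition 6.2
        int size = 0;       // number of elements currently stored
        long bank = 0;      // cyber-dollars saved up so far
        int n = 1_000_000;

        for (int push = 1; push <= n; push++) {
            bank += 3;                 // charge three cyber-dollars per push
            if (size == capacity) {    // overflow: grow from capacity to 2 * capacity
                bank -= capacity;      // pay one cyber-dollar per element copied
                capacity *= 2;
            }
            bank -= 1;                 // pay for the push itself
            size++;
            if (bank < 0) {
                throw new IllegalStateException("Went broke at push " + push);
            }
        }
        System.out.println("Charged 3n = " + (3L * n) + " cyber-dollars; left over: " + bank);
    }
}

The exception is never thrown: 3n cyber-dollars really do pay for all n pushes plus every copy, which is exactly what the proposition claims.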
 
Chan Ag
Rancher
Posts: 1090
Thanks so much, Steve, for helping me.

I think it's all making sense now. We are not resizing as often as we are adding, so we get better performance with most adds (let us say, compared to a LinkedList implementation, for add-at-the-end operations). Even if we consider the average add time over one resize and a few adds, the additional cost incurred by copying the array during a resize is evened out by the faster adds. And as the array grows, we have fewer resize operations, so it might be more efficient than a LinkedList implementation for adding at the end of the list.

Have a good day.
Chan.
 
Steve Luke
Bartender
Posts: 4179
Adding to an ArrayList isn't faster than adding to a LinkedList - it is probably slower. Adding to an ArrayList is O(n) in the worst case, while adding to a LinkedList, for any decent implementation, will be O(1), because there will be a reference to the end of the list and the LinkedList has no growth cost. What ArrayLists are faster at is accessing a random location in the list: O(1), while for a LinkedList that would be O(n) (and the same for inserting into the middle of the list rather than adding at the end).

So if all you do is add to the end and iterate in order, a LinkedList is better. But if you need random access, then the ArrayList is probably better (assuming accesses happen more often than adds).
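
To see the random-access difference in practice, here is a toy comparison - my own sketch, not a rigorous benchmark - that sums each list by index. The loop is O(n) overall for the ArrayList but O(n^2) for the LinkedList, because each get(i) on a LinkedList has to walk node by node:

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class RandomAccessDemo {
    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>();
        List<Integer> linkedList = new LinkedList<>();
        for (int i = 0; i < 50_000; i++) {
            arrayList.add(i);
            linkedList.add(i);
        }
        System.out.println("ArrayList:  " + indexedSumMillis(arrayList) + " ms");
        System.out.println("LinkedList: " + indexedSumMillis(linkedList) + " ms");
    }

    // get(i) is O(1) on an ArrayList but O(n) on a LinkedList,
    // so this loop is O(n) for the former and O(n^2) for the latter.
    private static long indexedSumMillis(List<Integer> list) {
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        if (sum < 0) throw new AssertionError(); // keep 'sum' live
        return (System.nanoTime() - start) / 1_000_000;
    }
}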
 
Marshal
Posts: 79179

Steve Luke wrote:Adding to an ArrayList isn't faster than adding to a LinkedList - it is probably slower. . . .

That is not what the docs for ArrayList say.

An add to an array list runs in amortised constant time, and the docs say the constant factor is low compared with that for a linked list. As long as you are within the capacity of the List, adding to an array list means assigning the item to that position in the array. That requires no memory allocation, and if that position is later set to null or another item, there is no memory management to do (unless the item has now become unreachable).
Adding to a linked list entails creating a Node instance, and removing that item (albeit much faster than removing from an array list) will probably make the Node unreachable, so there might be memory management to be done later.
 
Bartender
Posts: 10780

Steve Luke wrote:What ArrayLists are faster at is accessing a random location in the list: O(1), while for a LinkedList that would be O(n) (and the same for inserting into the middle of the list rather than adding at the end).


I think we also need to make sure we're comparing apples with apples. List.add(element) always adds to the end so, in fact, it's O(1) in both cases (amortized in the case of ArrayList).

What is different is insertion (i.e., List.add(index, element)), which is actually O(n)-ish for both cases: in the case of LinkedList because it has to find the position; in the case of ArrayList because it has to shift subsequent elements to the right (and possibly expand the array).

Where LinkedList does sometimes have an advantage is when insertion is done via a ListIterator. If the iterator is already positioned at the right place, then the insertion itself takes O(1) time, whereas an ArrayList still has to shift the remaining elements.
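
For example (a small sketch of my own, with made-up data): once a ListIterator has been walked to the right spot, ListIterator.add() splices a new element into a LinkedList in O(1), with no shifting:

import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

public class IteratorInsertDemo {
    public static void main(String[] args) {
        List<String> list = new LinkedList<>(Arrays.asList("a", "b", "d", "e"));

        // Walk to the insertion point once; the add() itself just
        // relinks the neighbouring nodes.
        ListIterator<String> it = list.listIterator();
        while (it.hasNext()) {
            if (it.next().equals("d")) {
                it.previous();   // step back so the new element lands before "d"
                it.add("c");     // O(1) for LinkedList; no elements are shifted
                break;
            }
        }
        System.out.println(list); // [a, b, c, d, e]
    }
}

The same insertion done as list.add(index, "c") on an ArrayList would shift every element after that index one slot to the right.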

HIH

Winston
 
Chan Ag
Rancher
Posts: 1090
Thank you, all, for your inputs. They've been a great help.

Actually I should have specified some more details.

The authors' implementation of the ArrayList maintains two variables: int capacity (the length of the internal array) and int n (the actual number of elements in the array). Hence, adding at the end of this ArrayList has an amortized time complexity of O(1), though I understand why you've said it could be O(n) and why a LinkedList's add at the end could be faster. It would be O(n) if the implementation did not maintain the int n variable, or if the add was at a specific index.

It's quite a subtle thing to remember and thanks for mentioning it.
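
For reference, here is a minimal sketch of that kind of implementation - my own approximation of the idea, not the book's actual code:

public class SimpleArrayList<E> {
    private Object[] data = new Object[1]; // internal array; capacity == data.length
    private int n = 0;                     // the actual number of elements stored

    // Amortized O(1): usually a single assignment, occasionally a full copy.
    public void add(E element) {
        if (n == data.length) {            // overflow: double the capacity
            Object[] bigger = new Object[2 * data.length];
            System.arraycopy(data, 0, bigger, 0, n);
            data = bigger;
        }
        data[n++] = element;
    }

    @SuppressWarnings("unchecked")
    public E get(int index) {
        if (index < 0 || index >= n) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        return (E) data[index];
    }

    public int size() {
        return n;
    }
}

Because n is tracked directly, the add goes straight to data[n] - there is no searching for the end of the list.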

Also, today I went through the authors' LinkedList implementation (it uses a doubly linked node - a DNode class - and makes DNode implement a Position interface that aids in locating the elements relative to each other). So I understand now why even the constant time factor is higher in the case of a LinkedList. I have yet to go through the java.util.ArrayList and java.util.LinkedList code, so perhaps then I'll have more things to understand, though this discussion has given me the background.

Thank you, all.
Chan.
 