This is something you'll get used to working with the Collections classes pre-JDK 1.5: while they'll hold any Object you give them, they return a reference typed as an Object -- not whatever class you have put in it.Originally posted by Krit Christoforou:
However, this doesnt work
Which collection makes sense very much depends on what you need from it. Lists give you indexed access and allow duplicates, but what does it mean to have two or more duplicate words in different locations within the list? How do you distinguish duplicate words? Do you even need to have duplicates? Do you need indexed access (lookup by insertion order) or do you lookup the word by some other key?Originally posted by Krit Christoforou:
how do i store the dictionary in a treeset(for instance) and not lose my duplicate words?
Vector is the old ArrayList and works the same, so purge it from your brain.Originally posted by Krit Christoforou:
Well basically our lecturer told us that a hashset and a treeset are much quicker than an arraylist, a linked list or a vector, i dont know anything more than that.
From the problem description, I envision a whole lot of add() calls as the dictionary is read in, followed by a lot of lookup calls. For that I'd use a HashSet.As a programmer, which one would you choose if you had the same situation?
Originally posted by David Harkness:
However, you must allow duplicates, so I don't know what you're going to do other than make words arbitrarily different which makes absolutely no sense in this context. If I ask you to look up the word "BARK", will you return "BARK" #1 or "BARK" #2? How do you make the choice? As well, does the following list of minimal pairs make selse?
Okay, that's what I originally thought, and it makes far more sense now. You can use a Set to store words and pronunciations as there are no duplicates within either group.Originally posted by Krit Christoforou:
Oh yes, and it's got no word=word and pronunciation=pronunciation entries. Either the one is common with another entry, either the other. Because a word can have two pronunciations, or a pronunciation might be the same for two words. But the dictionary has no double entries, all entries have at least one difference, either in the word either in the pronunciation.
Yes, by append I mean "add at the end." In general, the two Lists will be faster at this than the Sets, but you'll eventually take a data structures class that will teach you how to evaluate this for specific cases yourself. It involves the concept "Big-O" notation which tells you roughly the scale of performance for a given operation.By "appending" do you mean adding data? So you're saying that a list is faster when it is being filled up?
Originally posted by David Harkness:
Do you have to provide any mechanism to perform lookup between the two?
In other words, does your dictionary allow you to find the pronunciations for a given word? If so, how will you handle it?
Originally posted by David Harkness:
The only problem I'm anticipating is if you used List.get(index) which doesn't exist in Set. Instead, you'll want to use an Iterator to visit each word/pronunciation in the dictionary.
Can you show me the context in which you're using List.get(index)? Let me give you an example. Say your task is to print out every word in the dictionary, and you used something like the following.In this case, you're already iterating over the entire collection, so you may as well use an Iterator, and you'd use the exact same code to iterate a Set as you'd use for a List.Note the empty "increment step" is not an error. Iterator.next() returns the next element and advances the iterator. If you need to use the value more than once, save it into a local variable.Originally posted by Krit Christoforou:
I did use List.get(index), and that's my problem now![]()
If you need random access (element 3, 8, 4, 10, 33, 25, ...) then Set may not be your bag (no pun intended). So the real question is, do you need to access elements from the collection other than by iterating over them all?[qb]
From the problem description, I don't see any cases where you'd need to look up a word by index number (other than iteration). Are there cases where you're looking up the Orthography that represents a String? For example:If that's the case, then what you really want to use is a Map. A Map holds pairs of keys and values, with the keys forming a Set since each key is unique. Map.get(key) returns the value mapped to that key if the key is in the Map. A List is like a Map where the keys are the ints from zero to the size of the List.Does that mean that every time i need to retrieve a word/pronunciation, i have to iterate through the dictionary until i find it?
Again, this is only necessary if you need to find the Orthography or Pronunciation given the String that you originally read in to represent it.I'll give you a taste of it and let you Google for more details. Any decent book on algorithms or data structures should describe it pretty well. Briefly, it measures how long it takes to perform a particular operation as a function of the size of the input (typically represented with "n").Oh yes, you mentioned something about the scale of performance of any given operation. Would i be able to understand anything if i looked at the "concept "Big-O" notation" or not?
Let's look at a linked list that maintains a reference to both the first (head) and last (tail) items it contains. No matter how many items the list contains, the add(item) operation always consists ofcreate new ListNode to store "item" point tail.next to the new ListNode point tail to the new ListNode
This is known as a constant-time operations since it doesn't depend on the size of the input. It is written as O(1). One thing to note about Big-O notation is that it is not an exact measure of the time. It simply expresses the scale of the operation time based on the input. Whether it takes 1 second, 10 seconds, or ten thousand years, if it doesn't depend on n, it's written O(1) -- "O one" or "constant time". This tells you that if you double n, the runtime isn't changed.
Now let's consider the get(index) operation. If index is 1, you just return the item at head. But for any other index you must start from head and follow the references until you get to index. If you double index, you double the runtime. Thus, the runtime is linearly dependent on index. This is written as O(n) -- "O n". Again, O(2n) and O(232,471,387n) are both linearly dependent and thus written as O(n).
You probably haven't written your own sort methods yet, but if you have the first one would probably have been bubble sort. This one requires comparing every single element to every other element. With n elements, that requires n*n (n^2) comparisons and is written O(n^2) -- "O n squared".
To give you an idea of how this can affect the choice of collection, consider the contains(item) operation for ArrayList, HashSet, and TreeSet. The following apply for ideal constructions of HashSet (sparse) and TreeSet (balanced), but would still be close for non-ideal cases.As you can see, hash-based collections are great for lookup: HashSet.contains(item) and HashMap.get(key).
With a little knowledge, a cast iron skillet is non-stick and lasts a lifetime. |