• Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
Marshals:
  • Tim Cooke
  • Campbell Ritchie
  • paul wheaton
  • Jeanne Boyarsky
  • Ron McLeod
Sheriffs:
  • Paul Clapham
  • Devaka Cooray
Saloon Keepers:
  • Tim Holloway
  • Carey Brown
  • Piet Souris
Bartenders:

methods, instances and arrays - swimming in the ocean of java

 
Ranch Hand
Posts: 46
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator


which is not good, even for starters.
How can i do the same with an arraylist?

Here is the way i think my class will work, please tell me if that's right.

-main method creates two lists/sets.
-each member of these points to one of the classes Orthography/Pronunciation

Marc: Yep, i remember you clarified that

Thanks for your patience Dave, i wish our lecturers were as helpful as people in here. I have some experience, not much, but some experience with programming. Imagine that there are other people in my course that know much less than me, and have much more difficulty with english as well. I wonder how they will do this.



*UPDATE*
I think i got it, is it something that looks like

myArray.add(new Orthography()); ? or maybe not?

*UPDATE 2*

i tried




This seems to work, and it returns the toString of the instance.



This also works, with the same output.



However, this doesnt work


Why is that? Am i missing something?
[ March 07, 2005: Message edited by: Krit Christoforou ]
 
Ranch Hand
Posts: 1646
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Krit Christoforou:
However, this doesnt work

This is something you'll get used to working with the Collections classes pre-JDK 1.5: while they'll hold any Object you give them, they return a reference typed as an Object -- not whatever class you have put in it.

This is another important distinction. Every object instance has a class type (getClass() will return it), and it cannot change over the life of the instance. However, every object reference has a declared type which may not necessarily be the same as the instance to which it points. The reference must, however, either be the class, one of its superclasses, or one of its implemented interfaces.

In order to treat the object references returned from List.get(index) as an Orthography (to call any Orthography methods), you have to cast it to that type. If this violates the rules of the preceding paragraph, you'll get a ClassCastException.The reason you don't have to cast in order to call the toString() method is that the reference returned is declared as Object and Object defines toString().
 
Krit Christoforou
Ranch Hand
Posts: 46
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Hello again

Thanks to Dave my assignment is almost finished.
I'm saying almost because i need to use a faster way of storing stuff and not an ArrayList.
so my last(i hope) question, is how do i store the dictionary in a treeset(for instance) and not lose my duplicate words?
 
David Harkness
Ranch Hand
Posts: 1646
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Krit Christoforou:
how do i store the dictionary in a treeset(for instance) and not lose my duplicate words?

Which collection makes sense very much depends on what you need from it. Lists give you indexed access and allow duplicates, but what does it mean to have two or more duplicate words in different locations within the list? How do you distinguish duplicate words? Do you even need to have duplicates? Do you need indexed access (lookup by insertion order) or do you lookup the word by some other key?

When you say you need to speed it up, which operations are too slow: insertion, lookup, iteration? How are you using the collection?

If you want to store duplicates in a Set, you need to make them different. One way would be to add an arbitrary int value -- perhaps the order the words were inserted into the Set -- to the word class and use it in the equals() and hashCode() methods. With a Set, though, you lose indexed access.

If you can describe specifically what parts are slow, we can help. The trick is that each collection type provides different performance characteristics for the different operations so it greatly depends on how you use it.
 
Krit Christoforou
Ranch Hand
Posts: 46
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Well basically our lecturer told us that a hashset and a treeset are much quicker than an arraylist, a linked list or a vector, i dont know anything more than that.and he also said that making a decision as to which one we will be using for the dictionary is important.
I am just fine with an Arraylist, i just think (from what the lecturer said) that a set would be faster(generally talking).
As a programmer, which one would you choose if you had the same situation?
 
David Harkness
Ranch Hand
Posts: 1646
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator

Originally posted by Krit Christoforou:
Well basically our lecturer told us that a hashset and a treeset are much quicker than an arraylist, a linked list or a vector, i dont know anything more than that.

Vector is the old ArrayList and works the same, so purge it from your brain. It really depends what you're doing.For example:
  • For appending, ArrayList and LinkedList will generally be faster than HashSet and TreeSet.
  • For lookup of objects by index, you must go with a List or suffer iterating through till you find the # you want. This makes no sense for HashSet but perhaps does for TreeSet.
  • If you want to find an object (using equals() method), HashSet is fastest, then TreeSet, then the Lists.
  • As you can see, your teacher is making a large generalization.

    As a programmer, which one would you choose if you had the same situation?

    From the problem description, I envision a whole lot of add() calls as the dictionary is read in, followed by a lot of lookup calls. For that I'd use a HashSet.

    However, you must allow duplicates, so I don't know what you're going to do other than make words arbitrarily different which makes absolutely no sense in this context. If I ask you to look up the word "BARK", will you return "BARK" #1 or "BARK" #2? How do you make the choice? As well, does the following list of minimal pairs make selse?
     
    Krit Christoforou
    Ranch Hand
    Posts: 46
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator

    Originally posted by David Harkness:
    However, you must allow duplicates, so I don't know what you're going to do other than make words arbitrarily different which makes absolutely no sense in this context. If I ask you to look up the word "BARK", will you return "BARK" #1 or "BARK" #2? How do you make the choice? As well, does the following list of minimal pairs make selse?



    Well It's supposed to return all matching results so it would return both of them. Oh yes, and it's got no word=word and pronunciation=pronunciation entries. Either the one is common with another entry, either the other. Because a word can have two pronunciations, or a pronunciation might be the same for two words. But the dictionary has no double entries, all entries have at least one difference, either in the word either in the pronunciation.

    By "appending" do you mean adding data? So you're saying that a list is faster when it is being filled up?

    I'm supposed to show the program to a demonstrator in a bit, but the actual deadline is on monday.I've still got they weekeng to change something that i dont like..

    Thanks
     
    David Harkness
    Ranch Hand
    Posts: 1646
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator

    Originally posted by Krit Christoforou:
    Oh yes, and it's got no word=word and pronunciation=pronunciation entries. Either the one is common with another entry, either the other. Because a word can have two pronunciations, or a pronunciation might be the same for two words. But the dictionary has no double entries, all entries have at least one difference, either in the word either in the pronunciation.

    Okay, that's what I originally thought, and it makes far more sense now. You can use a Set to store words and pronunciations as there are no duplicates within either group.

    What you have is a classic many-to-many relation between word and pronunciation. Each words has one or more pronunciation. Each pronunciation links back to one or more words. Do you have to provide any mechanism to perform lookup between the two? In other words, does your dictionary allow you to find the pronunciations for a given word? If so, how will you handle it?

    By "appending" do you mean adding data? So you're saying that a list is faster when it is being filled up?

    Yes, by append I mean "add at the end." In general, the two Lists will be faster at this than the Sets, but you'll eventually take a data structures class that will teach you how to evaluate this for specific cases yourself. It involves the concept "Big-O" notation which tells you roughly the scale of performance for a given operation.
     
    Krit Christoforou
    Ranch Hand
    Posts: 46
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator

    Originally posted by David Harkness:
    Do you have to provide any mechanism to perform lookup between the two?



    I already have a fully running program that can look up if a word contains/startswith/endswith a sequence of chars,same goes for a pronunciation, and print out the results.
    The only thing i want to change at this stage is to replace the two arraylists that store words/pronunciations, with sets.

    In other words, does your dictionary allow you to find the pronunciations for a given word? If so, how will you handle it?



    What do you mean does my dictionary allow me?

    Sorry for not explaining properly about double words and double pronunciations in my precious posts.
     
    David Harkness
    Ranch Hand
    Posts: 1646
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    Ignore my question about performing lookup from word to pronunciation as the requirements you posted don't mention it.

    Replacing the two ArrayLists with HashSets should be relatively straight-forward. You'll need to override equals(Object) and hashCode(Object) in the Orthography and Pronunciation classes. Have you tried doing it? If so, have you had problems?

    The only problem I'm anticipating is if you used List.get(index) which doesn't exist in Set. Instead, you'll want to use an Iterator to visit each word/pronunciation in the dictionary. If you want the words/pronunciations to be sorted, you can use TreeSet.
     
    Krit Christoforou
    Ranch Hand
    Posts: 46
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator

    Originally posted by David Harkness:
    The only problem I'm anticipating is if you used List.get(index) which doesn't exist in Set. Instead, you'll want to use an Iterator to visit each word/pronunciation in the dictionary.



    I did use List.get(index), and that's my problem now
    I dont think there is any other problem in the process of replacing the arraylists with treesets.
    And that's what i can't understand, how do i retrieve data from a TreeSet, since there are not index numbers. i guess the solution is an iterator. Does that mean that every time i need to retrieve a word/pronunciation, i have to iterate through the dictionary until i find it?
    Wouldnt that require more processing, therefore be slower than accessing the data of an Arraylist?

    And as for TreeSets or HashSets, i think a TreeSet makes more sense in a dictionary.

    Oh yes, you mentioned something about the scale of performance of any given operation. Would i be able to understand anything if i looked at the "concept "Big-O" notation" or not?

    Thanks
     
    David Harkness
    Ranch Hand
    Posts: 1646
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    My plans were cut short. My loss is your gain.

    Originally posted by Krit Christoforou:
    I did use List.get(index), and that's my problem now

    Can you show me the context in which you're using List.get(index)? Let me give you an example. Say your task is to print out every word in the dictionary, and you used something like the following.In this case, you're already iterating over the entire collection, so you may as well use an Iterator, and you'd use the exact same code to iterate a Set as you'd use for a List.Note the empty "increment step" is not an error. Iterator.next() returns the next element and advances the iterator. If you need to use the value more than once, save it into a local variable.

    If you need random access (element 3, 8, 4, 10, 33, 25, ...) then Set may not be your bag (no pun intended). So the real question is, do you need to access elements from the collection other than by iterating over them all?[qb]

    Does that mean that every time i need to retrieve a word/pronunciation, i have to iterate through the dictionary until i find it?

    From the problem description, I don't see any cases where you'd need to look up a word by index number (other than iteration). Are there cases where you're looking up the Orthography that represents a String? For example:If that's the case, then what you really want to use is a Map. A Map holds pairs of keys and values, with the keys forming a Set since each key is unique. Map.get(key) returns the value mapped to that key if the key is in the Map. A List is like a Map where the keys are the ints from zero to the size of the List.

    Again, this is only necessary if you need to find the Orthography or Pronunciation given the String that you originally read in to represent it.

    Oh yes, you mentioned something about the scale of performance of any given operation. Would i be able to understand anything if i looked at the "concept "Big-O" notation" or not?

    I'll give you a taste of it and let you Google for more details. Any decent book on algorithms or data structures should describe it pretty well. Briefly, it measures how long it takes to perform a particular operation as a function of the size of the input (typically represented with "n").

    Let's look at a linked list that maintains a reference to both the first (head) and last (tail) items it contains. No matter how many items the list contains, the add(item) operation always consists of
  • create new ListNode to store "item"
  • point tail.next to the new ListNode
  • point tail to the new ListNode

  • This is known as a constant-time operations since it doesn't depend on the size of the input. It is written as O(1). One thing to note about Big-O notation is that it is not an exact measure of the time. It simply expresses the scale of the operation time based on the input. Whether it takes 1 second, 10 seconds, or ten thousand years, if it doesn't depend on n, it's written O(1) -- "O one" or "constant time". This tells you that if you double n, the runtime isn't changed.

    Now let's consider the get(index) operation. If index is 1, you just return the item at head. But for any other index you must start from head and follow the references until you get to index. If you double index, you double the runtime. Thus, the runtime is linearly dependent on index. This is written as O(n) -- "O n". Again, O(2n) and O(232,471,387n) are both linearly dependent and thus written as O(n).

    You probably haven't written your own sort methods yet, but if you have the first one would probably have been bubble sort. This one requires comparing every single element to every other element. With n elements, that requires n*n (n^2) comparisons and is written O(n^2) -- "O n squared".

    To give you an idea of how this can affect the choice of collection, consider the contains(item) operation for ArrayList, HashSet, and TreeSet. The following apply for ideal constructions of HashSet (sparse) and TreeSet (balanced), but would still be close for non-ideal cases.As you can see, hash-based collections are great for lookup: HashSet.contains(item) and HashMap.get(key).

     
    David Harkness
    Ranch Hand
    Posts: 1646
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    A few corrections to the above. Editing it would cut off two thirds of the post.

    The first part of the second quote is actually my text. The quote starts with the [qb].

    The number of comparisons that bubble sort must perform is not actually n^2. The first item is compared to every other item (n-1). The second round compares to every other item except the first (n-2). This continues until the end which requires zero comparisons.This is equivalent to the sum (1 + 2 + ... + n) - n [add 1 to each term and subtract n (1 times n terms)]. The sum from 1 to n - 1 can be rewritten asWhile this isn't exactly equal to n^2, the n^2 term overshadows the n term, and thus it's considered O(n^2). This comes back to the stipulation that we don't care so much about the exact time but rather the affect of modifying n (the input). Doubling n quadruples the runtime.

    Finally, the various collections have certain requirements for the objects you store. Using List.contains() requires that you override the equals() method. Hash-based collections (HashSet and HashMap) require that you override both equals() and hashCode(). Tree-based collections (TreeSet and TreeMap) require that you override equals() and implement the Comparable interface -- compareTo(). For the maps, you only have to do this for the keys, not the values to which the keys are mapped.

    As a (big) hint on how to implement these in your Othography and Pronunciation classes, note that String implements all three methods very well.
    [ March 12, 2005: Message edited by: David Harkness ]
     
    Krit Christoforou
    Ranch Hand
    Posts: 46
    • Mark post as helpful
    • send pies
      Number of slices to send:
      Optional 'thank-you' note:
    • Quote
    • Report post to moderator
    Uh you're right, i was iterating through the whole dictionary anyway, so a treeset wouldn't do any extra work.
    I've been trying for some time to replace the arraylists with treesets but i cant do it. I get many errors, which i haven;t got the time to look through carefully and sort out, because tomorrow i must hand it in with documentation. So i'll have to get started with the documents and diagrams and stick with the arraylists.
    oh well i guess i wont get all the marks for correct data structures!

    thanks a lot david for all your help.
    I'll be back soon, because this project has two more stages

     
    With a little knowledge, a cast iron skillet is non-stick and lasts a lifetime.
    reply
      Bookmark Topic Watch Topic
    • New Topic