I am trying to implement something like this

if (x,y) in datalist

add v to list pointed to by (x,y)

compute average of v's on the list

else

add a new (x,y) item and point it to v

One trick is x,y and v are doubles.

This is what I came up with-

I'm a bit concerned about having to loop through the list every time I do anything. I don't like there is no good return value for when the point does not have an average. I can add hasValue as a method, but then I'm looping through the list even more!

BTW the real application has another layer

(x,y)->z

z->v

for each (x,y) there may be several z's each z has a list of v's all with the same (x,y,z) values. I need to compute a bunch of things based on z and avg of v(z) for each fixed (x,y).

What do they represent and what do you need them for?

*The mind is a strange and wonderful thing. I'm not sure that it will ever be able to figure itself out, everything else, maybe. From the atom to the universe, everything, except itself.*

it may be that v is measured multiple times under the same conditions, in which case, one would use the average value of v.

z and v at a constant (x,y) can be used to derive the value w.

to obtain the value of w at values of (x,y) that were not measured, a regression equation can be computed w = f(x,y)

That equation is then used to predict w for an arbitrary (x,y).

So I broke that down into

a data structure containing

a dictionary keyed on (x,y)

regression coefficients

and methods to regress on (w,x,y)

and methods to predict w for a given (x,y)

(x,y) dictionary containing

a dictionary keyed on z

w

methods to compute and return w

(z) dictionary containing

a list of v for a given z

avg v

methods to compute and return the avg v

so I'd need to say, for example, add the element(x,y,z,v). If the (x,y) was seen before, then add(z,v) to the existing element, otherwise create a new one. Adding (z,v) means that if z was seen before (in a specific (x,y) context) then add v to the existing element, otherwise create a new one.

If I wanted a prediction for (x,y), if the regression parameters existed, I'd use them, otherwise, I would query each (x,y) for w, which it might need to compute. If it did, it would need to get the avg z's, which would be computed if not already available. The values are computed each time anything is added to the object, so everything should always be up to date.

Also, this is all needing to either be retrieved from or displayed in a table in a GUI.

Jon Swanson wrote:So I broke that down into

a data structure containing

a dictionary keyed on (x,y)

regression coefficients

and methods to regress on (w,x,y)...

Whoa there. My advice: slow down and take things one at a time.

Let's look at the first: "a dictionary keyed on (x,y)":

Anything with a "key" in it suggests a java.util.Map.

The questions you then need to ask yourself are:

1. (Which you've already answered yourself) Do you need fast access to a specific x,y value?

2. Do the Map keys need to be

*ordered*? If so, you'll need a SortedMap.

3. Do you need to get ranges of values for x,y? If so, you'll probably need a NavigableMap.

In general, the fastest type of Map for direct access to a key is a HashMap, but it is

*unordered*, so if the answer to either 2 or 3 is true, the normal choice is a TreeMap, although there is also ConcurrentSkipListMap.

The only other wrinkle you need to consider is this: Do you need

*more*than one value to be stored for any given key? If so, then each key needs to be associated with a java.util.Collection of some sort. If duplicate values are allowed for any given key, this should probably be a java.util.List; if not, it should be a java.util.Set.

It might also be worth mentioning that Java already has a class that does some of what you want, called Point2D.Double, so you could either extend it for your DataPoint class or (possibly better) have your class

*include*one. The nice thing about this is that the

`equals()`and

`hashCode()`methods (which you'll need if you want to put your DataPoint in a HashMap) are already defined, so you don't have to write them again.

HIH

Winston

"Leadership is nature's way of removing morons from the productive flow" - Dogbert

Articles by Winston can be found here

`x`and

`y`as its key and the

`v`, or the whole observation, as its value. That is probably easiest if you have a Condition class and have that as a field of the Observation class, rather than separate

`x`and

`y`. Obviously you can use a List of Observations as the value instead, in which case you add to the List in the Map.

Don’t use Dictionary; it is “legacy” code.

Your Observation or Condition class, whichever you use as a key, must have correctly-overridden equals(Object) and hashCode() methods.

If your Condition class implements the Comparable<Condition> interface or if you can create a Comparator<Condition>, you can use a SortedMap, which might make retrieval easier.

I hadn't come across Observation objects, that gives me something to pursue. I can define equals and comparator, less clear how to define hashCode().

I need to know if (x,y) is in the list I've created. I do need multiple items per object, I'm pretty comfortable with having a list of these.

I looked into Point2D.Double. but I will also have the case of (x,y,z) keys and Point3D.Double seemed to be missing a bunch of stuff Point2D.double implements and it seemed easier at that point to make my own class, since none of the geometry methods apply to my case.

Jon Swanson wrote:My concern is that unlike integers or strings, operations on doubles can result in round-off that makes them unequal. So I am used to comparing doubles if (|a - b| < threshold) abEqual = true; and I have trouble fitting that into my understanding of maps...

It's not really anything to do with Maps specifically; it has to do with whether two

`DataPoint`s are "equal" or not. In your case, I suspect that you need an

`equals()`method that returns true if two of your objects are "close enough", although that may be a bit tricky to dovetail with a

`hashCode()`method.

I think, if it were me, I would construct each of my '

`DataPoint`' objects (and you're right,

`Point2.Double`probably isn't appropriate in this case)

*with*a threshold, and have

`equals()`return true if, and only if,

*all*of the following are true:

1. Both objects have the same number of dimensions.

2. Both objects have the same threshold.

3. The difference between the values of each dimension is less than the threshold.

That means that:

(a) An x,y point

__cannot__be "equal to" an x,y,z point (although you could perhaps add methods to convert between the two).

(b)

`DataPoint`s with different thresholds can never be "equal" (if they could be, the function would not be symmetric).

But it

*does*allow you to create a

`hashCode()`for the object, by including the threshold value in the hash. You'd probably want to include the number of dimensions too.

HIH

Winston

"Leadership is nature's way of removing morons from the productive flow" - Dogbert

Articles by Winston can be found here

I need to display a table with 7 columns. As per suggestions, I won't worry about the table now. I think what matters is I need to come up with something that makes this process efficient-

Seven empty columns- a b c d e f g

User enters a b c d and optionally e

If e is entered, use it, otherwise calculate it by grouping the data in groups such that a, b, and c are all the same within a group and only d varies

e = f(d) and h = f'(d) within a group.

Now make new groups where a and b are the same within the group, but c varies. Note, the original values of d no longer matter, I am using e and f for which there is one value per (a,b,c)

f = f''(c,e,h) g = f'''(c,e,h)

Then calculate the final result. Note, the original values of c not longer matter, I am using f and g, for which there is one value per (a,b).

i = f''''(a,b,f,g)

The result of i is used to make a graph.

I tried to create a class heirarchy, so

class M

list of class A

class A

a

b

list of class B objects

f

g

class B

c

list of class C objects

e

h

class C

d

And at that led to this sort of logic

M.add(a,b,c,d[,e])

in M

if M has A with (a,b)

A.add(c,d[,e])

else

new A

compute i

in A

if A has B with c

B.add(d[,e])

else

new B

compute f and g

in B

C.add(d)

if not e compute e

compute h

Are we still talking about this type of structure, but making a change like-

class M

map with keys (a,b) of class A objects

class A

map with key (c) of class B objects

f

g

class B

List of class C objects

e

h

class C

d

then say

M.add(a,b,c,d[,e])

in M

if M.map(a,b)

map(a,b).add(c,d[,e])

else

map(a,b) = new A

compute i

in A

if A.map(c)

A.map(c).add(d[,e])

else

map(c) = new B

compute f and g

in B

C.add(d)

if not e compute e

compute h

or is something more extensive being suggested? If not it seems to be mostly figuring out how to use double precision numbers as keys to a map.

doubles as keys in hash maps

Am I also right that when I am looking to see if a DataPoint is in my map, I need to do something like:

That looks pretty clean, until I then need to see if the value, which would contain a map and data has an entry on its map.

I have several layers of data. In the end, I need to compute, store and retreive one (f,g) pair for each (a,b). I have a lot of (a,b). I have been trying to make this work-

It all seems very convoluted and since I need to add all sorts of getters and setters and updaters to make it really do something, I was hoping to get some comment that I am going down the right road and not missing some simpler straight-forward solution to handling the data. This information is retreived and displayed in a table. I'm becoming concerned I'll be able to wrap a table model around all this, as well.