The key cannot be the recNo
It must be some object representing the client, which is not natural and couterperformant.
When some caller claims for a lock, it has to check in a loop if the given recNo is locked or not (waiting if true, returning if false) ... while the map against which this check must be performed offers no optimal way to do it (containsValue() vs containsKey()). As a matter of fact, the key (which is the fast-access data of the map, is even never queried). You'll agree
with me that it is, by design, a quite bad example of a use of the Map interface.
Actually, the key can be the record number, depending on your implantation. However, that's beside the point. I disagree with your assertion that this is 'not natural', and I further disagree that Data does not represent a client.
Each Data is uniquely assigned to a client, per your stated assumption, and thus each Data does, in fact, uniquely represent a client. This is similar to a pattern used by, say, cell phones. When you buy your cell phone, you receive an Object that is unique to you: the phone itself. The system in place, the phone company, doesn't track Phil: they track the object they used to uniquely identify you as a part of their system: namely, your phone, and they indirectly tie that object to you by the protocols of usage. That is, they assume that when a call is made, it's made by the client(Phil). They base their entire business model on it. This is exactly what is happening here. We are assigning a unique object to each client( namely, the Data object), and we are tracking that unique object.
As to your first point: would you mind spelling out for me what 'couterperformant' means to you? I don't want to start debating a strawman
Originally posted by Philippe Maquet:
Hi Max,
quote:
--------------------------------------------------------------------------------
Actually, the key can be the record number, depending on your implantation. However, that's beside the point. I disagree with your assertion that this is 'not natural', and I further disagree that Data does not represent a client.
--------------------------------------------------------------------------------
I think you misunderstood me (my English is quite poor). Of course Data may represent the client, as it is unique per client. What I feel "not natural" with the WeakHashMap solution is that you must store the Data instance as a key instead of a value. WeakHashMap stores its keys as WeakReferences, right ? So I don't understand how you could invert, with a WeakHashMap implementation, the key/value pair (I mean storing recNos as keys and Data instances as values).
quote:
--------------------------------------------------------------------------------
Each Data is uniquely assigned to a client, per your stated assumption, and thus each Data does, in fact, uniquely represent a client. This is similar to a pattern used by, say, cell phones. When you buy your cell phone, you receive an Object that is unique to you: the phone itself. The system in place, the phone company, doesn't track Phil: they track the object they used to uniquely identify you as a part of their system: namely, your phone, and they indirectly tie that object to you by the protocols of usage. That is, they assume that when a call is made, it's made by the client(Phil). They base their entire business model on it. This is exactly what is happening here. We are assigning a unique object to each client( namely, the Data object), and we are tracking that unique object.
--------------------------------------------------------------------------------
I agree at 100% (see above).
quote:
--------------------------------------------------------------------------------
As to your first point: would you mind spelling out for me what 'couterperformant' means to you? I don't want to start debating a strawman
--------------------------------------------------------------------------------
Sorry, but I don't understand your sentence I don't know what a strawman is (and I don't find it in my dictionary).
I see that I wrote 'couterperformant' instead of 'counterperformant', while the latter, even "corrected", is maybe not an english word : just a free translation of "contre-performant" in french.
What I meant is simply this : Map.containsValue() is less efficient than Map.containsKey().
"I'm not back." - Bill Harding, Twister
Originally posted by Jim Yingst:
[PM]: What I meant is simply this : Map.containsValue() is less efficient than Map.containsKey().
[MH]: Why would you say that?
Maps, Hash- or Tree-, are designed to look thinks up by their keys, and quickly. Implementing contains() is pretty similar to just doing a get() and seeing if it's null or not. For a HashMap, you just calculate the hashCode(), look in the appropriate bucket, and maybe iterate through a reasonably small number of entries that are in that same bucket. It's O(1). For a TreeMap it's O(log(n)) but still pretty quick. In contrast, for containsValue() you have no choice but to iterate through every single entry in the Map, and see if the value matches your target. Best-case scenario you get lucky and find your target early on, and you can abort the loop, but don't count on it. It's an O(n) operation.
"I'm not back." - Bill Harding, Twister
Originally posted by Philippe Maquet:
Many thanks to you, Jim and Eugene, for having entered the discussion to argue about why it's more efficient to access a HashMap by key than by value. "Not natural" in my mind (direct translation from french) meant "counterintuitive and less efficient".
Anyway, you are far more efficient than I can be to debate with Max. As we don't master the language (english) at the same level, sometimes I feel myself as he would feel himself if he had to kickbox against me with hands and feets bound...![]()
Now about that Map access discussion, I would argue that it is not a specification question. I use a HashMap (or a WeakHashMap) which are implementations. Any user of them gets their sources. Even if the doc is unclear (I don't pretend that), it's so easy to see (when you are curious as I am) what is the job performed by containskey() and containsValue().
I passed SCJP in may 2003 (correctly : 95%). One of the things you have to know to pass SCJP is how to recognize, among the few collections offered by the API, which is the best one to be used in a given context. It is one of the easiest parts of SCJP, because those classes are well named. It means that any junior java programer (SCJP level) knows that a HashMap is a Map (key-value pair), where keys are stored in buckets based on their hashCode(), a structure which favours acces by key over by value.
Well, anyway I initiated this thread just to handle a locking issue, especially to reply to Ulrich's question (I am not concerned by the given issue anyway), there is not any comment about the compared solutions above and ... Ulrich seems to have quit us. Some days, I'd better to go and sleep sooner![]()
Best,
Phil.
Originally posted by Jim Yingst:
[QB][Max]: I'm not aware of any specification based requirement that Maps be implemented in this manner, nor, AFIK, can you depend on them being so on different platforms.
Strictly speaking, that's true.
.snip...and common sense is that constainsKey() should be the same. Neither implementation says anything more about containsValue()'s performance (though TreeMap inherits the API comment from Map here, probably linear time). ...snip.. we can be pretty sure containsKey() will offer better performance for these two classes at least.
As for different platforms - well, HashMap and TreeMap are just using non-native code really, using standard Sun SDKs. The behavior might be different on other JDKs maybe, but not other platforms, IMO.
Also, the difference in speed here, given the size of the records, is immeasurable.
Originally posted by Philippe Maquet:
Hi Max,
Thanks for your reply.
I suppose that while writing "given the size of the records" you were thinking of the number of locks granted at a given same time. Even if the database has 100000 records, you probably won't have more than an average 100 records locked along them. So yes, the difference in speed will be immeasurable.
I would just comme back on the context of my first post here :
The first time I saw a locking solution based on a WeakHashMap mapping Data to records,
I asked : Why do you map Data to records instead of records to Data ? It seemed conterintuitive to me and moreover forced to limit the number of locks per Data to only one at a time. My question was stupid because Data (which represents the owner of the lock) must be the key in the WeakHashMap. I pointed out the lack of notification which made its solution buggy.
Now, from his answer, I got the feeling that his design had been driven by its code and not the contrary :
At first he had mapped records to Data (more intuitive for him too), and then he inverted the mapping because the WeakHashMap container forced him to do so. With the consequence that he couln't grant more that one lock per client at the time... Then he concluded that it was acceptable because in his current application he doesn't need more than one lock per client. So coding constraints made his final design.
I would like to add just one argument in favour of records mapped to Data as WeakReferences in a regular HashMap, against Data mapped to records in a WeakHashMap : avoiding the WeakHashMap overhead. As the latter stores its keys as WeakReferences, it needs an internal Entry class extending WeakReference, mainly to override equals() and hashCode(). While when Data are stored in the map as values, the "basic" WeakReference is enough, the default Object equals() (which is faster) being perfect for our purpose.
Thank you for all your comments, Max, there were very interesting to me.
All best,
Phil.
[ September 12, 2003: Message edited by: Philippe Maquet ]
Actually, the internal Entry class in the WeakHashMap class extends Map.Entry, not WeakReference.
"I'm not back." - Bill Harding, Twister
BTW, Philippe - I had the exact same reactions as Max reagarding your use of language. I wish I spoke French as well as you speak English, and I think you worry too much about your alleged deficiencies here. E.g. I had no problem figuring out what "counterperformant" meant. You're doing great; stop apologizing.
"I'm not back." - Bill Harding, Twister
Originally posted by Philippe Maquet:
Sorry Jim, but rereading me, I didn't understand my own english !So I couldn't explain it, even in french ...
It was quite vague indeed !
![]()
"I'm not back." - Bill Harding, Twister
Originally posted by Jim Yingst:
and we can't make any definitive statements about the performance benefits of checking to see if a HashMap contains a value versus it containing a key.
Can, did. Referring to both minimum guaranteed performance, and to de facto actual performace, but not to what might happen in alternate universes. If you explicitly say containsKey() rather than "containing a key" then I'll agree with you (thanks to another careless oversight in API writing), but as I can use get() to test the same thing with guaranteed O91) performance, I'm sticking with my answer; don't see a refutation. The issue is ultimately not that important though, so agreeing to disagree here is fine too.
"I'm not back." - Bill Harding, Twister
Originally posted by Jim Yingst:
Ummm, I can't tell what either of your code samples are supposed to do, Max. Doesn't contains() return a boolean? Doesn't equals() expect an Object?I can't determine how relevant get() is without a better example.
AFIK, the discussion was never about TreeMaps, map.get, or alternative universes
Well for our universe, for any existing HashMap implementation (or any Sun will ever release, I'll bet), containsKey() is more efficient than containsValue(). It's a pity they neglected to document that fact in the API for HashMap.
But to assert that we shouldn't state that HashMap containsKey() is more efficient than containsValue() - well, it is. Documented or not. You're apparently ignoring the very purpose of Maps - to look thing up by key. Looking up by value is an afterthought.
OK, gotta take a deep breath now. Hey Max, I don't suppose you wanna chime in on this thread? You know, if you get bored or something.Cheers...
[ September 15, 2003: Message edited by: Jim Yingst ]
"I'm not back." - Bill Harding, Twister
Originally posted by Jim Yingst:
OK, I retract my last couple statements about the purpose of Maps, and containsValue() being an afterthought. It's generally true, but yeah, there can be exceptions at the Map level.
I still stand by my assertion that for any HashMap implementation in existence, and any Sun is likely to release ever, containsKey() is (on average, for any significant value of N) faster than containsValue(). Note that I do not include classes that extend from HashMap in this statement - just HashMap's implementation. (I've been talking about HashMap's get() vs. MashMap's containsValue(), after all.)
For example, it would be easy enough to keep a reverse map in values to keys(with a few tricks: not too many), that would reduce the time of containsValue to be comparable to containsKey.
Agreed, they can be made equal. I've made a DoubleMap myself, and later seen similar stuff in one or more open-source libraries. So that's an exception for Map, but not HashMap.
Are you willing to bet your certification fee that it isn't? For that matter, are you willing to bet the certification fee of the people you're advising?
For Map: no, I misspoke. For HashMap (from the first half of the paragraph you said is "just nonsense": yes, if there were something to be gained if I win (e.g. if you'd but against me), and if we could agree on a suitable means of resolving the issue.
I suppose my assertion about future releases from Sun is unresolvable unless we were to limit the scope - say, up through JDK 1.5 when it's released?
Obviously, I can't bet the money of others, but I am indeed willing to encourage others to bet their money against you, and suffer their ire if I'm wrong.![]()
I suspect that if we pursue this we'd have a hard time agreeing on an exact formulation of what we're betting on. My basic premise is this: if an engineer says that Map containsKey() is more efficient than containsValue() (in casual conversation, and note the lack of the word "always"), that's OK, they're basically right.
And if another engineer immediately qualifies
the statement to say that HashMap's get() is more efficient than containsValue(), that's even more right, and making such an assertion is hardly something to be avoided; it should be seen as self evident.![]()
But I'm not sure how to construct a test of who's "right" for something like this.
"I'm not back." - Bill Harding, Twister
"I'm not back." - Bill Harding, Twister
Originally posted by Jim Yingst:
[QB][Jim]: It's generally true, but yeah, there can be exceptions at the Map level.
[Max]: Ok, we're getting there
This isn't something new,
[Max]: So, classes that extend HashMap are not HashMaps? Holy Convenient-if-Un-Object-Oriented-Qualifications Batman!
See Max, we really can't put all relevant qualifiers on every single sentence uttered. If I've clearly been talking about the HashMap class itself, let's try not to forget context when I omit a qualifier in one whole sentence, then reestablish it the next.
Actually, Hashmaps might very well implement this in the future, and be 100% spec compliment, so this is an unfounded statement.
The HashMap class itself? Are you talking about something actually planned, or a theoretical possibility?
It would surprise me greatly if Sun put this in HashMap itself, as it would roughly double the memory allocation for HashMap, for something that isn't generally required.
[b][Jim]: Are you willing to bet your certification fee that it isn't? For that matter, are you willing to bet the certification fee of the people you're advising?
[Jim]: For Map: no, I misspoke. For HashMap (from the first half of the paragraph you said is "just nonsense": yes, if there were something to be gained if I win (e.g. if you'd but against me), and if we could agree on a suitable means of resolving the issue.
I wish to make the observation that its containsKey() method is generally more efficient than its containsValue() method. Is there some way to say this simply without having to spend thousands of words justifying it later? Or should good engineers just never mention the possibility that containsKey() is faster than containsVlaue(), as it's just too complex and dangerous?
Originally posted by Jim Yingst:
[Jim]: Obviously, I can't bet the money of others, but I am indeed willing to encourage others to bet their money against you, and suffer their ire if I'm wrong.
[Max]: Well, ok, but you can see where I might not be comfortable with that?
OK. But these are adults here after all.![]()
[Jim]: I suspect that if we pursue this we'd have a hard time agreeing on an exact formulation of what we're betting on. My basic premise is this: if an engineer says that Map containsKey() is more efficient than containsValue() (in casual conversation, and note the lack of the word "always"), that's OK, they're basically right.
[Max]: in casual conversation, yes, I agree. They might assert anything, foolish or not, an casual conversation.
OK, let's talk about in casual conversation in which one does not wish to be thought a fool. Did Philippe's statement make you think he was a fool? What about my statements?
[Jim]: And if another engineer immediately qualifies
[Max]: I would say corrects
True. I explicitly replaced the original statement with an alternate formulation.
[Jim]: the statement to say that HashMap's get() is more efficient than containsValue(), that's even more right, and making such an assertion is hardly something to be avoided; it should be seen as self evident.
[Max]: Ah, but there's the rub. It's not more right in general, it's only more right in specific cases.
Are you arguing with Philippe's statment about Map, or mine about HashMap?
[Max]: I could(or you could) code up an implementation in about 10 minutes that has a containsValue that's just as fast as a get. To me, this just isn't a responsible statement.
But it wouldn't be HashMap's get(), or HashMap's containsValue(). Which is how I referred to those methods. I suppose you could make your own JDK by taking the reference implementation, modifying it, and releasing it as a new JDK. But you haven't alrady done that, have you? A future release from you doesn't count.![]()
[Max]: Well, I think you would be right, if this weren't an implementation issue. However, since that's all it is, then I'm right
Well, it seems we have a failure to agree on what we were talking about in the first place. I thought my initial comments directing our attention to HashMap (and TreeMap) were pretty clear; guess not. Oh well. I guess my interest in most of this discussion has diminished, except for the last paragraph in my previous post. If, in the future, someone wants to make a really simple assertion assertion along the lines of "get() is more efficient than containsValue()" - is there some level of qualifiers that they can insert that would allow them to make this simple statement without it being a big issue?
[ September 16, 2003: Message edited by: Jim Yingst ][/qb]
How about "In general, I believe that Hashmap.get() is faster then hashmap.containsValue(). I can't prove this, and would never assert it as anything more then my belief..".
"I'm not back." - Bill Harding, Twister
"I'm not back." - Bill Harding, Twister
Originally posted by Jim Yingst:
[Max]: I'm saying HashMap could change in JDK 1.4.3.
Are you saying you have reason to believe this is likely, or just that it's possible? If it's the latter I believe I'm covered by the "I bet" clause. If the former, I'd be very interested in hearing details.
Sure: just don't make me worry about you by implying that you'd actually put it in your choices document
Well, not unless it seemed more important to my design than it actually is here. But take FileChannel vs. RandomAccessFile. (No, I'm not trying to bring up our points of contention regarding atomiciy.) We agree that, speaking generally, FileChannels are pretty fast, and likely to be a lot faster than RandomAccessFile, right? I know that this isn't the only reason to choose them, and speed is far from paramount. But efficiency isn't a complete non-issue, as long as you don't let it override simplicity and maintainability. So, in my design doc, can I at least mention that one reason to favor FileChannel is that is (generally) really fast? As far as I know there aren't any concrete guarantees about this; the API avoids that sort of thing. I won't say "really fast" of course. Perhaps something like "FileChannel and related NIO classes often achieve signficant performance benefits compared to RandomAccessFile or streams." I mean, I could add "on most commonly-available implementations..." etc, but isn't that kind of overkill?
I have a book here where the authors say "FileChannels' interaction with the file system is much more controllable and efficient than the previous I/O implementations." It doesn't say "usually". It doesn't say "except maybe for some rare exceptions from obscure custom-made JDKs that no one will ever use anyway". I mean, I could release a JDK in which any FileChannel read() takes at least full second longer than RandomAccessFile's read() method does. I don't belive this would actually contradict the API in any way, but perhaps I've overlooked something. Barring that possibility, I guess the statement isn't strictly true. Shall I notify the authors that their book is dangerously misleading on this point? Maybe I can get them to add: "...but don't ever think of mentioning this in your design doc."![]()
Space pants. Tiny ad:
Smokeless wood heat with a rocket mass heater
https://woodheat.net
|