
Stevens Miller

Bartender
since Jul 26, 2012
Stevens likes ...
C++ Java Netbeans IDE Windows
Forum Moderator
I'm a military brat (Navy) who has not yet figured out what he wants to be when he grows up. I have programmed computers for over 40 years. I have a law degree. I've been elected to public office. I love science fiction. I love to cook. I love my wife and my son (more than even science fiction and cooking). I run the lights for community theater productions. I was born in Japan (that Navy brat thing again). I am a capitalist who believes in socialized health care, because I think retail health care makes as much sense as retail national defense, and I certainly believe in a strong national defense. Also, if it weren't for a certain amount of socialism, every street in America would be a toll road. But equal opportunity ought not to mean guaranteed results. Anyway, that's how I see it. You?
Northern Virginia, USA
Cows and Likes
Cows
Total received
31
In last 30 days
1
Total given
22
Likes
Total received
245
Received in last 30 days
0
Total given
173
Given in last 30 days
1

Recent posts by Stevens Miller

Further to the above (and more on point), I found a good article online that both answers your question and does a nice job of explaining why the only real debate should be over signed versus unsigned bytes.

The author says that these are the operations that behave differently depending upon whether they are applied to signed or unsigned data:

/, %, >>, <, <=, >, >=, up-casting, parseInt(), toString()

He recites some of the common work-arounds when you need to work with unsigned data (all of which add overhead), then goes on to argue that (mostly) the case against adding them to Java now can be made by noting the large number of API calls that would need to be overloaded to accept them. However, he also has written another good article arguing that signed bytes in Java are a mistake.
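To make the work-arounds concrete, here is a minimal Java sketch of the usual ones: masking a byte up to an int, and the unsigned helper methods that the standard library has offered since Java 8. The class name and the sample values are my own, chosen only for illustration; they are not taken from the article.

```java
public class UnsignedWorkarounds {
    // Widen a signed byte to its unsigned value in [0, 255] with a bit mask.
    static int maskToUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xC8;                       // bit pattern 1100 1000

        System.out.println(raw);                      // signed view: -56
        System.out.println(maskToUnsigned(raw));      // unsigned view: 200
        System.out.println(Byte.toUnsignedInt(raw));  // same result via the library

        int big = 0xFFFFFFF0;                         // -16 signed, 4294967280 unsigned
        System.out.println(big / 16);                               // signed division: -1
        System.out.println(Integer.divideUnsigned(big, 16));        // 268435455
        System.out.println(big >> 4);                               // arithmetic shift: -1
        System.out.println(big >>> 4);                              // logical shift: 268435455
        System.out.println(Integer.compareUnsigned(big, 1) > 0);    // true
        System.out.println(Integer.toUnsignedString(big));          // 4294967280
    }
}
```

Each of these costs an extra operation or method call compared with a native unsigned type, which is the overhead the article is talking about.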

Taken together, I think these two essays really capture my view and they also made me realize that it is only the absence of unsigned bytes (and not of unsigned integer types of other widths) that ever annoys me. Since I would challenge others to show me a case where a signed byte was the better choice than, say, a signed short (or just an int), I would also conclude that the author is correct: Java bytes should have been unsigned in the first place, and bytes are really the only unsigned types anyone needs (I note, of course, that Java's char type is unsigned, but I doubt it is much regarded as actually numeric in the first place).
4 weeks ago

Stephan van Hulst wrote:You have some very good points here, I must admit.


Thanks (and for the Bos Taurus).

So, relational operators depend on whether the data types of the operands are signed. Any other operators you can think of?



I have never looked into it, as the confusion I faced when the problems we've just reviewed first cropped up kind of put me off the whole issue. When I was a Java newcomer (c. 2009), I ignorantly assumed it would have the same native types C has. I found out I was mistaken when I wrote a few lines that used relationals. I did some experiments with suggestions such as you made above, and a few others that one can easily find with Google. Interestingly, most of the ones I found required CPU cycles that unsigned bytes don't need. What I mean is that a lot of suggestions involving bit-masks and assignment to wider types are out there, and they all work. But they also all take CPU cycles. When you're doing real-time work, that's not very appealing. A number of folks also point out that the JVM does its arithmetic on integers anyway, so you're already using wider types "under the hood," but that's still faster than if you have to manipulate the data at the Java level.

Please correct me if I'm wrong on any of the following, but here's what I also think I discovered in those investigations: First, adding unsigned bytes to Java is really a non-starter, as they don't even exist in the JVM. Either the language would have to compile to code that widened (or otherwise processed with additional overhead) signed bytes, or else the JVM itself would have to be changed. I'm no expert on the JVM, but my impression is that adding unsigned bytes to it is just not going to happen. Second, image processing usually involves operating on enormous (well, they seem too big to me) arrays. Again, our friend the JVM comes into play, as it checks boundaries for each such access. Crusty old C programmers like me are used to having to check our code to be sure it won't try to access data outside of a structure (and almost equally used to finding out we're not as good at that as we wish we were). The point here being that, if one is going to access millions of bytes in an array, and one wants to do it in a really short time, maybe Java isn't the ideal choice. (I am reluctant to say that, as it tends to provoke the faithful, but there it is.)

Now, to get around all this, I learned how to use the JNI. That lets me "drop back down" to C when I need it. Except... to operate on a buffer full of bytes, you need to "pin" it into place (so the JVM and the GC don't start playing hide-the-ball with your data). As best I can divine from the specs, when you pin a buffer, you can ask for it to be given to you directly, but that's only a request, not a command. The JVM is free to make you a copy of your data instead. When you unpin the buffer, the JVM copies your data back to the original space. My experiments with Windows and the JREs I've used tell me that it gives you a copy every time, never actually honoring the request for actual access. Admittedly, it does the copying in about a millisecond, which is usually fast enough. (At old-school video rates, you have 33ms per frame; newer stuff cuts that in half, or even smaller.) Still, when you are doing real-time work, losing a millisecond just to copy data you would be happier working on directly is overhead you'd rather not have. My "solution" was ultimately never to have the data under control of the JVM at all. My work is all specific to Windows, so I can go ahead and make platform-specific API calls. What I did was allocate my memory in native code, and just keep it there. Now, if you are thinking that creating and manipulating video frames in native code, while using Java to control your application, is a complicated mess, I will agree with you. In particular, displaying my native video in a Java window was, well... challenging. It also results in the kind of code that Java champions despise, relying on heavyweight GUI objects and other hairballs that are all likely destined for elimination very soon.

What happened next caught me off-guard: I started doing Unity development, which requires one to learn C#. Well, I really thought I was too old to be learning yet another programming language, but C# deals with a few issues differently than Java deals with them, and nearly all of those differences are to my liking. That is, I found I preferred C# (though I share your fondness for Java's amazing libraries). Mind you (and all you ships at sea, monitoring this scintillating exchange), I am not saying C# is a better language than Java. There's no future (and no end) to that discussion. But I find that C# and I get along better than Java and I do, for the kind of work that I am doing. One reason is that C# has unsigned integer native types, and I like that. (I could list a few more reasons, but I suspect that would only create pointless turmoil.)

Back to your question, though: Would it be accurate to say that addition itself is dependent on the signed/unsigned nature of the operands? As we've discussed already, you can add your way "up" to a negative number when using signed integers, while you can add the same data and get a positive result if your operands are all unsigned. Is that a feature of the + operator, the = operator, or of the JVM? Like I said, I've never considered it before, as I just found myself happier with the JNI and C for my image-processing needs under Java (and now, under C#, where the "JNI" is somewhat simpler).
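For what it's worth, a small experiment suggests an answer. In two's complement, addition produces the same bit pattern whether you regard the operands as signed or unsigned; the difference only shows up when the result is interpreted (printed, compared, or widened). Here is a minimal Java sketch of that, with a class name and values I picked purely for illustration:

```java
public class AdditionBits {
    // Java's + on int wraps silently on overflow; the bits are the same
    // ones an unsigned 32-bit addition would produce.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        int sum = add(Integer.MAX_VALUE, 1);

        System.out.println(Integer.toHexString(sum));       // 80000000
        System.out.println(sum);                            // signed reading: -2147483648
        System.out.println(Integer.toUnsignedString(sum));  // unsigned reading: 2147483648
    }
}
```

So, at least on this evidence, "adding your way up to a negative number" looks like a property of how the result is read back rather than of the + operator itself.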
4 weeks ago

Stephan van Hulst wrote:Because the onus is on those who want a feature added to the language, not those who want to preserve the language as it is.


Considering how much Java borrowed from previous languages, I would say I am the one arguing for preservation of a feature, and that Gosling took it out. But the real issue isn't whether I want something added. It's whether there's a use case for unsigned bytes or not.

Your argument basically comes down to "I want the unsigned keyword because I'm not interested in solutions that don't use the unsigned keyword".


Actually, that is the result of my argument, not my argument itself. My argument is what I said it was: images are encoded in bytes over the range [0, 255]. So that is the numerical space I am operating in. If I have a data type that maps to that space, I will use it. If I don't have it, I will want it (and will have to work around not having it).

The source of your problem is that you're using magic values in code. If you declare constants, you don't have this problem:


I disagree that your solution derives from replacing magic values with constants. Rather, it derives from down-casting a positive integer into its negative byte form. Actually changing a numeric value through a cast is, to me, a code smell (at a minimum). If I ever have to look at those values numerically, I'm going to be badly confused by what I see. As a simple human being, I think 200 == 200, and that 200 > 104. It's not going to come easily to me that (byte)200 == -56, and even less easily to me that, in this context, -56 > 104.

Now, your version succeeds for the values I gave, but only if I am willing to accept the proposition that 200 == -56, which it only does as a result of two's complement arithmetic, not because any positive number ever actually equals any negative number. For me to think this obviates any justification for unsigned bytes, I will need to make some kind of contract with the language that pixels with values of, say -47, are of brighter intensity than pixels of value 104. That's asking rather a lot of me.

Also, don't things go rather differently if you use values other than my two original examples? Your version works, but only because the specific example I gave overflows and the sum of the two positive values becomes a negative value greater than -56. What if it doesn't overflow?

If bytes were unsigned, that would work. With signed bytes, it doesn't.
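To illustrate the non-overflowing case, here is a rough Java sketch. It is my own reconstruction of the idea under discussion (a limit of 200 down-cast into a signed byte constant and compared directly), not the code from the quoted post; MAX_INTENSITY, clampSigned, and the sample values 104 and 50 are all assumptions made for the example.

```java
public class SignedClampPitfall {
    // Hypothetical constant: 200 stored in a signed byte becomes -56.
    static final byte MAX_INTENSITY = (byte) 200;

    // Clamp by comparing signed bytes directly (assumed form of the approach).
    static byte clampSigned(byte a, byte b) {
        byte sum = (byte) (a + b);
        if (sum > MAX_INTENSITY) {
            sum = MAX_INTENSITY;
        }
        return sum;
    }

    public static void main(String[] args) {
        // 104 + 104 = 208 wraps to -48, and -48 > -56, so it clamps to 200: correct.
        System.out.println(Byte.toUnsignedInt(clampSigned((byte) 104, (byte) 104))); // 200

        // 50 + 50 = 100 does not wrap, but 100 > -56 as well,
        // so a perfectly legal 100 is also "clamped" to 200: wrong.
        System.out.println(Byte.toUnsignedInt(clampSigned((byte) 50, (byte) 50)));   // 200
    }
}
```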
4 weeks ago

Stephan van Hulst wrote:I am one of those folks who believes the lack of an unsigned keyword in Java was the right choice. To date, I have yet to find a good argument in its favor. People usually mention image processing, but they never give a concrete example of an operation that's easier with unsigned data types than with their signed counterparts.


Well, not to rehash this to the point of absurdity, but my recollection of how I came to know Java had no unsigned bytes was when I wrote some code to clamp the intensity of pixel values. I think it was something like this:


As C#, that code compiles, runs, and leaves 200 (0xC8) in sum, which is what I want. As Java, you can still compile that code, but you get a helpful warning that it may not work as you expect (because Java notices you've assigned a value out of the range of sum with sum = 200). If you run it, sum ends up with -48 (0xD0, which is 208 when regarded as unsigned) in it, which is not what I want. (And note that the line that generated the warning isn't involved, as it never executes.)

There are, indeed, a lot of ways to rewrite this so it works. Of course, I would not be interested in any that add to its run time, but there are probably others that don't. However, all of them have one feature I don't like: they are not the code above, which is the natural way that (if I may presume to speak for all of us) a person writing image-processing code would do it.

Now, arguments abound as to why this is not a good way to write it, chief among them being that it presumes red1 + red2 will sum to a value below 256, even when treated as unsigned values, and that may not be true. But, again, that's the code I wrote and, if it meets my needs, I am not much moved by arguments that it might not meet some other need that I don't have.
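For anyone following along in Java, here is one of the rewrites that works, sketched under assumptions: the operand values of 104 are my guess (they are consistent with the 0xD0 result mentioned earlier), and the names MAX_INTENSITY and addAndClamp are hypothetical. It widens to int before comparing, so it is exactly the kind of version that costs a little extra work.

```java
public class PixelClamp {
    // Hypothetical ceiling, taken from the example value of 200.
    static final int MAX_INTENSITY = 200;

    // Add two channel values stored in signed bytes, treating them as
    // unsigned, and clamp the result to MAX_INTENSITY.
    static byte addAndClamp(byte red1, byte red2) {
        int sum = (red1 & 0xFF) + (red2 & 0xFF); // widen to unsigned ints first
        if (sum > MAX_INTENSITY) {
            sum = MAX_INTENSITY;
        }
        return (byte) sum; // bit pattern 0xC8 when clamped
    }

    public static void main(String[] args) {
        byte naive = (byte) (104 + 104);          // what a signed byte ends up holding
        System.out.println(naive);                // -48 (0xD0)

        byte clamped = addAndClamp((byte) 104, (byte) 104);
        System.out.println(clamped & 0xFF);       // 200 (0xC8)
    }
}
```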

Heh, you mention that people who do image processing have my complaint, but we don't offer examples. To keep this symmetrical, let me say that I know lots of Java programmers who agree with the decision to leave unsigned integers out of the language, but I note that none of them do image processing. 
4 weeks ago

Stephan van Hulst wrote:I've really disliked C#'s naming conventions from the very start. There's no real advantage in looking at an identifier and knowing it refers to a method, field or property through the casing alone. A member is a member is a member.


Agreed. My guess is that C#'s conventions are attributable to it being a child of Microsoft. Their various APIs have not followed a universal convention, but the C# rules and Win32 are pretty close. At least we are past all that m_<name> baloney (and don't even get me started on the misbegotten genesis of "Hungarian" notation). When I started learning C#, its similarities to Java and the fact that Java's conventions work pretty well (in my opinion) had me just continuing with what I used for Java. Little did I know that, of course, Things Are Different Here.

It would be more sensible to use a difference in casing to indicate that a member is static, for instance. A huge disadvantage of the current conventions is that I can't see if an identifier refers to a type, a property or a constant.


IT_IS_ALWAYS_SOMETHING

A really annoying situation is when I have a nested type, and in the enclosing type I want to add a property of the nested type. In many cases you would want the names to be the same, but you can't declare two things with the same name.


I believe it is a law of nature that every programming language must include at least one bothersome and unnecessary restriction for no better reason than the language designer found the lack of that restriction bothersome in some other language they once used. Although I know some people believe they can defend it, the fact that the lack of unsigned native integral types in Java can be attributed to James Gosling's claim that most of us don't understand them will always feel like "the gentleman doth protest too much." Maybe James Gosling doesn't understand them, but I do and so do countless other programmers who do image processing work. That's one thing C# gets right. But every language seems to have these types of quirks.

Stevens Miller wrote:This prevents what I would have thought was an advantage of properties, which would be that if you needed to replace a field with a property (so that you could do some processing in that property's getter/setter code, whenever it was accessed), client code wouldn't have to change.


Yup, this is pretty stupid. Since this is the case, I hope that you still never give fields higher visibility than private.


Oh, I do that all the time, in both languages. But that's just because I'm lazy. I always clean up after myself later, once it's all working.

Depending on my mood, I don't use fields at all. Sometimes, just so all my data are grouped together and also conform to the style rules, I will forego fields completely and just use auto-properties.


That's a pretty good practice. I may take that up myself.

What kind of work are you doing in C#? I only got invested in it because I started doing Unity development.
1 month ago
To be fair, if you think of the temperature and the time the temperature was submitted as a composite value, then that value has changed whenever a new reading comes in (assuming no two readings are submitted at the same time).

There might be other cases where you would want to react to the fact that data had come in, as kind of an independent source of information from what the data was. For example: you have a microcontroller (an Arduino, or Raspberry Pi, maybe) that periodically sets a flag indicating that it is operating properly. Your client code might restart a timer whenever this happens. If the timer ever expires, your client code would reboot the microcontroller. However, if the microcontroller makes its own determination that it is malfunctioning, it might clear that flag, giving the client code the option to reboot the microcontroller immediately instead of waiting for the timer to expire.
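A bare-bones Java sketch of that watchdog idea might look like the following. Everything here (the Watchdog class, its method names, and passing the clock in as a parameter so the example stays deterministic) is hypothetical and only meant to show the shape of the logic.

```java
public class Watchdog {
    private final long timeoutMillis;
    private long lastHeartbeatMillis;
    private boolean healthy = true;

    public Watchdog(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastHeartbeatMillis = nowMillis;
    }

    // Called each time the device sets its "operating properly" flag,
    // even though the flag's value has not changed.
    public void heartbeat(long nowMillis) {
        healthy = true;
        lastHeartbeatMillis = nowMillis;
    }

    // Called when the device clears the flag to report its own malfunction.
    public void reportFault() {
        healthy = false;
    }

    // Reboot at once on a reported fault, otherwise only after the timer expires.
    public boolean shouldReboot(long nowMillis) {
        return !healthy || nowMillis - lastHeartbeatMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        Watchdog dog = new Watchdog(1000, 0);
        System.out.println(dog.shouldReboot(500));   // false
        System.out.println(dog.shouldReboot(1500));  // true, timer expired
        dog.heartbeat(1500);
        dog.reportFault();
        System.out.println(dog.shouldReboot(1600));  // true, fault reported
    }
}
```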

Or, maybe the value written is being accumulated elsewhere. Just picking something entirely at random, maybe you are keeping track of how much time a player has left. If the player buys more time, they might do so in a series of increments of ten minutes each. You'd want to add that ten minutes to whatever total time remained at that point. Most likely, that would call for a method, not a property, but that takes us back to the fact that properties are largely sugar.

One of the funny things about moving from Java to C# that I have noticed is that, although a property and a public field can be hard to tell apart (nigh impossible if they have none of the side-effects we're discussing here), the usual practice is to format field names in camelCase, while formatting property names in PascalCase. This prevents what I would have thought was an advantage of properties, which would be that if you needed to replace a field with a property (so that you could do some processing in that property's getter/setter code, whenever it was accessed), client code wouldn't have to change. Indeed, if you didn't break anything, client code wouldn't know the difference. But if you have public int count; and you want to make count a property, "The Elements of C# Style" says you have to rename it to Count. (The same book says, at Rule 37, that properties should be named after the items they get or set, and gives ExpirationDate as an example property name that returns a Date value stored in expirationDate, but says earlier, at Rule 20, not to rely on case alone to differentiate names, so maybe it's not quite The Word of God.)
1 month ago
I don't like decoupling the recordation of the time of last update with the recordation of the temperature. Shoving it down into the temperature setter makes it impossible for client code to forget to do it, and simplifies the listener setup (since you only have one list of listeners listening to one kind of event).

I couldn't name one right now (having not used it for a bit), but I believe Swing makes use of both approaches. That is, some controls only fire events if their values change, others fire every time they are assigned, whether they change or not. Appears to have been a judgment call on the designers' parts as to when client code would want it and when it wouldn't. I prefer always getting the event, myself, as it's easy to keep a cached value and choke off any processing that should be choked off if the value doesn't change. If you want that centralized, you can just add a wrapper to the notifier that does the choking for you.
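Here is roughly what I mean, as a Java sketch. The Thermometer class, the onlyOnChange wrapper, and the use of DoubleConsumer as the listener type are all just illustrative choices; this is not how Swing does it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleConsumer;

public class Thermometer {
    private final List<DoubleConsumer> listeners = new ArrayList<>();
    private double temperature;
    private long lastUpdateMillis;

    public void addListener(DoubleConsumer listener) {
        listeners.add(listener);
    }

    // Records the reading and its time together, then notifies on every
    // call, whether or not the value changed.
    public void setTemperature(double value, long nowMillis) {
        temperature = value;
        lastUpdateMillis = nowMillis;
        for (DoubleConsumer listener : listeners) {
            listener.accept(value);
        }
    }

    public double getTemperature() {
        return temperature;
    }

    public long getLastUpdateMillis() {
        return lastUpdateMillis;
    }

    // Wrapper that caches the last value and forwards only actual changes.
    public static DoubleConsumer onlyOnChange(DoubleConsumer delegate) {
        return new DoubleConsumer() {
            private boolean seen;
            private double last;

            @Override
            public void accept(double value) {
                if (!seen || value != last) {
                    seen = true;
                    last = value;
                    delegate.accept(value);
                }
            }
        };
    }

    public static void main(String[] args) {
        Thermometer t = new Thermometer();
        t.addListener(v -> System.out.println("every reading: " + v));
        t.addListener(onlyOnChange(v -> System.out.println("changed to: " + v)));
        t.setTemperature(20.0, 1);
        t.setTemperature(20.0, 2); // only the first listener prints this time
        t.setTemperature(21.5, 3);
    }
}
```

A graph of temperature-over-time would register directly, while anything that only cares about changes would register through the wrapper.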
1 month ago

Stephan van Hulst wrote:I don't really understand, what additional considerations does that use-case introduce?


Well, if two consecutive temperatures are the same, no second event will fire. Listeners might include graphic displays of temperature-over-time.
1 month ago

Stephan van Hulst wrote:For instance, if my setter kept track of how often a property was changed, or if it notifies listeners, it must do nothing if the actual backing data store was not changed:


How would you approach it if the setter were, say, called hourly by client code reporting a temperature reading?
1 month ago

Stephan van Hulst wrote:I once found the following statement in our code-base that an intern had written:
I removed the line and our product stopped working. I put it back in and then yelled at the intern (not really, but I was quite upset).



If I had been the one telling that story, it would probably have ended like this:

I put it back in and then yelled at the intern (until I remembered that I was the one who wrote the property code).


In that example, it need not have been the getter that changed state. The setter might, for example, have been logging changes, requiring at least two sets before logging its first change. Reading an empty log might have been problematic, so your intern might have been thinking that a first "change" of zero would at least let the log be readable in all cases.

That said, I think your example is a powerful case against properties that change state in a way that is visible to client code. "Our product stopped working" certainly meets that rubric.

Microsoft has some guidelines on this that I think are helpful.

1 month ago

Fred Kleinschmidt wrote:Wouldn't something like this be simpler?


Compared to Stephan's breath-taking, if sometimes intimidating, mastery of generics, almost anything is simpler. Your approach is fine, but I wanted the shortest possible cut-and-pastable line that could be inserted ahead of code that needs to be entirely avoided if the lockout flag is set. Your method requires three lines (well, two if one uses K&R formatting, as you have; I'm an Allman formatter, myself). It also requires that one indent all the lines conditioned on the lock (which is not a big deal, since most IDE editors would do that for you).

I know I said I was deliberately leaving thread-safety issues out of this, but putting the test-and-set into its own method makes it easy to interlock access to the flag, which one would probably want to do for production code.

Truth is, I got interested in this issue because I am actually using C#, which has getters and setters hidden by syntactic sugar, and my original code looked like yours. It occurred to me that, if I replaced the lockInEffect member variable with a property, I could use the hidden getter to return the prior value of the flag, while always setting it true before returning. That seemed like a neat way to do it, at first blush, since there was no way to forget to set the flag and no need to make a method for it. I did some Googling and found that getters that change state are mostly frowned upon (though, again, the fact that a getter is a layer of indirection designed to allow some processing--rather than direct access to a member--renders that criticism inconclusive). Atomic test-and-set is so common that it appears in the instruction sets of many CPUs (x86 acquired it with the '386; the VAX 11 had it in the late '70s).

As often happens to me, I started with one question (should getters avoid altering the values they get?) and moved on to another one (what are some options for test-and-set mechanisms?). Nabokov said he never learned anything from the characters in his stories. They were "galley slaves" who only did what he told them. Other writers say they discover new aspects of a character while incorporating them into narratives. As a programmer, I am rather more the latter than I am Nabokov.
1 month ago
I really like your approach, by the way. The question of a getter changing state is obviated by using a test-and-set method that operates on a flag.
1 month ago

Stephan van Hulst wrote:I didn't realize that was a requirement, as you wrote the following:

Stevens Miller wrote:The lockOutInEffect flag is cleared by a different method, at some appropriate future time.



Ah, right you are! Got misled by the specifics of my own problem, which I did not describe.
1 month ago

Stephan van Hulst wrote:



Where do you clear the lock? In "do stuff?"
1 month ago
I have a number of methods that might be called during a period of time when they should just return. This condition (that is, that the return period is in effect) is indicated with a boolean flag. When any of these methods is called outside of any such period, they set the flag. Thus, I have a lot of methods that start with this code:


The lockOutInEffect flag is cleared by a different method, at some appropriate future time. (None of this is thread-safe; see my note, below.)

Now, I think that Lines 3 through 8 in the above method are kind of ugly to have all over the place, and are a pain to keep pasting everywhere. I was wondering if this would be an acceptable alternative:



This replaces the original six lines with one line (Line 18), "if (getLockOutInEffect()) return;." I can see some reasons to object to this version, and some to endorse it.
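In case the idea is unclear without the listing, here is a minimal, self-contained Java sketch of the state-changing getter. It is an illustration only, not the exact code from the post, so its line numbers will not match the ones cited; the workDone counter and clearLockOut method are hypothetical additions so the behavior can be observed.

```java
public class LockOutExample {
    private boolean lockOutInEffect;
    private int workDone; // counts how often the guarded body actually ran

    // Returns the previous value of the flag and always leaves it set.
    // Deliberately not thread-safe, as noted in the post.
    boolean getLockOutInEffect() {
        boolean previous = lockOutInEffect;
        lockOutInEffect = true;
        return previous;
    }

    // Stands in for the "different method" that clears the flag later.
    void clearLockOut() {
        lockOutInEffect = false;
    }

    void doStuff() {
        if (getLockOutInEffect()) return;
        workDone++; // the real work would go here
    }

    int getWorkDone() {
        return workDone;
    }

    public static void main(String[] args) {
        LockOutExample example = new LockOutExample();
        example.doStuff();      // runs and sets the flag
        example.doStuff();      // returns immediately
        example.clearLockOut();
        example.doStuff();      // runs again
        System.out.println(example.getWorkDone()); // 2
    }
}
```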

Objections:

1. It's a bad idea to omit curly brackets around conditional code.
2. Getters that change their underlying properties are debugging nightmares waiting to happen.

Endorsements:

1. One line is a lot prettier than six lines (and that line could be "if (getLockOutInEffect()) {return;}," for the sake of curly-bracket purists).
2. Getters are merely an abstraction anyway, so why restrict them against changes?

I think the big issue is really Objection (and Endorsement) 2. But this particular idiom exists in another form, and is widely used. It is the test-and-set operation.

Now, I suppose I could finesse the question by changing the name of the getter to something more descriptive, like "lockAndWasAlreadyLocked()," to make it more explicit that this isn't really a pure getter. (That's an awkward name, but it makes the point.)

What do folks here think? Is there an accepted idiom for the test-and-set operation? (Please note: I can see all kinds of thread-safety issues in the above. Those would be dealt with, of course, in any context where they mattered. For the sake of keeping this question simple, I've ignored that issue here.)
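One candidate idiom in Java, for contexts where thread safety does matter, is java.util.concurrent.atomic.AtomicBoolean, whose getAndSet method is a test-and-set in a single atomic step. The sketch below is only an illustration under that assumption; the class name and the workDone counter are hypothetical, and the method name is the awkward one proposed above.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class AtomicLockOut {
    private final AtomicBoolean lockOutInEffect = new AtomicBoolean(false);
    private int workDone; // for demonstration only; not itself thread-safe

    // Atomically sets the flag and reports whether it was already set.
    public boolean lockAndWasAlreadyLocked() {
        return lockOutInEffect.getAndSet(true);
    }

    public void clearLockOut() {
        lockOutInEffect.set(false);
    }

    public void doStuff() {
        if (lockAndWasAlreadyLocked()) return;
        workDone++; // the real work would go here
    }

    public int getWorkDone() {
        return workDone;
    }

    public static void main(String[] args) {
        AtomicLockOut example = new AtomicLockOut();
        example.doStuff();      // runs and sets the flag
        example.doStuff();      // locked out
        example.clearLockOut();
        example.doStuff();      // runs again
        System.out.println(example.getWorkDone()); // 2
    }
}
```

The guard is still a single line at the top of each method, and the name no longer pretends to be a pure getter.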
1 month ago