
Behavior of incomplete non-blocking reads

 
David Weitzman
Ranch Hand
Posts: 1365
Suppose you call Selector.select() and learn that there are new bytes available to read(). You call Channel.read(ByteBuffer) and notice that at least enough new bytes have arrived to fill your entire ByteBuffer. That means that there may be more bytes actually available. Should you immediately call read() again in a loop until it returns 0, or will select() still detect the unread bytes if you call it again? The second method seems like the only one that can be safely used, but the NIO documentation doesn't specify if the same ready-to-read bytes can be selected more than once.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
DW: The second method seems like the only one that can be safely used
Why is that? I'm not sure what the concern here is.
DW: but the NIO documentation doesn't specify if the same ready-to-read bytes can be selected more than once.
I think it does. That is, the selection process doesn't report anything about how many bytes were available at the time of selection, only that something was available. If a channel has any bytes available to read, the selector will report it. The number of available bytes can easily increase before you get around to reading them, and you have no way to tell how many were available when the selector performed the select() versus how many are available now. So you can't possibly limit yourself to only those bytes that the selector originally knew about. Just read whatever's available, and the next time select() is called, the selector will report every channel that has available bytes at that time, whether or not those bytes were previously available.
Sample code in Ron's book uses this approach. Check out SelectSockets.java.
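For readers who want to see the re-select behavior in isolation, here is a minimal sketch. It is not the SelectSockets.java code from the book: the class and method names are made up for illustration, and a java.nio.channels.Pipe stands in for a socket so the example needs no network. It takes one bufferful per select() pass and counts how many passes it takes to consume a small payload.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;

// Hypothetical demo class; a Pipe is used as a stand-in for a socket channel.
class OneBufferPerSelect {
    /** Reads totalBytes from a pipe, one read() per select() pass; returns the pass count. */
    static int passesNeeded(int totalBytes, int bufferSize) throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            // Assumes a small payload that fits in the pipe's internal buffer.
            sink.write(ByteBuffer.wrap(new byte[totalBytes]));
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);

            ByteBuffer buf = ByteBuffer.allocate(bufferSize);
            int received = 0;
            int passes = 0;
            while (received < totalBytes) {
                selector.select(1000); // timeout only so the demo cannot hang
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                if (!it.hasNext()) {
                    throw new IOException("timed out after " + received + " bytes");
                }
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove(); // otherwise the key lingers in the selected set
                    if (key.isReadable()) {
                        buf.clear();
                        int n = source.read(buf); // at most one bufferful, then re-select
                        if (n > 0) {
                            received += n;
                        }
                        passes++;
                    }
                }
            }
            return passes;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("select() passes: " + passesNeeded(10, 4));
    }
}
```

With 10 bytes queued and a 4-byte buffer, the loop needs at least three passes, and it only finishes because select() keeps reporting the channel while unread bytes remain.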
Note that there is a potential gotcha in the API along similar lines to what you seem to be concerned with. The int value returned by select() reports only the number of keys that were newly added or updated in the selectedKeys() set. If entries were put there by a previous select() call and you did not remove them, they will still be there after the next select(), but they won't be included in the int count it returns. This is OK, as I don't think you really need this return value for anything anyway. Just grab selectedKeys() and iterate through whatever's there. But if you are paying attention to the count and using it for something, you need to remember that it's only reporting new stuff. If after one select() you leave behind some selected keys that you didn't have time to deal with, then the next select() may return 0. That doesn't mean there's nothing available to read; it just means there are no readable channels that weren't previously reported. Either way, all readable channels can be found by iterating through selectedKeys().
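A small sketch of that gotcha, under the same assumptions as any toy example: the class name is hypothetical and a Pipe replaces a real socket. It selects twice without reading or removing the key, then reports both return values and the size of the selected-key set.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo class illustrating the select() return value.
class SelectCountDemo {
    /** Returns {first count, second count, selected-key set size after the second call}. */
    static int[] counts() throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            sink.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);

            // First pass: the key is newly added to the selected set.
            int first = selector.select(1000);

            // Deliberately neither read the bytes nor remove the key.
            // Second pass: the key is still selected, but nothing about it changed.
            int second = selector.selectNow();

            return new int[] {first, second, selector.selectedKeys().size()};
        }
    }

    public static void main(String[] args) throws IOException {
        int[] c = counts();
        System.out.println("first=" + c[0] + " second=" + c[1] + " selectedKeys=" + c[2]);
    }
}
```

The point to check is that the second count can be zero while selectedKeys() still contains the readable channel, which is why iterating the set is more reliable than trusting the count.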
 
Ron Hitchens
Author
Ranch Hand
Posts: 30
David: You can do it either way. You can either fully drain a channel using repeated reads or you can take one bufferful and then go back around the select loop. In that case select() will return immediately because the channel will still show as ready to read.
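The "fully drain" option might look roughly like the sketch below. The names are illustrative, a Pipe again stands in for a socket, and the channel is assumed to be in non-blocking mode; on a blocking channel this loop would stall on the final read instead of returning.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical helper: drain a non-blocking channel with repeated reads.
class DrainChannel {
    /** Reads until read() returns 0 (nothing left right now) or -1 (end of stream). */
    static int drain(ReadableByteChannel channel, ByteBuffer buf) throws IOException {
        int total = 0;
        while (true) {
            buf.clear();
            int n = channel.read(buf);
            if (n <= 0) {
                return total;
            }
            total += n; // a real handler would process the buffer contents here
        }
    }

    static int demo(int totalBytes, int bufferSize) throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            // Assumes a small payload that fits in the pipe's internal buffer.
            sink.write(ByteBuffer.wrap(new byte[totalBytes]));
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);
            selector.select(1000); // wait until the channel is reported readable
            return drain(source, ByteBuffer.allocate(bufferSize));
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes drained: " + demo(10, 4));
    }
}
```

Note that this loop has no upper bound: if the peer keeps sending, it keeps reading, so it trades fairness across connections for fewer trips through select().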
Jim: Yes, I agree the return value from select() is confusing and not very helpful. It's semantically different from the POSIX select() call. My recommendation is to just ignore it and iterate over the set returned by selectedKeys() (which could be empty).
 
David Weitzman
Ranch Hand
Posts: 1365
Thanks for clearing that up.

Me: The second method seems like the only one that can be safely used
JY: Why is that? I'm not sure what the concern here is.

Because if new data is constantly arriving you'll end up in an endless loop handling a single connection. For some use cases that's just fine, but on the server end it's good to avoid connection favoritism. Alright, it isn't a very likely scenario, but it's still possible and thus good to prevent.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
DW: Because if new data is constantly arriving you'll end up in an endless loop handling a single connection. For some use cases that's just fine, but on the server end it's good to avoid connection favoritism. Alright, it isn't a very likely scenario, but it's still possible and thus good to prevent.
OK, that's a valid concern. My first thought was to maximize overall throughput by grabbing whatever's available without waiting to go back through the selection process - but I suppose in most cases that's not necessary; you can use a big enough buffer to grab whatever's available at the time. And if the buffer's not big enough, well, the overhead from selection should be pretty trivial in comparison anyway. So sure, reading one buffer-full and then re-selecting is probably preferable for most applications.
Note that Ron's got another example, SelectSocketsThreadPool.java. Here a single thread monitors a large number of sockets with a selector, but the actual reading of ready channels is delegated to separate worker threads. So we don't need one thread per socket, and can make do with a much smaller number of threads servicing those sockets that are actually doing something. This provides a level of insulation against connection favoritism. Once a connection has a worker thread's attention, it receives a certain measure of preferential treatment, but as long as there are other worker threads available in the pool, other connections won't be locked out. There's still some vulnerability to, say, a DoS attack where the attacker is able to get a number of connections equal to the worker thread pool size, and can saturate all those connections with incoming bytes. So maybe it would still be a good idea to avoid looping the read() from a connection. But at least this design is much less vulnerable than the original. (And of course, Ron's code isn't intended to be a complete robust application, just a rough sketch really.)
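The general shape of that design can be sketched as follows. This is not the book's SelectSocketsThreadPool.java: the class, method names, and pool size are assumptions for illustration, and a Pipe stands in for the sockets. The selector thread turns off read interest for a ready key, hands it to a worker, and the worker reads a single bufferful before turning read interest back on and waking the selector.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one selector thread, reads delegated to a worker pool.
class SelectorWithWorkers {
    /** Worker task: one bufferful only, then hand the channel back to the selector. */
    static void readOnce(SelectionKey key, int bufferSize, AtomicInteger received) {
        try {
            ByteBuffer buf = ByteBuffer.allocate(bufferSize);
            int n = ((ReadableByteChannel) key.channel()).read(buf);
            if (n < 0) {
                key.cancel(); // end of stream
                return;
            }
            received.addAndGet(n);
            key.interestOps(key.interestOps() | SelectionKey.OP_READ);
            key.selector().wakeup(); // make the selector notice the restored interest
        } catch (IOException | CancelledKeyException e) {
            key.cancel();
        }
    }

    static int receiveAll(int totalBytes, int bufferSize, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        AtomicInteger received = new AtomicInteger();
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SinkChannel sink = pipe.sink();
             Pipe.SourceChannel source = pipe.source()) {
            sink.write(ByteBuffer.wrap(new byte[totalBytes])); // small payload assumed
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);

            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (received.get() < totalBytes && System.nanoTime() < deadline) {
                selector.select(100);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid() || !key.isReadable()) {
                        continue;
                    }
                    // Stop selecting this channel while a worker owns it.
                    key.interestOps(key.interestOps() & ~SelectionKey.OP_READ);
                    pool.execute(() -> readOnce(key, bufferSize, received));
                }
            }
            // Let in-flight workers finish before the channels are closed.
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
            return received.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("bytes received: " + receiveAll(10, 4, 2));
    }
}
```

Because each worker reads only one bufferful per dispatch, a busy connection has to go back through the selector between reads, which is the non-looping variant suggested above; a worker that drained its channel in a loop would be the version exposed to the saturation scenario.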
[ April 17, 2003: Message edited by: Jim Yingst ]
 