
FileChannel and thread safety

 
Ranch Hand
Posts: 49
This is Jim Y here, intruding on S Bala's original post. These posts were originally in another thread here, but that discussion got too involved for me to follow while it was still embedded amidst several other discussions, so I'm now extracting a group of relevant posts into a separate thread. Which turns out to be a lot of work, but hopefully it's worthwhile to be able to focus on one aspect of the discussion. Or, those of you who didn't care about this level of detail anyway can more safely ignore it.
So - the part I'm interested in here is: can FileChannels be accessed safely by multiple threads, and if so, how? Are there any particular dangers to beware of? Some of the posts below also talk about other issues which were part of the original discussion, and I'm not going to edit all those references out. But please let's keep this particular discussion focused on FileChannel's thread safety (or lack thereof). For other topics, please either post in the original thread, or start a new thread. Thanks.
Below this line is S Bala's original post:
----
Hi,
I was reading up on FileChannel and its ability to support thread-safe operations.
URLyBird 1.1.3 says that we may assume there is only one program calling the database at any time.
Does that mean that we can do away with synchronized read and write operations (ignore them), and worry only about concurrency on the server?
thanks
[ July 26, 2003: Message edited by: Jim Yingst ]
 
Bartender
Posts: 1872
Hi S, (?)
About FileChannels :

Does that mean that we can do away with synchronized read and write operations (ignore),


Reading this in the FileChannel doc :

Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.


I would be cautious (as I'll be now in this thread with anything I may write). In fact, I am not sure about the correct interpretation to be given to that doc excerpt.
As all my write operations are queued and handled by a dedicated thread, my design is not affected by such a potential thread safety issue, and I didn't go into it deeply.

and worry about only the concurrency on the server.


Right.
Cheers,
Phil.
 
S Bala
Ranch Hand
Posts: 49
Hi Philippe,
I was reading up on FileChannel to see what it had to offer regarding thread-safe file operations.
But my concern (3rd posting) is whether I can ignore threading issues with the database altogether, as my assignment states that "I can assume that only one program is accessing the database at any time". So the client checks the record lock in memory for update and delete operations, obtains it and does a transaction with the database. Hence no need to synchronize the read/write operations.
thanks,
SB
 
Philippe Maquet
Bartender
Posts: 1872
Hi S,

As my assignment states that "I can assume that only one program is accessing the database at any time". So the client checks the record lock in memory for update and delete operations, obtains it and does a transaction with the database. Hence no need to synchronize the read/write operations.


IMO, "only one program" means "only your application". But how can you assume that two of your clients will not write concurrently to your database (on different records)? Of course, if FileChannel handles that concurrency well (using an explicit position in the read/write methods), it seems correct.
Cheers,
Philippe.
 
Wanderer
Posts: 18671
Greetings, everyone. Once more into the breach...
I was reading up on FileChannel and its ability to support thread-safe operations.
URLyBird 1.1.3 says that we may assume there is only one program calling the database at any time. Does that mean that we can do away with synchronized read and write operations (ignore them), and worry only about concurrency on the server?

No, if you use synchronization at all, it's as a defense against other threads in the same JVM - meaning, other clients of your DB server. So depending on your design you may or may not synchronize reads and writes - but this decision has nothing to do with other programs; just this program.
Philippe is correct to be wary of FileChannel's thread safety, IMO - the guarantees it offers are not absolute. Two threads could write to the same section of file at the same time, or one read while another is writing, and these could lead to some strange results. However your locking mechanism should protect against most of this, without additional synchronization. It should be impossible (thanks to record locking) for two threads to update the same record simultaneously. So the only possible problem is dirty reads - what if one thread reads a record while another writes? The record which is read may be corrupted, a mix of old data with new. The chance of this is very small really, and it's probably not a big deal since it only affects the record displayed on the search screen - the real record remains intact in the DB file. However if you do wish to eliminate this possibility, you would need to synchronize both read and write methods, IMO. Or perhaps, create some alternate scheme. I think many people ignore the possibility of dirty reads and are not penalized for it.
[ July 26, 2003: Message edited by: Jim Yingst ]
 
town drunk
( and author)
Posts: 4118

Originally posted by Jim Yingst:
Philippe is correct to be wary of FileChannel's thread safety, IMO - the guarantees it offers are not absolute.


Actually, you can trust the guarantees, but you have to be careful about reading too much into them. For example, a write is atomic, if you don't move the position explicitly, and allow the FileChannel to write to its next natural position. Jim, this is all on pages 284-285 of my book, so you should know this.
But seriously, I suggest that you use FileChannels at every turn: they're the cat's meow, as far as Java file IO is concerned. And what better time to start digging into them than now?
M
 
S Bala
Ranch Hand
Posts: 49
Hi Jim, Philippe.
There is a sentence in the assignment which can help put this in a different perspective.
-- quote---
because the data must continue to be manipulated for reports using another custom-written application, the new system must reimplement the database code from scratch without altering the data file format
---- end quote---

Even if we implement a synchronized write operation for our clients, there is no guarantee that the existing custom application for reporting "will not" try to read/write at the same time. We have no way of dealing with this situation.
Does this validate my earlier assumption?
 
Jim Yingst
Wanderer
Posts: 18671
Howdy, Max.
Actually, you can trust the guarantees, but you have to be careful about reading too much into them.
I agree with this. When I said "the guarantees it offers are not absolute" I meant that the guarantees do not promise everything we might think or hope that they promise, and we must be careful with them. I do believe that the guarantees are obeyed (probably, excepting some possible bugs which will be fixed soon if not already).
For example, a write is atomic, if you don't move the position explicitly, and allow the FileChannel to write to its next natural position. Jim, this is all on pages 284-285 of my book, so you should know this.
I read this, but don't believe it offers the level of security you think it does. We need to provide random access to any record in the file, not just the next record. So either we have to use read/write methods that take an explicit position, or we set the position and then use a read/write that implicitly uses that position. You seem to advocate the latter approach. And it's true that the atomicity of the read/write is guaranteed by the API. But what is not guaranteed is that there will be no interruption between setting the position and performing the read/write. That is, if we do
channel.position(100);
channel.read(buffer1);
while another thread does
channel.position(200);
channel.read(buffer2);
we may end up with something like
channel.position(100);
channel.position(200);
channel.read(buffer2);
channel.read(buffer1);
Here buffer2 gets what buffer1 was expecting, and buffer1 gets whatever is after the record at 200. This is no good; we'd need explicit synchronization to prevent interruption between the position() and read() methods.
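A minimal sketch of that explicit synchronization, assuming a single FileChannel shared by all threads and a hypothetical helper class PositionedReader (neither the class nor the method name comes from the assignment). Locking on the channel object only works if every thread goes through this helper:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical helper: shares one FileChannel among threads that use the implicit position. */
final class PositionedReader {
    private final FileChannel channel;

    PositionedReader(FileChannel channel) {
        this.channel = channel;
    }

    /**
     * Sets the channel position and performs one read as a single critical section,
     * so no other thread using this helper can call position() in between.
     * Returns the number of bytes read, or -1 at end of file.
     */
    int readAt(long offset, ByteBuffer buffer) throws IOException {
        synchronized (channel) {
            channel.position(offset);
            return channel.read(buffer);
        }
    }
}
```

With this in place, two threads calling readAt(100, buffer1) and readAt(200, buffer2) can still run in either order, but each position() stays paired with its own read().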
Alternately, we can use the read/write methods that take an explicit position:
channel.read(buffer1, 100);
channel.read(buffer2, 200);
This works great as far as the reads are concerned. According to the API these methods may even process concurrently, but they're still guaranteed to each read from the appropriate place.
However, what if we have two threads doing:
channel.read(buffer1, 100);
channel.write(buffer2, 100);
According to the API these may also proceed concurrently. That's a potential problem. The read() will read from the correct position, but what it's reading may change underneath it. You may get a read that starts out with data from before the update, and ends with data from after the update. Unless, again, you protect your methods with synchronization.
I do believe that this type of dirty read is pretty unlikely, especially considering the record length is just 183 in my assignment. The system will probably finish each read/write atomically. And it's quite possible that on some particular implementations of FileChannel, this atomicity will always occur. However, it's not guaranteed by the FileChannel spec. So I advocate explicit synchronization if you want to ensure that dirty reads do not occur.
Incidentally, since I used read(ByteBuffer) above - I also believe that FileChannel offers no guarantee that the ByteBuffer will be filled, even if the file has more bytes. If we had a SelectableChannel in blocking mode we'd have a guarantee, but FileChannel does not offer such a guarantee. So to ensure that the ByteBuffer is filled, we really must use a loop for each read():
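One way to sketch that synchronization for the explicit-position methods, assuming a hypothetical wrapper class GuardedRecordFile that every thread uses instead of touching the channel directly. It uses java.util.concurrent.locks, which arrived in Java 5, after this thread was written; plain synchronized methods would give the same protection with less read concurrency:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical wrapper: readers share the lock, a writer holds it exclusively. */
final class GuardedRecordFile {
    private final FileChannel channel;
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    GuardedRecordFile(FileChannel channel) {
        this.channel = channel;
    }

    /** One explicit-position read; no write through this wrapper can overlap it. */
    int read(ByteBuffer buffer, long offset) throws IOException {
        lock.readLock().lock();
        try {
            return channel.read(buffer, offset);
        } finally {
            lock.readLock().unlock();
        }
    }

    /** One explicit-position write; excludes every other read and write through this wrapper. */
    int write(ByteBuffer buffer, long offset) throws IOException {
        lock.writeLock().lock();
        try {
            return channel.write(buffer, offset);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

This only rules out overlap between single calls. If a call transfers fewer bytes than requested, the retry would also have to happen while the lock is still held.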
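A minimal sketch of such a loop, built around the pos += channel.read(bytes, pos) line that is quoted later in this thread. The class name ChannelReads and the method readFully are hypothetical, and throwing EOFException on a short file is an assumption about the desired behavior:

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

final class ChannelReads {
    private ChannelReads() {}

    /** Reads exactly length bytes starting at start, looping over partial reads. */
    static byte[] readFully(FileChannel channel, long start, int length) throws IOException {
        ByteBuffer bytes = ByteBuffer.allocate(length);
        long pos = start;
        while (bytes.hasRemaining()) {
            int n = channel.read(bytes, pos);   // may transfer fewer bytes than remain
            if (n < 0) {
                throw new EOFException("file ended at " + pos + " before " + length + " bytes were read");
            }
            pos += n;
        }
        return bytes.array();
    }
}
```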

This is annoying because again, it's very unlikely that we'll have a problem for records as small as ours. And some implementations may well never have this problem. But, the FileChannel API makes no guarantee. This isn't new really - all the InputStream and Reader classes had the same issue, with the exception of a few specific methods like BufferedReader's readLine() and RAF's readFully(). But annoying nonetheless, and frequently overlooked. Even by famous book authors.
But seriously, I suggest that you use FileChannel at every turn: they're the cats' meow, as far as Java File IO is concerned. And what better to start digging into them then now?
I agree. Except that the header for the Contractor assignment has a format that is such an exact match for RAF spec (or more generally, the DataInput interface) that it seems a shame not to take advantage of that. Using FileChannel for the header would've required writing more code, and with no real benefit, IMO. The header is short, and there's no big performance concern, so why not use RAF here? But I agree that FileChannel is the preferred way to go for accessing everything past the header. Though those who use RAF will be fine as far as assignment grading is concerned, I'm sure - provided they synchronize for safety. But as I've shown, you probably need to do that with FileChannel too, so this isn't a big difference between the two.
[ July 16, 2003: Message edited by: Jim Yingst ]
 
Jim Yingst
Wanderer
Posts: 18671
Hi, S Bala.
Even if we implement a synchronized write operation for our clients, there is no guarantee that the existing custom application for reporting "will not" try to read/write at the same time. We have no way of dealing with this situation.
In my instructions we do have a guarantee, under "Locking":

You may assume that at any moment, at most one program is accessing the database file; therefore your locking system only needs to be concerned with multiple concurrent clients of your server.


Combining this with the section you quoted, it seems that there is another application which may manipulate the files, but it won't be manipulating them at the same time our program is running. So we're safe, according to the instructions.
 
Ranch Hand
Posts: 64
Hi Jim:
In regard to your statement:

Incidentally, since I used read(ByteBuffer) above - I also believe that FileChannel offers no guarantee that the ByteBuffer will be filled, even if the file has more bytes.


I also had this concern. FileChannel's read returns the number of bytes "actually" read. Unfortunately, this depends on what one's definition of actually, actually is. I noticed none of SUN's examples for FileChannel's read method enclose the call in a while loop, so I didn't use a while loop either. But, I'm not sure if looking at the implementation code is justified if the method contract doesn't guarantee the behavior.
I think SUN needs to put a Contract section in their Javadoc that explicitly states what behavior is guaranteed by a method. Fat chance?
Tx
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Jim Yingst:
Howdy, Max.

Actually, you can trust the guarantees, but you have to be careful about reading too much into them.
I agree with this. When I said "the guarantees it offers are not absolute" I meant that the guarantees do not promise everything we might think or hope that they promise, and we must be careful with them. I do believe that the guarantees are obeyed (probably, excepting some possible bugs which will be fixed soon if not already).
For example, a write is atomic, if you don't move the position explicitly, and allow the FileChannel to write to its next natural position. Jim, this is all on pages 284-285 of my book, so you should know this.
I read this, but don't believe it offers the level of security you think it does. We need to provide random access to any record in the file, not just the next record. So either we have to use read/write methods that take an explicit position, or we set the position and then use a read/write that implicitly uses that position. You seem to advocate the latter approach. And it's true that the atomicity of the read/write is guaranteed by the API. But what is not guaranteed is that there will be no interruption between setting the position and performing the read/write. That is, if we do
channel.position(100);
channel.read(buffer1);
while another thread does
channel.position(200);
channel.read(buffer2);
we may end up with something like
channel.position(100);
channel.position(200);
channel.read(buffer2);
channel.read(buffer1);


Well, let's think about this a bit. If a given FileChannel is being opened on a just-in-time basis (that is, at the method level), then there's no opportunity whatsoever for this sort of conflict. Two threads cannot have access to the same FileChannel, thus the above situation is impossible.



Here buffer2 gets what buffer1 was expecting, and buffer1 gets whatever is after the record at 200. This is no good; we'd need explicit synchronization to prevent interruption between the position() and read() methods.
Alternately, we can use the read/write methods that take an explicit position:
channel.read(buffer1, 100);
channel.read(buffer2, 200);
This works great as far as the reads are concerned. According to the API these methods may even process concurrently, but they're still guaranteed to each read from the appropriate place.
However, what if we have two threads doing:
channel.read(buffer1, 100);
channel.write(buffer2, 100);
According to the API these may also proceed concurrently. That's a potential problem. The read() will read from the correct position, but what it's reading may change underneath it. You may get a read that starts out with data from before the update, and ends with data from after the update. Unless, again, you protect your methods with synchronization.


I think there's a misunderstanding here. Reading changes the position of the FileChannel: thus, it's covered under the part of the documentation that says
Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.

Thus, the situation, as described, cannot occur.



I do believe that this type of dirty read is pretty unlikely, especially considering the record length is just 183 in my assignment. The system will probably finish each read/write atomically. And it's quite possible that on some particular implementations of FileChannel, this atomicity will always occur. However, it's not guaranteed by the FileChannel spec. So I advocate explicit synchronization if you want to ensure that dirty reads do not occur.
Incidentally, since I used read(ByteBuffer) above - I also believe that FileChannel offers no guarantee that the ByteBuffer will be filled, even if the file has more bytes.


Again, this is tricky, because there is such a guarantee, though it's somewhat indirectly stated. As noted above, because other FileChannels block until your FileChannel is finished (remember, your FileChannel is changing position), you're ok here. The trick is to use FileChannels throughout, and not to explicitly change the position.
All best,
M
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Bob Reeves:
Hi Jim:
In regard to your statement:

I also had this concern. FileChannel's read returns the number of bytes "actually" read. Unfortunately, this depends on what one's definition of actually, actually is. I noticed none of SUN's examples for FileChannel's read method enclose the call in a while loop, so I didn't use a while loop either. But, I'm not sure if looking at the implementation code is justified if the method contract doesn't guarantee the behavior.
I think SUN needs to put a Contract section in their Javadoc that explicitly states what behavior is guaranteed by a method. Fat chance?
Tx


Actually, it is in the documentation.

Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.

Because a read changes position, it's covered by the above, so you're ok. Really.
M
 
Jim Yingst
Wanderer
Posts: 18671
Hi, Max!
Well, let's think about this a bit. If a given FileChannel is being opened on a just-in-time basis (that is, at the method level), then there's no opportunity whatsoever for this sort of conflict. Two threads cannot have access to the same FileChannel, thus the above situation is impossible.
That's fine, I agree - if we're creating separate FileChannels for each thread. But many of us are using a single FileChannel shared between threads - that's the situation I was addressing.
(Actually I'm not sure I know what is or is not guaranteed if two threads open different FileChannels on the same file concurrently. But at least I won't argue the point, for now.)
I think there's a misunderstanding here. Reading changes the position of the FileChannel: thus, it's covered under the part of the documentation that says
"Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified."
Thus, the situation, as described, cannot occur.

I still disagree; we've got a difference of interpretation for that spec. Which is understandable, IMO, as the wording seems less than clear to me. But look at the last sentence: "Other operations, in particular those that take an explicit position, may proceed concurrently..." What do you think that means? What's an example of an operation which takes an explicit position? What does it mean to proceed concurrently?
Here's my interpretation. FileChannel has a private member variable, position. This can be set with position(long) and queried with position(). When the API refers to "the FileChannel's position", it means the value of this instance variable. There are a number of methods which do not include any position parameter, but whose APIs specify that they implicitly use the member variable:
read(ByteBuffer dst)
read(ByteBuffer[] dsts)
read(ByteBuffer[] dsts, int offset, int length)
write(ByteBuffer src)
write(ByteBuffer[] srcs)
write(ByteBuffer[] srcs, int offset, int length)
I will call these implicit-position methods. The APIs for these methods say things like "bytes are read starting at this channel's current file position". The channel's current file position is explicitly mentioned in each comment. (It's explicit in the comment, but merely implicit when you just look at the method parameters.)
In contrast, there are two other methods that do contain an explicit position parameter:
read(ByteBuffer dst, long position)
write(ByteBuffer src, long position)
I'll call these explicit-position methods. The API comments for these methods specify that "bytes are read [or written] starting at the given file position rather than at the channel's current position. This method does not modify this channel's position." Thus, I believe these two methods fall under the category of "other operations, in particular those that take an explicit position". And thus, they "may proceed concurrently." Which I take to mean they may be concurrent with each other, or with the other read/write operations that implicitly use the channel's (member variable) position. No two implicit-position methods can proceed concurrently in the same FileChannel, but explicit-position methods can happen anytime. At least, as I interpret the spec.
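The difference between the two groups of methods is easy to observe on a single thread. A small sketch, assuming a hypothetical class PositionDemo and a file of at least 108 bytes; it shows only the position bookkeeping and says nothing about what happens under concurrency:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class PositionDemo {
    private PositionDemo() {}

    /** Returns {position after an explicit-position read, position after an implicit-position read}. */
    static long[] positionsAfterReads(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            channel.read(ByteBuffer.allocate(8), 100);  // explicit position: channel position untouched
            long afterExplicit = channel.position();
            channel.read(ByteBuffer.allocate(8));       // implicit position: advances by the bytes read
            long afterImplicit = channel.position();
            return new long[] {afterExplicit, afterImplicit};
        }
    }
}
```

The first element stays at 0 because read(ByteBuffer, long) leaves the channel's position alone. The second is however many bytes the implicit-position read transferred, at most 8.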
Because a read changes position, it's covered by the above, so you're ok. Really
That's true for three out of four read methods recently surveyed. But you'll note that the code sample I provided (which Bob was responding to) uses
pos += channel.read(bytes, pos);
which does not change the FileChannel's position.
So, why not just use the implicit-position methods? Because they always rely on the assumption that the "current position" is the correct one, and that it hasn't just been reset by some other thread immediately beforehand. If multiple threads have access to a given FileChannel then the only way I see to prevent them from possibly interrupting each other like this is by using synchronization. And if you're going to sync critical operations anyway, then it really doesn't matter much if the operation was atomic or not, does it? So really, I'm at a loss as to why the FileChannel API was written the way it is; the guarantees it does offer are insufficient to be really useful, IMO. I'm perfectly happy doing synchronization myself - I just wish the spec had been clearer.
[ July 21, 2003: Message edited by: Jim Yingst ]
 
Ranch Hand
Posts: 555
Hi Jim,
Hi Max,
Jim said:

However, what if we have two threads doing:
channel.read(buffer1, 100);
channel.write(buffer2, 100);
According to the API these may also proceed concurrently. That's a potential problem. The read() will read from the correct position, but what it's reading may change underneath it. You may get a read that starts out with data from before the update, and ends with data from after the update. Unless, again, you protect your methods with synchronization.


Max said:

I think there's a misunderstanding here. Reading changes the position of the FileChannel: thus, it's covered under the part of the documentation that says
Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.
Thus, the situation, as described, cannot occur.


Specification of this write/read method says:


public abstract int write(ByteBuffer src,
long position)
throws IOException
Writes a sequence of bytes to this channel from the given buffer, starting at the given file position.
This method works in the same manner as the write(ByteBuffer) method, except that bytes are written starting at the given file position rather than at the channel's current position.

This method does not modify this channel's position


So, I believe Jim is right. The methods (write/read) that take a position argument DO NOT CHANGE the position in the FileChannel, so they CAN proceed concurrently!
Vlad
 
Bob Reeves
Ranch Hand
Posts: 64
Hi Max:
This responds to your comment:

Actually, it is in the documentation.


Respectfully, I disagree. My concern is that the method just might not return all the bytes requested in one read, whether another thread is involved or not. I've seen this behavior on Linux (in a single thread application) where a getBytes didn't return all the bytes known to be in the receive buffer. However, if I looped on a test byte count condition, then I obtained all the bytes.
So, I have this worry that if SUN says something like a method returns the number of bytes "actually" read, maybe the programmer should watch out.
Tx
 
Bob Reeves
Ranch Hand
Posts: 64
Hi All:
It all comes down to what does

Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.

mean?
I vote with Jim on this (that's a first!). I think the only way the statements are consistent is if "involves the channel's position" refers to the position(long) method. Then operations that "take an explicit position" don't change where the FileChannel is pointed for operations that don't take an explicit position, and so the explicit-position methods proceed unperturbed. Exactly how isn't all that clear. Now unless the FileChannel's read method is internally synchronized, a dirty or inconsistent read is possible.
This leads me to conclude that

File channels are safe for use by multiple concurrent threads

means that you can use a FileChannel from multiple threads without worrying about disturbing its internal state unexpectedly. It does not guarantee anything about the resource that the FileChannel operates on.
Opinions may vary ...
Tx
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Bob Reeves:
Hi Max:
This responds to your comment:

Respectfully, I disagree. My concern is that the method just might not return all the bytes requested in one read, whether another thread is involved or not. I've seen this behavior on Linux (in a single thread application) where a getBytes didn't return all the bytes known to be in the receive buffer. However, if I looped on a test byte count condition, then I obtained all the bytes.
So, I have this worry that if SUN says something like a method returns the number of bytes "actually" read, maybe the programmer should watch out.
Tx


Hi Bob,
I agree that FileChannel methods that take an explicit position can return fewer than the number of bytes requested. However (and I just want to make sure I'm hearing you correctly), are you saying that a FileChannel.read operation that didn't take a position failed to read in all the available bytes, when the buffer was large enough to hold them, and when there were more bytes to read? If so, it sounds like a bug. The documentation is clear:

a file channel cannot read any more bytes than remain in the file. It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.

Can I see the code?
M
[ July 22, 2003: Message edited by: Max Habibi ]
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Jim Yingst:
Hi, Max!
That's true for three out of four read methods recently surveyed. But you'll note that the code sample I provided (which Bob was responding to) uses
pos += channel.read(bytes, pos);


Actually, I missed this: I had assumed that we were still talking about methods that used the implicit position. You're right, as I noted, that methods that take an explicit position may proceed concurrently, if they cannot change position.

Only one operation that involves the channel's position or can change its file's size may be in progress at any given time;

You'll notice that FileChannel.write(byte[],position) can change file size.

If the given position is greater than the file's current size then the file will be grown to accommodate the new bytes; the values of any bytes between the previous end-of-file and the newly-written bytes are unspecified.

Therefore, it's exempt.


which does not change the FileChannel's position.
So, why not just use the implicit-position methods? Because they always rely on the assumption that the "current position" is the correct one, and that it hasn't just been reset by some other thread immediately beforehand. If multiple threads have access to a given FileChannel then the only way I see to prevent them from possibly interrupting each other like this is by using synchronization.


Or by doing a just-in-time read, per my suggestion. Remember, dragging that FileChannel around with you, even when you're not using it, isn't a much prettier picture than opening it on an as-needed basis. The only argument I can see against it is that it's more efficient not to open/close a connection. My answer there is:
1. Performance is not a consideration in this assignment.
2. You would be hard pressed, I think, to find a way to even measure the differential, either in processing or memory usage, of opening a FileChannel on a JIT basis.
3. Opening a FileChannel as needed is simpler code, as it avoids synchronization issues.
4. Not maintaining a FileChannel connection when one isn't necessary is more efficient than carrying one around just in case.
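A minimal sketch of the just-in-time idea described above, assuming fixed-length records and a modern JDK (try-with-resources did not exist when this thread was written). The class name JitRecordReader, its fields, and the record layout are hypothetical and only serve the illustration:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical helper: opens a FileChannel only for the duration of one read. */
final class JitRecordReader {
    private final String path;      // path to the data file (assumed)
    private final int recordLength; // fixed record length in bytes (assumed)

    JitRecordReader(String path, int recordLength) {
        this.path = path;
        this.recordLength = recordLength;
    }

    /** Reads one record through a channel that no other thread ever sees. */
    byte[] readRecord(int recordNumber) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(recordLength);
        try (RandomAccessFile file = new RandomAccessFile(path, "r");
             FileChannel channel = file.getChannel()) {
            channel.position((long) recordNumber * recordLength);
            // Loop defensively until the buffer is full or end-of-file is reached.
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // keep reading
            }
        }
        // If end-of-file came early, the tail of the array is left as zeros.
        return buffer.array();
    }
}
```

Because the channel is local to the method, its position cannot be moved by another thread. Whether the underlying file is protected from concurrent writers through other channels is a separate question that this sketch does not address.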



And if you're going to sync critical operations anyway, then it really doesn't matter much if the operation was atomic or not, does it? So really, I'm at a loss as to why the FileChannel API was written the way it is; the guarantees it does offer are insufficient to be really useful, IMO. I'm perfectly happy doing synchronization myself - I just wish the spec had been clearer.



I disagree. I think they're very useful.
M
[ July 22, 2003: Message edited by: Max Habibi ]
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Vlad Rabkin:
Hi Jim,
Hi Max,
Jim said:

So, I believe Jim is right. The methods (write/read) having position in argument DO NOT CHANGE position in the FileChannel, so they CAN proceed concurrently!
Vlad


Not exactly. The write method, even the explicit position one, can modify a file's size. Therefore, it blocks, because the documentation says:

Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes.


Further, the documentation for the
int write(ByteBuffer src, long position)
method says that if the given position is greater than the file's current size then the file will be grown to accommodate the new bytes.
Thus, the file's size can change with a write, so writes, all of them, block.
Reads, all of them, block too. However, the read(byte[],position) method only blocks for other reads. All of the other FileChannel.read methods block for both reads and writes.
The point here is that there is only one way to get unstable data, and that is to use the FileChannel.read(byte[],position) method in an environment that allows multiple threads to access the same FileChannel. As I recall, you're not doing this, correct?
M
 
Jim Yingst
Wanderer
Posts: 18671
Hello again, Max.
You'll notice that FileChannel.write(byte[],position) can change file size.
Good catch - I had overlooked that. So OK, among read/write methods only read(ByteBuffer, long) seems to be eligible to proceed concurrently. Which is still enough to create problems when multiple threads access a FileChannel using that method, if some sort of write() is being performed on the section of the file being read. If the write is changing XXXXXXXXXX to OOOOOOOOOO, the read could theoretically see something like XXXOOOOOOO. It's fairly unlikely, and maybe impossible on many/all platforms for all I know. But it's exceedingly annoying to me that the spec leaves this hole in its guarantees.
Reads, all of them, block too. However, the read(byte[],position) method only blocks for other reads.
Eh? Where did that come from? I don't see why read(ByteBuffer, long) (assuming that's what you meant) would need to block for anything. Could be missing something again...
The point here is that there is only one way to get unstable data, and that is to use the FileChannel.read(byte[],position) method in an environment that allows multiple threads to access the same FileChannel.
Well, there are other problems if you use a shared FileChannel using implicit-position methods, because there can be interruptions between setting the position and reading or writing. I gave an example of this earlier in this thread. This won't lead to reading XXXOOOOOOO when XXXXXXXXXX or OOOOOOOOOO is expected - but it might lead to reading YYYYYYYYYY (a different record entirely) instead.
Unless, of course, we use synchronization. Which is actually pretty simple. Or, unless you use separate FileChannels as you prefer.
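To make the distinction concrete, here is a minimal sketch contrasting the two styles of read. The class and method names (PositionedReads, readImplicit, readExplicit) are hypothetical; only FileChannel.position(long), read(ByteBuffer) and read(ByteBuffer, long) come from the API:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical helper contrasting the two styles of read discussed in this thread. */
final class PositionedReads {

    /** Two separate calls: another thread sharing the channel could move the position in between. */
    static int readImplicit(FileChannel channel, ByteBuffer dst, long pos) throws IOException {
        channel.position(pos);    // step 1: set the shared position
        return channel.read(dst); // step 2: read from wherever the position is now
    }

    /** One call: the offset travels with the call and the channel's position is left untouched. */
    static int readExplicit(FileChannel channel, ByteBuffer dst, long pos) throws IOException {
        return channel.read(dst, pos);
    }
}
```

With a shared channel, the gap between step 1 and step 2 of readImplicit is where the "YYYYYYYYYY" case can arise. readExplicit has no such gap, though it is the method whose behaviour alongside a concurrent write is being debated here.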
As I recall, you're not doing this [sharing FileChannels], correct?
Who - Vlad? Dunno, I've lost track. But there are a number of different people in this thread, and I know that some (like myself) aren't currently doing the just-in-time thing. The FileChannel question was first raised in this thread by S Bala, who seemed to be talking about multiple threads accessing a FileChannel, as are a number of the other people in this thread. If you want to talk about how FileChannel can be used safely in the context of just-in-time FileChannel creation, that's fine - but I don't think this context has been clearly established in this thread for a lot of these comments. Many of the people reading this thread will not assume just-in-time creation unless it's explicitly stated. For those people, I say don't think of FileChannel's operations as "atomic" because while many are, some are not, and there are just enough holes in the system to screw you up if you're not careful. In contrast, synchronization is a simple and easy option, IMO:

For the class containing these methods, only one instance is created for a given file, and all threads use that instance. So "synchronized" in the method declarations makes those methods effectively atomic in accessing the file.
Note that I'm still putting those reads and writes in loops. That's because of the other problem with FileChannel, mentioned in several other threads above. Even if reads/writes are atomic, there's no actual guarantee (that I can find) that a read or write will actually use all the bytes we expect. That is, if there's still space in a buffer, and the file still has unread bytes, a read() may nonetheless return prematurely, without filling the buffer. At least, the API seems to imply this, and offers no guarantee otherwise. Now, I haven't been able to actually observe this problem in testing. Maybe it never actually comes up. I know it comes up with FileInputStreams fairly often, so I tend to assume it's possible here unless guaranteed otherwise. And I would argue that if partial reads/writes are possible, then they actually destroy the safety of just-in-time FileChannels. It doesn't matter if each read() or write() is atomic, if it's possible for the method to complete without delivering all the intended bytes. You can put the method in a loop as I do above - but without synchronization, there's no way to prevent interruption between loop iterations. And even with synchronization, if you're using just-in-time FileChannels then each FileChannel is a separate instance. You'd need to sync on some shared monitor. A static variable or some such.
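A minimal sketch of the kind of class described here: one instance per file, shared by all threads, with synchronized methods and the reads and writes in loops. This is an illustration under those assumptions, not the listing from the original post; the name SynchronizedRecordFile and its method signatures are hypothetical:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical wrapper: create one instance per file and share it between all threads. */
final class SynchronizedRecordFile {
    private final FileChannel channel;

    SynchronizedRecordFile(FileChannel channel) {
        this.channel = channel;
    }

    /** Fills dst from the given file offset, or stops early at end-of-file. */
    synchronized void read(ByteBuffer dst, long pos) throws IOException {
        while (dst.hasRemaining()) {
            int n = channel.read(dst, pos);
            if (n < 0) {
                break; // end of file reached before the buffer was full
            }
            pos += n;
        }
    }

    /** Writes every remaining byte of src starting at the given file offset. */
    synchronized void write(ByteBuffer src, long pos) throws IOException {
        while (src.hasRemaining()) {
            pos += channel.write(src, pos);
        }
    }
}
```

Since both methods lock the same instance, a loop that needs several iterations cannot be interleaved with another thread's read or write, provided every thread goes through this one object.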
In contrast, if you have one FileChannel shared by many threads, you can simply use synchronization as I do above to guarantee that each read or write will be uninterrupted. Synchronization is not the complex beast it's often made out to be if you just use it. It's true that in some situations synchronization can be a performance bottleneck. But as has been frequently observed here, performance need not be a significant concern for this assignment.
I acknowledge that most of my concerns here are very unlikely to manifest as observable problems in this assignment. And if someone wants to bypass synchronization because they consider the risks to be acceptably low for their purposes, that's fine, as an informed decision. But if people believe that there are actual guarantees that these problems cannot occur, that's what I disagree with.
I suspect that some of the omissions in the spec (esp. for partial read/writes) are just sloppy writing. And so maybe we really can all breathe easy without synchronization, if only they'd fix the specs.
[ July 22, 2003: Message edited by: Jim Yingst ]
 
Philippe Maquet
Bartender
Posts: 1872
Hi Max,

The point here is that there is only one way to get unstable data, and that is to use the FileChannel.read(byte[],position) method in an environment that allows multiple threads to access the same FileChannel. As I recall, you're not doing this, correct?


Does it mean that if I have only one thread which performs all writes (they should be blocking, right ?) and multiple threads which use the FileChannel.read(byte[],position), I am OK ?
Thanks,
Phil.
 
Vlad Rabkin
Ranch Hand
Posts: 555
Hi Max,
Hi Jim,
[Max] You'll notice that FileChannel.write(byte[],position) can change file size.
Well, I thought about that 100 times, but what does it exactly mean???
Let's say I write with FileChannel.write(bytebuffer, position), where bytebuffer has only 3 bytes, at the beginning of the file (so I am sure that this operation will not change the size of the file). Theoretically each write can change the size of the file, but in practice this one will NOT. So, is atomicity guaranteed or not???
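The two cases in this question can be observed with a small probe. This is only a sketch: the class WriteSizeProbe and its method are hypothetical, and it shows whether a given positional write changes the file size, not whether the write is atomic:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical helper for observing whether a positional write changes the file size. */
final class WriteSizeProbe {

    /** Writes data at pos and returns how many bytes the file grew by (0 if its size is unchanged). */
    static long growthAfterWrite(FileChannel channel, byte[] data, long pos) throws IOException {
        long before = channel.size();
        ByteBuffer src = ByteBuffer.wrap(data);
        long offset = pos;
        while (src.hasRemaining()) {
            offset += channel.write(src, offset);
        }
        return channel.size() - before;
    }
}
```

On a 10-byte file, writing 3 bytes at position 0 leaves the size alone, while writing 3 bytes at position 20 grows the file to 23 bytes. Which of the two happens depends on the position, the buffer and the current file size, all of which are only known when the call is made.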

Max, I believe you that FileChannel is a much more powerful tool than "old" streams, but as Jim said, the specification is so unclear that I feel much more comfortable explicitly synchronizing read/write.
Here is a very interesting link from IBM:
http://www-106.ibm.com/developerworks/xml/library/x-wxxm10.html

NIO offers a less abstract API. For example, with Java IO, you need not worry about buffer management but you have no control over it either. NIO gives you more control over buffer management -- by letting you run it! Arguably, it is more efficient but it is also more complex.

Vlad
 
Bob Reeves
Ranch Hand
Posts: 64
Hi All:
Not another post on the FileChannel! Sorry, but I want to add one detail.
Vlad's comment about whether writing three bytes to the start of a long file might not cause a block seems reasonable. Not to soapbox, but perhaps it's not good to microread text that's not professionally written. The javadoc for FileChannel says something like any operation that can change the file length will block. It doesn't say "any operation that changes the file" will block. Maybe this is the emphasis the author intended. The purpose might be to increase the performance of write operations by permitting concurrency.
To seek an answer, I decompiled FileChannelImpl (then IOUtil then FileDispatcher). What I found was that all the position methods have a call to the JVM Monitor in FileChannelImpl, but none of the write methods do. Interesting.
Checking IOUtil then FileDispatcher, neither of these embed calls to JVM Monitor either. Interesting.
My conclusion is that the blocking for file length change is in the native code. I'd guess that the native code would block to protect itself, not the user.
Sorry to belabor this point, but I just couldn't resist.
Tx
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Philippe Maquet:
Hi Max,

Does it mean that if I have only one thread which performs all writes (they should be blocking, right ?) and multiple threads which use the FileChannel.read(byte[],position), I am OK ?
Thanks,
Phil.



As far as the write operation is concerned, yes.
M
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Jim Yingst:
Hello again, Max.
You'll notice that FileChannel.write(byte[],position) can change file size.
Good catch - I had overlooked that. So OK, among read/write methods only read(ByteBuffer, long) seems to be eligible to proceed concurrently. Which is still enough to create problems when multiple threads access a FileChannel using that method, if some sort of write() is being performed on the section of the file being read. If the write is changing XXXXXXXXXX to OOOOOOOOOO, the read could theoretically see something like XXXOOOOOOO. It's fairly unlikely, and maybe impossible on many/all platforms for all I know. But it's exceedingly annoying to me that the spec leaves this hole in its guarantees.


It depends on how you look at it. It could be a hole, or it could be the one way in which you can get right-now-dammit checking of the state of the file. There are times when that's exactly what you want. I think the wise folks @ Sun wanted to give you a way of checking under the hood @ your own peril, if you wanted to. Think of it this way: wouldn't you be ticked off if there was no way to do this at all? 7/8 of the methods are safe. The last is at your own risk.


Reads, all of them, block too. However, the read(byte[],position) method only blocks for other reads.
Eh? Where did that come from? I don't see why read(ByteBuffer, long) (assuming that's what you meant) would need to block for anything. Could be missing something again...


If you follow the documentation into its obscure depths, you'll end up at http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/ReadableByteChannel.html


The point here is that there is only one way to get unstable data, and that is to use the FileChannel.read(byte[],position) method in an environment that allows multiple threads to access the same FileChannel.
Well, there are other problems if you use a shared FileChannel using implicit-position methods, because there can be interruptions between setting the position and reading or writing. I gave an example of this earlier in this thread. This won't lead to reading XXXOOOOOOO when XXXXXXXXXX or OOOOOOOOOO is expected - but it might lead to reading YYYYYYYYYY (a different record entirely) instead.


True, but the XXXOOOOOOO point was the one I was clearing up.


Unless, of course, we use synchronization. Which is actually pretty simple. Or, unless you use separate FileChannels as you prefer.
As I recall, you're not doing this [sharing FileChannels], correct?
Who - Vlad? Dunno, I've lost track. But there are a number of different people in this thread, and I know that some (like myself) aren't currently doing the just-in-time thing. The FileChannel question was first raised in this thread by S Bala, who seemed to be talking about multiple threads accessing a FileChannel, as are a number of the other people in this thread. If you want to talk about how FileChannel can be used safely in the context of just-in-time FileChannel creation, that's fine - but I don't think this context has been clearly established in this thread for a lot of these comments. Many of the people reading this thread will not assume just-in-time creation unless it's explicitly stated.


I thought I did explicitly state it?


For those people, I say don't think of FileChannel's operations as "atomic" because while many are, some are not, and there are just enough holes in the system to screw you up if you're not careful.


I think what's more insidious is that you're creating multiple nested locks by synchronizing on various objects. While efficiency is not a consideration on this project, complexity is huge. I'm trying to help people see some of the costs of their decisions. OK, you want to keep a FileChannel around in case you need it: but are you aware of the fact that you'll have to synchronize on it in order to use it safely? And did you know that this nests locks, thus increasing complexity and the opportunity for deadlock? If yes, then OK. But if not, then there are other ways.


In contrast, synchronization is a simple and easy option, IMO:


It's simple and easy, but it adds the spectre of nested locks. If you're a safe-threading advocate, as I am, you tremble at the mere thought of nested locks.


Note that I'm still putting those reads and writes in loops. That's because of the other problem with FileChannel, mentioned in several other threads above. Even if reads / writes are atomic, there's no actual guarantee (that I can find) that a read or write will actually use all the bytes we expect.
That is, if there's still space in a buffer, and the file still has unread bytes, a read() may nonetheless return prematurely, without filling the buffer. At least, the API seems to imply this, and offers no guarantee otherwise.


This is an incorrect statement, as shown at http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/ReadableByteChannel.html. Only one of the read methods is susceptible to this (potentially; it's arguable that it's not so susceptible).


Now, I haven't been able to actually observe this problem in testing. Maybe it never actually comes up. I know it comes up with FileInputStreams fairly often, so I tend to assume it's possible here unless guaranteed otherwise. And I would argue that if partial reads/writes are possible, then they actually destroy the safety of just-in-time FileChannels. It doesn't matter if each read() or write() is atomic, if it's possible for the method to complete without delivering all the intended bytes. You can put the method in a loop as I do above - but without synchronization, there's no way to prevent interruption between loop iterations.


Partial writes are not possible, and only one read doesn't explicitly guarantee safety.


And even with synchronization, if you're using just-in-time FileChannels then each FileChannel is a separate instance. You'd need to sync on some shared monitor. A static variable or some such.


Again, I believe this is incorrect and misleading. Since writes block until complete, there's no need for an extraneous lock.



In contrast, if you have one FileChannel shared by many threads, you can simply use synchronization as I do above to guarantee that each read or write will be uninterrupted. Synchronization is not the complex beast it's often made out to be if you just use it. It's true that in some situations synchronization can be a performance bottleneck. But as has been frequently observed here, performance need not be a significant concern for this assignment.
I acknowledge that most of my concerns here are very unlikely to manifest as observable problems in this assignment. And if someone wants to bypass synchronization because they consider the risks to be acceptably low for their purposes, that's fine, as an informed decision.


Well, let's make sure we've considered all the details here. All writes, and 3/4 of the reads, act atomically, so if you know your API, there's no reason to extraneously synchronize.
Remember, synchronization is not a magic bullet. It can cause more problems than it solves. If you're nesting locks, and one gets swallowed, then you're bankrupt.


But if people believe that there are actual guarantees that these problems cannot occur, that's what I disagree with.


There are guarantees Jim, you just have to read the fine, fine print
All best,
M
[ July 24, 2003: Message edited by: Max Habibi ]
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Bob Reeves:
Hi All:
Not another post on the FileChannel! Sorry, but I want to add one detail.
Vlad's comment about whether writing three bytes to the start of a long file might not cause a block seems reasonable. Not to soapbox, but perhaps it's not good to microread text that's not professionally written. The javadoc for FileChannel says something like any operation that can change the file length will block. It doesn't say "any operation that changes the file" will block. Maybe this is the emphasis the author intended. The purpose might be to increase the performance of write operations by permitting concurrency.
Tx


But the problem with Vlad's point is that the compiler would have to examine your code, and know that you're only doing 3 bytes. Since it can't reasonably be expected to do this, it has to adhere to the specification. That doesn't mean that all methods always do so, but it's important to be clear that the method can't pick and choose when to adhere to spec and when not to.
Also, remember that the Javadoc was written by engineers (good ones) for engineers. I'm guessing that when they say that something functions a certain way, they really believe that it does.
M
[ July 24, 2003: Message edited by: Max Habibi ]
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Vlad Rabkin:
Hi Max,
Hi Jim,
[Max] You'll notice that FileChannel.write(byte[],position) can change file size.
Well, I thought about that 100 times, but what does it exactly mean???
Let's say I write with FileChannel.write(bytebuffer, position), where bytebuffer has only 3 bytes, at the beginning of the file (so I am sure that this operation will not change the size of the file). Theoretically each write can change the size of the file, but in practice this one will NOT. So, is atomicity guaranteed or not???
Vlad


Hi Vlad,
I'm glad you're getting so into this. The atomicity is guaranteed, because the spec says that any operation that may change file size is atomic. The compiler can't examine your code and figure out that you're not really going to be changing file size. When you ask for a writable FileChannel, and you call write on it, it blocks. That's what the language specifications say, anyway. And if you trust the language specs written by Sun, then Sun can't very well fault you if the specs go wrong.
Or think of it this way: if you've never seen this fail (and I'll bet that you haven't), then there's no reason to assume that the spec is anything other than correct. So then, the only discussion point is, does the spec say that it's not atomic? I don't believe so. However, if you see a place where it seems to do so, then you shouldn't use it. Does that seem reasonable?
btw - I tend to take a hyper mathematical/engineering view of these sorts of things. My father is a Ph.D in math, so logical consistency is really important to me (typical dinner conversation around the Habibi household was proving that 1 was not equal to 0). Thus, I'm somewhat overzealous in my logical premisemenship.
M
 
Bob Reeves
Ranch Hand
Posts: 64
Hi Max:
This responds to your comment:

But the problem with Vlad's point is that the compiler would have to examine your code, and know that you're only doing 3 bytes. Since it can't reasonably be expected to do this, it has to adhere to the specification. That doesn't mean that all methods always do so, but it's important to be clear that the method can't pick and choose when to adhere to spec and when not to.


The motivation for my post was a decompile of FileChannelImpl, which is the type of the object returned from RandomAccessFile's getChannel.
The decompile of FileChannelImpl shows no embedded JVM Monitor enter/JVM Monitor exit calls in any of the write methods. But all the position methods have them. Now FileChannelImpl calls IOUtil and FileDispatcher. Again neither of these contain Monitor calls in their write methods.
So, I think you can see why I conclude that the block on write must be in the native code. I'd also opine that native code protects itself on block (i.e. so that its file-on-disk information remains valid), not the user. Thus my conclusion.
Notice there is no customization of the compiled code for short buffers. Of course, the compiler couldn't do that! I think the native code looks at the buffer length, and branches according to file length. (Actually, the only time it must block is if the truncate method is called after the write. I'm guessing truncate isn't called frequently, so it might make sense to optimize this way.)
Tx
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Bob Reeves:
Hi Max:
This responds to your comment:

The motivation for my post was a decompile of FileChannelImpl, which is the type of the object returned from RandomAccessFile's getChannel.
The decompile of FileChannelImpl shows no embedded JVM Monitor enter/JVM Monitor exit calls in any of the write methods. But all the position methods have them. Now FileChannelImpl calls IOUtil and FileDispatcher. Again neither of these contain Monitor calls in their write methods.
So, I think you can see why I conclude that the block on write must be in the native code. I'd also opine that native code protects itself on block (i.e. so that its file-on-disk information remains valid), not the user. Thus my conclusion.
Notice there is no customization of the compiled code for short buffers. Of course, the compiler couldn't do that! I think the native code looks at the buffer length, and branches according to file length. (Actually, the only time it must block is if the truncate method is called after the write. I'm guessing truncate isn't called frequently, so it might make sense to optimize this way.)
Tx



It might very well be in the native code: however, the compiler, when dealing with a method like

can't know beforehand what the size of the ByteBuffer will be, nor can it know, just because you allocated a ByteBuffer, that you'll be writing it. For that matter, in order to know what the actual size of the file was before it started writing, the FileChannel would have to synchronize on the file (in case another FileChannel is writing or deleting from it), in order to make the sort of claim that the language spec makes about writes being atomic: there is no other way for it to make that claim. Thus, the logical sequence must be either
1. lock down writes
2. read current file size
3. write content
4. release lock
or
1. lock down writes
2. read current file size
3. (since this particular write doesn't change the size of the file)
4. release lock
5. write content

Now, that latter example, while it might be optimized, still allows another thread to theoretically sneak in, write over the intended area, and sneak out (between steps 4 and 5). Thus, it allows a hole in which writes are not thread safe. I'm pretty sure that if I can pick this up, the smart fellows at Sun can too.
Thus, the FileChannel is forced to lock on writes, since it guarantees exclusivity. Now, if it does so in Native code or not is immaterial. It's still obligated to do so, or be in violation of the spec.
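As a sketch only, the first sequence above might look like this in user-level Java. It is a hypothetical illustration of the argument (the class LockedWriter and its ReentrantLock are assumptions) and says nothing about how FileChannelImpl or its native code is actually written:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration of the first sequence; not how FileChannelImpl is actually written. */
final class LockedWriter {
    private final ReentrantLock writeLock = new ReentrantLock();
    private final FileChannel channel;

    LockedWriter(FileChannel channel) {
        this.channel = channel;
    }

    /** Returns true if the write grew the file. The lock is held across the size check and the write. */
    boolean write(ByteBuffer src, long pos) throws IOException {
        writeLock.lock();                          // 1. lock down writes
        try {
            long sizeBefore = channel.size();      // 2. read current file size
            while (src.hasRemaining()) {           // 3. write content
                pos += channel.write(src, pos);
            }
            return channel.size() > sizeBefore;
        } finally {
            writeLock.unlock();                    // 4. release lock
        }
    }
}
```

Moving the unlock ahead of the write loop would give the second sequence, with the gap between release and write that the paragraph above describes.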
All best,
M
 
Jim Yingst
Wanderer
Posts: 18671
Hola, Max!
[Jim]: Many of the people reading this thread will not assume just-in-time creation unless it's explicitly stated.
[Max]: I thought I did explicitly state it?

OK right, yes you did; sorry. The way I read it, when you said "Well, let's think about this a bit. If a given FileChannel is being opened on a Just in time basis..." you seemed to imply that just-in-time FileChannels were what we had been discussing in the first place. Or at least that they were relevant. To me, they weren't - you were bringing up a separate and unrelated topic. If we create separate FileChannels for each thread, the FileChannels are "thread safe" - but that's true for any class, isn't it? (Well yes, there's a question over whether the underlying file is being updated safely, and I'll concede that FileChannel does seem to handle this where previous java.io classes did not.) I thought we were talking about whether and how a single FileChannel could be accessed by multiple threads, so I was confused by your response. Personally I'm still interested in the one-FileChannel-many-threads question, so I've now created this new topic for that particular aspect of the discussion. For this thread I'm going to ignore just-in-time one-thread-one-FileChannel approaches as irrelevant to the issue I've been talking about. Though I will certainly acknowledge they're a valid alternate approach to the Developer Cert assignment. (I don't entirely understand what guarantees are made for some aspects of their behavior, but that's another discussion.)
 
Jim Yingst
Wanderer
Posts: 18671
[Max]: Reads, all of them, block too. However, the read(byte[],position) method only blocks for other reads.
[Jim]: Eh? Where did that come from? I don't see why read(ByteBuffer, long) (assuming that's what you meant) would need to block for anything. Could be missing something again...
[Max]: if you follow the documentation into its obscure depths, you'll end up at http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/ReadableByteChannel.html

Ah yes, the last paragraph of read()'s description. I had read that before and forgot about it. Thanks. You're right, reads are mutually atomic.
Which is funny for FileChannel since concurrent reads with read(ByteBuffer, long) wouldn't have any effect on each other anyway, which is why I wasn't really paying attention to that part of the spec. I guess ReadableByteChannel's guarantee might be useful for other types of channels, or if I were using the implicit-position reads maybe.
[Jim]: But it's exceedingly annoying to me that the spec leaves this hole in its guarantees.
[Max]: It depends on how you look at it. It could be a hole, or it could be the one way in which you can get right-now-dammit checking of the state of the file. There are times when that's exactly what you want. I think the wise folks @ Sun wanted to give you a way of checking under the hood @ your own peril, if you wanted to. Think of it this way: wouldn't you be ticked off if there was no way to do this at all?

That's true, there may be times when this sort of approach is useful. But if that was their intent here, I think they did a shoddy job of documenting it. If read(ByteBuffer, long) is supposed to be the one exception to the general rule, why not say so in the documentation for that method? The whole section about an "operation that involves the channel's position or can change its file's size" is needlessly vague. Would've been much better to identify which methods really blocked, and which did not. Furthermore, if they wanted to make a special read-now-dammit exemption for read(ByteBuffer, long), why did they leave intact the restriction about reads blocking each other? Remember, read(ByteBuffer, long) didn't inherit the ReadableByteChannel API for read(ByteBuffer) - they chose to say, in words, that the one method behaved like the other. They could have easily added "except for the fact that read(ByteBuffer, long) need not block while another read operation is taking place." It seems that if Sun really wanted to make a method that just does a read ASAP, they did a poor job of it. As such, I'm not really inclined to assume that what they said == what they meant. If it's possible that the implementation was shoddy, it's also possible the documentation was shoddy; perhaps the implementation is really perfect and they just did a poor job of explaining it. Maybe FileChannels really are thread-safe, but I'm having a hard time seeing that from the API they've provided.
[Max]: 7/8 of the methods are safe. The last is at your own risk.
Maybe. And, if we're trying to write code to access a FileChannel from multiple threads, then 6/8 of those methods are pretty much useless anyway. (Unless you synchronize in which case you provide your own "thread safety".) Because all the implicit-position methods suffer from the fact that they have no control over whether another thread may interrupt in between setting the position() and doing the read() or write(). As shown in my second post above. (Not counting the intrusion into S Bala's post.)
I suppose that there might be some ways to use the implicit-position methods in a multi-thread context without synchronization, if we never use position() at all, and we have multiple threads doing reads on "whatever is at the current position" and we don't care which thread reads what. There are some applications I can imagine where this might be useful. So OK, replace "pretty much useless" above with "useless for random access to a file".
If you want to do a random-access read on a file without requiring synchronization, then read(ByteBuffer, long) is the one method that you want to be atomic. The other three read methods are no good anyway. So it's particularly annoying that this is the one method that has a non-atomic exception, if there's also a write(ByteBuffer, long) being executed somewhere. Hence my frustration with the FileChannel API.
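To make the explicit-position versus implicit-position distinction concrete, here is a minimal sketch, assuming fixed-length records. The class name, RECORD_LENGTH, and the temp file are hypothetical, and FileChannel.open is a Java 7+ convenience that postdates this thread. It only shows that read(ByteBuffer, long) takes its offset as an argument and leaves the channel's own position alone; it does not prove anything about atomicity against concurrent writes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class PositionalReadDemo {
    static final int RECORD_LENGTH = 4; // hypothetical fixed record size

    /** Reads one record with the explicit-position read; the channel's position is never touched. */
    static String readRecord(FileChannel channel, int recNo) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_LENGTH);
        long start = (long) recNo * RECORD_LENGTH;
        // The racy alternative would be channel.position(start) followed by
        // channel.read(buf): another thread could slice in between the two calls.
        while (buf.hasRemaining()) {
            if (channel.read(buf, start + buf.position()) < 0) {
                break; // end of file
            }
        }
        buf.flip();
        return StandardCharsets.US_ASCII.decode(buf).toString();
    }

    /** Returns the record plus the channel position before and after the read. */
    static String demo() {
        try {
            Path file = Files.createTempFile("records", ".db");
            try {
                Files.write(file, "AAAABBBBCCCC".getBytes(StandardCharsets.US_ASCII));
                try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                    long before = channel.position();
                    String record = readRecord(channel, 2);
                    long after = channel.position();
                    return record + " " + before + " " + after;
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "CCCC 0 0"
    }
}
```

Because the offset travels with the call, no second thread can change where this read lands, which is why this is the one read method worth sharing across threads for random access.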
[Jim]: In contrast, synchronization is a simple and easy option, IMO:
[Max]: It's simple and easy, but it adds the spectre of nested locks. If you're a safe threading advocate, as I am, you tremble at the mere thought of nested locks.

Not really. I make the synchronized blocks nice and small, and they don't call any code that could acquire another lock. And I've now modified the code to sync on a private instance variable (the FileChannel) to which no other thread could possibly have a reference. Hence no other thread could have a lock on the FileChannel instance unless they've gone through the read(int) or write(int) method of my class. In which case if another thread is doing a read or write, other threads must wait until the operation is completed and the lock is released; that's the point of the synchronization after all. Note that syncing on the FileChannel has nothing at all to do with the wait/notify protocol used by the lock() method; that's at another level entirely, and it doesn't really matter if a thread is currently holding other locks when they call read(int) or write(int), as long as the FileChannel sync is acquired and released without attempting to acquire any other locks. The only way I can see for deadlock to occur here is for a junior programmer to come in and insert some new code inside those tiny sync blocks of read(int) or write(int) which calls some other method that needs a lock not currently held. Which could happen I suppose, but there's only so much programmer stupidity I feel compelled to defend against. If the junior programmers want to insert lock-requiring code in random places without knowing what they're doing, they can screw up any program I write, regardless of what I do. I content myself with making it so they'd really have to work to create deadlock; I can't make it impossible.
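The approach described above might look roughly like the following sketch. This is not the actual assignment code: the class name, the 8-byte record length, and the write(int, byte[]) signature are assumptions made for illustration. The FileChannel is a private field, each synchronized block contains only the raw I/O call, and nothing inside a block tries to acquire another lock.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class SyncedRecordFile implements AutoCloseable {
    private static final int RECORD_LENGTH = 8; // hypothetical fixed record size
    private final FileChannel channel;          // private, so outside code cannot lock on it

    SyncedRecordFile(Path file) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE);
    }

    byte[] read(int recNo) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_LENGTH);
        long start = (long) recNo * RECORD_LENGTH;
        synchronized (channel) { // small block: raw I/O only, no other locks acquired
            while (buf.hasRemaining()) {
                if (channel.read(buf, start + buf.position()) < 0) {
                    break; // end of file
                }
            }
        }
        return Arrays.copyOf(buf.array(), buf.position());
    }

    void write(int recNo, byte[] data) throws IOException {
        if (data.length != RECORD_LENGTH) {
            throw new IllegalArgumentException("record must be " + RECORD_LENGTH + " bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(data);
        long start = (long) recNo * RECORD_LENGTH;
        synchronized (channel) { // same monitor as read(), so the two never overlap
            while (buf.hasRemaining()) {
                channel.write(buf, start + buf.position());
            }
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    /** Writes record 1 and reads it back. */
    static String demo() {
        try {
            Path file = Files.createTempFile("synced", ".db");
            try (SyncedRecordFile db = new SyncedRecordFile(file)) {
                db.write(1, "RECORD01".getBytes(StandardCharsets.US_ASCII));
                return new String(db.read(1), StandardCharsets.US_ASCII);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "RECORD01"
    }
}
```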
[Jim]: Even if reads / writes are atomic, there's no actual guarantee (that I can find) that a read or write will actually use all the bytes we expect.
That is, if there's still space in a buffer, and the file still has unread bytes, a read() may nonetheless return prematurely, without filling the buffer. At least, the API seems to imply this, and offers no guarantee otherwise.
[Max]: This is an incorrect statement, as shown http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/ReadableByteChannel.html.

OK, the write method does have an explicit guarantee; the read does not. (I actually had written the write() without a loop, and the read() with, but I hadn't put in comment on why I did that, so just before posting the code here I had quickly re-edited the write() to use a loop too, for consistency. That's what I get for failing to comment, and then for rereading too quickly later.)
[Max]: Only one of the read methods is susceptible to this( potentially; it's arguable that it's not so susceptible).
Taking the latter part first: I agree that it's possible read() isn't really susceptible to this problem at all. I haven't been able to demonstrate it in various tests I've run. But (a) I'm only testing on Windows 2000 Pro and XP Home at the moment, and (b) testing aside, the spec is still too vague.
As for "only one of the read methods is susceptible" - I disagree. Which one are you referring to? All four ultimately seem to inherit the same behavior specified by ReadableByteChannel's read(ByteBuffer), which does not guarantee how many bytes will be read. We can see what such a guarantee would look like by looking at the write(ByteBuffer) method: "Unless otherwise specified, a write operation will return only after writing all of the r requested bytes." Nice and clear. In contrast, read(ByteBuffer) says "A read operation might not fill the buffer, and in fact it might not read any bytes at all. Whether or not it does so depends upon the nature and state of the channel." Gee, thanks for clarifying. :roll: It also says later "a file channel cannot read any more bytes than remain in the file." True, but that doesn't say it will read bytes just because they do remain in the file. So, where is there some indication that three of the read() methods are immune to partial reads?
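Since the read contract promises no particular byte count, the defensive move is to loop on the return value. A minimal sketch of such a loop follows; readFully is a hypothetical helper name, not part of the FileChannel API, and the temp file and offsets are made up for the demonstration.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ReadFully {
    /** Reads at the given file offset until dst is full; throws EOFException if the file ends first. */
    static void readFully(FileChannel channel, ByteBuffer dst, long position) throws IOException {
        long pos = position;
        while (dst.hasRemaining()) {
            int n = channel.read(dst, pos); // may legally return fewer bytes than requested
            if (n < 0) {
                throw new EOFException(dst.remaining() + " bytes still expected at end of file");
            }
            pos += n;
        }
    }

    /** Reads 6 bytes at offset 2, then tries 6 bytes at offset 8 of a 10-byte file. */
    static String demo() {
        try {
            Path file = Files.createTempFile("partial", ".db");
            try {
                Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
                try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                    ByteBuffer buf = ByteBuffer.allocate(6);
                    readFully(channel, buf, 2);
                    buf.flip();
                    String first = StandardCharsets.US_ASCII.decode(buf).toString();
                    String second;
                    try {
                        readFully(channel, ByteBuffer.allocate(6), 8);
                        second = "filled";
                    } catch (EOFException e) {
                        second = "EOF";
                    }
                    return first + " " + second;
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "234567 EOF"
    }
}
```

If partial reads never actually happen on a given platform, the loop simply runs once, so it costs almost nothing to include.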
----
Lastly, I would be remiss if I didn't note that FileChannel's API does say, nice and clearly, "File channels are safe for use by multiple concurrent threads." Which sure sounds reassuring. If we take this one statement at face value, great, no worries. But careful reading of the other statements made reveals potential problems. "Safe" seems to have a very vague meaning here. For comparison, the Collections.synchronizedList() method (and similar ones) refer to the returned object as "thread-safe" but then go on to point out that you must explicitly synchronize while iterating. They don't say "thread-safe except for...", they just say "thread-safe". So I take "thread-safe" to mean "we thought about thread safety while designing this, and you should read the rest of the API to figure out what you can and can not do." In the case of the Collections class, this is made pretty clear; in the case of FileChannel, it's not.
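For reference, the Collections.synchronizedList() caveat mentioned above looks like this in practice. A minimal sketch (the class and method names are hypothetical): single calls such as add() are synchronized by the wrapper, but a traversal spans many calls, so the caller has to hold the list's monitor for the whole loop.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class SynchronizedListIteration {
    /** Sums the list while holding its monitor, as the synchronizedList documentation requires. */
    static int sum(List<Integer> syncList) {
        int total = 0;
        synchronized (syncList) { // without this, a concurrent add() could break the iteration
            for (int value : syncList) {
                total += value;
            }
        }
        return total;
    }

    static int demo() {
        List<Integer> list = Collections.synchronizedList(new ArrayList<>());
        for (int i = 1; i <= 4; i++) {
            list.add(i); // individually synchronized by the wrapper
        }
        return sum(list);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 10
    }
}
```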
So in sum: FileChannels are very cool. But I don't trust their claims of thread safety, and talk of the "atomic" nature of FileChannel methods really needs to include quite a few caveats. The simplest way to describe the situation, IMO, is to say that FileChannels are not inherently thread safe, but they work great if you either synchronize access to key methods, or create a separate FileChannel for each thread that wants one.
Or - maybe they are really thread-safe, and we just need someone at Sun to edit the API so that this safety is actually guaranteed. Until that happens though (or until I see convincing arguments otherwise), I'm assuming they're not thread-safe.
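For completeness, the other option named above (a separate FileChannel for each thread) can be sketched as follows. The class name, record length, and temp file are assumptions for illustration, and FileChannel.open is a Java 7+ API. Because each channel is confined to one thread, even the implicit-position position()/read() pair is safe here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChannelPerThread {
    static final int RECORD_LENGTH = 4; // hypothetical fixed record size

    /** Opens a fresh channel, reads one record, and closes the channel again. */
    static String readRecord(Path file, int recNo) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(RECORD_LENGTH);
            channel.position((long) recNo * RECORD_LENGTH); // no other thread can see this channel
            while (buf.hasRemaining() && channel.read(buf) >= 0) {
                // keep reading until the buffer is full or the file ends
            }
            buf.flip();
            return StandardCharsets.US_ASCII.decode(buf).toString();
        }
    }

    /** Three threads each read a different record through their own channel. */
    static String demo() {
        try {
            Path file = Files.createTempFile("perthread", ".db");
            try {
                Files.write(file, "AAAABBBBCCCC".getBytes(StandardCharsets.US_ASCII));
                String[] results = new String[3];
                Thread[] threads = new Thread[3];
                for (int i = 0; i < threads.length; i++) {
                    final int recNo = i;
                    threads[i] = new Thread(() -> {
                        try {
                            results[recNo] = readRecord(file, recNo);
                        } catch (IOException e) {
                            results[recNo] = "error";
                        }
                    });
                    threads[i].start();
                }
                for (Thread t : threads) {
                    t.join(); // join() makes each thread's result visible here
                }
                return String.join(",", results);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "AAAA,BBBB,CCCC"
    }
}
```

This sidesteps the shared-channel questions entirely, at the cost of opening and closing a channel per operation; it says nothing about whether concurrent writers to the same file region are coordinated.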
[ July 26, 2003: Message edited by: Jim Yingst ]
 
Max Habibi
town drunk
( and author)
Posts: 4118


[Max]: Reads, all of them, block too. However, the read(byte[],position) method only blocks for other reads.
[Jim]: Eh? Where did that come from? I don't see why read(ByteBuffer, long) (assuming that's what you meant) would need to block for anything. Could be missing something again...
[Max]: if you follow the documentation into its obscure depths, you'll end up at http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/ReadableByteChannel.html

Ah yes, the last paragraph of read()'s description. I had read that before and forgot about it. Thanks. You're right, reads are mutually atomic.


No problem, the documentation could be more clear.


[Jim]: But it's exceedingly annoying to me that the spec leaves this hole in its guarantees.
[Max]: It depends on how you look at it. It could be a hole, or it could be the one way in which you can get right-now-dammit checking of the state of the file. There are times when that's exactly what you want. I think the wise folks @ Sun wanted to give you a way of checking under the hood @ your own peril, if you wanted to. Think of it this way: wouldn't you be ticked off if there was no way to do this at all?

That's true, there may be times when this sort of approach is useful. But if that was their intent here, I think they did a shoddy job of documenting it.


I can see where you might have this opinion. Certainly the documentation could be more clear.


It seems that if Sun really wanted to make a method that just does a read ASAP, they did a poor job of it.


I'm not sure how you're drawing this conclusion. To say that it's not well documented is one thing: to say that it's badly implemented would require intimate knowledge of the code, which I lack.


As such, I'm not really inclined to assume that what they said == what they meant. If it's possible that the implementation was shoddy, it's also possible the documentation was shoddy; perhaps the implementation is really perfect and they just did a poor job of explaining it. Maybe FileChannels really are thread-safe, but I'm having a hard time seeing that from the API they've provided.
[Max]: 7/8 of the methods are safe. The last is at your own risk.
Maybe. And, if we're trying to write code to access a FileChannel from multiple threads, then 6/8 of those methods are pretty much useless anyway.
(Unless you synchronize in which case you provide your own "thread safety".) Because all the implicit-position methods suffer from the fact that they have no control over whether another thread may interrupt in between setting the position() and doing the read() or write(). As shown in my second post above. (Not counting the intrusion into S Bala's post.)


I think you mean slice in, not interrupt, unless I'm not following you @ all, which is possible. As for thread safety, I think it depends on your needs. Remember, most of Java isn't thread safe: Java assumes that you don't want thread safety unless you explicitly ask for it. FileChannels are thread safe enough, in that you don't get data in an inconsistent state when you read or write. Again, this is completely consistent with everything else in Java, whether it includes IO or not. If you're sharing an Object across several threads and you need thread safety, then you have to provide it yourself. If you don't need that safety, then you don't provide it.
The important point with FileChannels is that reads and writes are atomic: thus, you wet call integrity between invocations to these methods. I thought that was your concern?


I suppose that there might be some ways to use the implicit-position methods in a multi-thread context without synchronization, if we never use position() at all, and we have multiple threads doing reads on "whatever is at the current position" and we don't care which thread reads what. There are some applications I can imagine where this might be useful. So OK, replace "pretty much useless" above with "useless for random access to a file".


Make it 'useless if you require 100% clean reads', and it will be a fair statement. Otherwise, the implication is that only clean reads are useful, which is obviously (IMO), false.


If you want to do a random-access read on a file without requiring synchronization, then read(ByteBuffer, long) is the one method that you want to be atomic. The other three read methods are no good anyway.


Why? This only applies if you're doing random access read on a file, and keeping the connection open after the method's needs, and you're sharing the FileChannel across various threads, and you're demanding clean reads at all times. Relatively speaking, it's rare that I've found myself in this particular set of circumstances.


So it's particularly annoying that this is the one method that has a non-atomic exception, if there's also a write(ByteBuffer, long) being executed somewhere. Hence my frustration with the FileChannel API.


I can understand your frustration, and I can see where you might feel that the documentation could be more robust. However, in fairness, I think they (the Sun guys) did a pretty good job of implementing and documenting FileChannels.


[Jim]: In contrast, synchronization is a simple and easy option, IMO:
[Max]: It's simple and easy, but it adds the spectre of nested locks. If you're a safe threading advocate, as I am, you tremble at the mere thought of nested locks.

Not really. I make the synchronized blocks nice and small, and they don't call any code that could acquire another lock.


As I understood it, you're locking on the recno, then you're locking on the FileChannel monitor, correct? Regardless of the size of the second locking block, you're still nesting locks: one for the record, and another for the FileChannel. This is what makes me tremble.



[Jim]: Even if reads / writes are atomic, there's no actual guarantee (that I can find) that a read or write will actually use all the bytes we expect.
That is, if there's still space in a buffer, and the file still has unread bytes, a read() may nonetheless return prematurely, without filling the buffer. At least, the API seems to imply this, and offers no guarantee otherwise.
[Max]: This is an incorrect statement, as shown http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/ReadableByteChannel.html.

OK, the write method does have an explicit guarantee; the read does not.


Again, all reads do, except for the one read(ByteBuffer, long). All the reads are fine.


Taking the latter part first: I agree that it's possible read() isn't really susceptible to this problem at all. I haven't been able to demonstrate it in various tests I've run. But (a) I'm only testing on Windows 2000 Pro and XP Home at the moment, and (b) testing aside, the spec is still too vague.


Do you mean the spec on the read(ByteBuffer, long), or other parts of the spec? If the former, I'd have to agree by definition, since an experienced programmer like yourself finds it vague. However, I think I find it less vague than you do. That may simply be because I'm too stupid to be as afraid as I should be.


As for "only one of the read methods is susceptible" - I disagree. Which one are you referring to? All four ultimately seem to inherit the same behavior specified by ReadableByteChannel's read(ByteBuffer), which does not guarantee how many bytes will be read. We can see what such a guarantee would look like by looking at the write(ByteBuffer) method: "Unless otherwise specified, a write operation will return only after writing all of the r requested bytes." Nice and clear. In contrast, read(ByteBuffer) says "A read operation might not fill the buffer, and in fact it might not read any bytes at all. Whether or not it does so depends upon the nature and state of the channel." Gee, thanks for clarifying. :roll:


Um, I don't think you read that last part of that, which says.

a file channel cannot read any more bytes than remain in the file. It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.

http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/ReadableByteChannel.html


----
Lastly, I would be remiss if I didn't note that FileChannel's API does say, nice and clearly, "File channels are safe for use by multiple concurrent threads." Which sure sounds reassuring. If we take this one statement at face value, great, no worries. But careful reading of the other statements made reveals potential problems.


Well, see above
Jim, I'm not really sure this is a SCJD discussion anymore (though it is an interesting one), nor am I sure how relevant it is to the forum per se. I'm going to knock off here, but I'll be happy to resume in an IO forum. Or I'll meet you @ the OK Corral.
M
 
Jim Yingst
Wanderer
Posts: 18671
[Max]: Jim, I'm not really sure this is a SCJD discussion anymore (though it is an interesting one), nor am I sure how relevant it is to the forum per se.
Fair enough. I think it is relevant to anyone sharing a FileChannel among threads, which is a number of people in SCJD - but it's really a general IO question as well. So now that I've extracted the FileChannel thread-safety posts, I'll move these to the IO forum...
 
Jim Yingst
Wanderer
Posts: 18671
[Jim]: It seems that if Sun really wanted to make a method that just does a read ASAP, they did a poor job of it.
[Max]: I'm not sure how you're drawing this conclusion. To say that it's not well documented is one thing: to say that it's badly implemented would require intimate knowledge of the code, which I lack.

Well I'm assuming that the implementation does obey the existing documentation. The documentation they provide needlessly prevents mutually concurrent reads, but allows reads concurrent with writes. Why one but not the other, if they were trying to make a fast read? Dunno what the implementation does here - if it obeys the doc, then it's the doc's fault the implementation is not as good as it could've been. Or if it ignores the docs here - well, I kinda approve in this case, because I think these particular docs need to be rewritten anyway. But in principle this is Wrong™; an implementation should obey its public API once that API has been officially released. As long as the API is at least somewhat usable - gotta live with the warts.

[Max]: 7/8 of the methods are safe. The last is at your own risk.
[Jim]: Maybe. And, if we're trying to write code to access a FileChannel from multiple threads, then 6/8 of those methods are pretty much useless anyway. (Unless you synchronize in which case you provide your own "thread safety".) Because all the implicit-position methods suffer from the fact that they have no control over whether another thread may interrupt in between setting the position() and doing the read() or write(). As shown in my second post above. (Not counting the intrusion into S Bala's post.)
[Max]: I think you mean slice in, not interrupt, unless I'm not following you @ all, which is possible.

Slice in is what I meant. Good catch.
[Max]: As for thread safety, I think it depends on your needs. Remember, most of Java isn't thread safe: Java assumes that you don't want thread safety unless you explicitly ask for it.
Right - and if they'd just not mentioned thread safety at all in the FileChannel API, I'd be perfectly happy, because I'd know that meant I had to take care of it myself if I wanted it. What annoys me is when they claim to offer thread safety, and then leave these poorly-documented holes lying around.
[Max]: FileChannels are thread safe enough, in that you don't get data in an inconsistent state when you read or write.
The holes I've been discussing certainly do allow inconsistent data to be read. At least, as far as guarantees offered. If you read a record at the same time as you write it, without explicit synchronization, there's no guarantee that you won't read a mixture of new and old data. Maybe this will never actually happen - but I have no way of knowing that from the API.

[Max]: If you're sharing an Object across several threads and you need thread safety, then you have to provide it yourself. If you don't need that safety, then you don't provide it.
The important point with FileChannels is that reads and writes are atomic: thus, you wet call integrity between invocations to these methods. I thought that was your concern?

Umm, "wet call"? Is that a typo, or just a term I haven't heard before?
The problems I've been citing are directly involved with individual reads and writes, being performed by different threads on the same FileChannel. That's been my concern all along. And one of my points has been that explicit-position reads and writes are not atomic - not when you mix reads and writes at least. And implicit-position methods may be atomic, but so what? If you can't supply the position in the same atomic method call, the atomicity doesn't extend far enough to really be useful. You get to atomically read or write to/from an unknown position. Whee, sounds like fun. Into the unknown we go... :roll:
[Jim]: I suppose that there might be some ways to use the implicit-position methods in a multi-thread context without synchronization, if we never use position() at all, and we have multiple threads doing reads on "whatever is at the current position" and we don't care which thread reads what. There are some applications I can imagine where this might be useful. So OK, replace "pretty much useless" above with "useless for random access to a file".
[Max]: Make it 'useless if you require 100% clean reads', and it will be a fair statement. Otherwise, the implication is that only clean reads are useful, which is obviously (IMO), false.

Talking about the problems with implicit-position methods, the problem is for both reads and writes, because at the instant you invoke write() you don't know if some other thread has just repositioned the channel at the start of a completely different record. (Unless, again, you either synchronize or don't share FileChannel instances, which makes the talk of FileChannel's "thread safety" irrelevant - you provide it yourself.)
I acknowledge that occasional dirty reads may well be perfectly acceptable. (As long as you're aware of the possibility and don't ever use a potentially dirty read as the basis for an update.) Dirty writes would be a big problem however, and they're a distinct possibility if you do unsynchronized writes using implicit-position methods while other threads may be changing the position.

[Jim]: If you want to do a random-access read on a file without requiring synchronization, then read(ByteBuffer, long) is the one method that you want to be atomic. The other three read methods are no good anyway.
[Max]: Why? This only applies if you're doing random access read on a file, and keeping the connection open after the method's needs, and you're sharing the FileChannel across various threads, and you're demanding clean reads at all times. Relatively speaking, it's rare that I've found myself in this particular set of circumstances.
It was pretty common among people in this thread, except maybe for the "demanding clean reads at all times". That one may be just me. And the idea of "sharing the FileChannel across multiple threads" was implicit in the API when they said "File channels are safe for use by multiple concurrent threads". If sharing FileChannels across threads is some rare exotic idea we're not supposed to consider, why did they bring it up?
And isn't "keeping the connection open after the method's needs" redundant here? We're sharing the FileChannel among threads. So yes, it would probably be a good idea if the channel wasn't closed at the end of each method invocation. That seems implicit in the whole "sharing a FileChannel" idea that you seem to dislike so much.

[Max]: However, in fairness, I think they (the Sun guys) did a pretty good job of implementing and documenting FileChannels.
Well, I haven't found any reason to complain about implementation. It's the documentation that leaves something to be desired, IMO. Certain parts at least. I'm not saying it's all bad, by any means. Maybe I've been spoiled by a lot of the improvements in Java over the years; I expect very high quality nowadays.

[Max]: As I understood it, you're locking on the recno, then you're locking on the FileChannel monitor, correct?
Regardless of the size of the second locking block, you're still nesting locks: one for the record, and another for the FileChannel. This is what makes me tremble

Yes, I'm nesting locks (currently); I just make sure you never call an outer-level lock from within an inner lock. Are you suggesting that deadlock is still possible under those circumstances? Or do you tremble at what junior programmers might do to the code later without understanding? I can sympathize with the latter. If it's the former, could you elaborate on how deadlock is possible?
(Though to be fair, this part of the conversation is heading back towards SCJD, or Threads, so we may spawn yet another new topic if this is an involved issue in its own right.)
I'll also note that, given the fact that I sync on a Record object first, I don't really need the sync on the FileChannel as well, since I've already prevented concurrent access to the same record, and I'm not using implicit-position methods. So I may well do away with these nested locks as unnecessary. But they don't cause me to tremble, no. As you say elsewhere - that may simply be because I'm too stupid to be as afraid as I should be.
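The lock ordering described above (record lock outside, FileChannel lock inside, and nothing acquired from within the inner block) can be sketched like this. It is an illustration under assumptions, not the assignment code: records are shrunk to single-byte counters, and the class and method names are hypothetical. Since every thread takes the two locks in the same order and the inner block never reaches back out for a record lock, there is no cycle for a deadlock to form on.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class NestedLockOrder implements AutoCloseable {
    private final Object[] recordLocks; // outer locks, one per record
    private final FileChannel channel;  // inner lock, always taken second

    NestedLockOrder(Path file, int recordCount) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE);
        recordLocks = new Object[recordCount];
        for (int i = 0; i < recordCount; i++) {
            recordLocks[i] = new Object();
        }
    }

    private byte readByte(int recNo) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(1);
        synchronized (channel) { // inner block: raw I/O only, acquires nothing else
            while (buf.hasRemaining()) {
                if (channel.read(buf, recNo) < 0) {
                    throw new EOFException("no record " + recNo);
                }
            }
        }
        return buf.get(0);
    }

    private void writeByte(int recNo, byte value) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {value});
        synchronized (channel) {
            while (buf.hasRemaining()) {
                channel.write(buf, recNo);
            }
        }
    }

    /** Read-modify-write of one record, held under that record's outer lock. */
    void increment(int recNo) throws IOException {
        synchronized (recordLocks[recNo]) {
            writeByte(recNo, (byte) (readByte(recNo) + 1));
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    /** Four threads each increment both records 25 times. */
    static String demo() {
        try {
            Path file = Files.createTempFile("nested", ".db");
            Files.write(file, new byte[2]);
            try (NestedLockOrder db = new NestedLockOrder(file, 2)) {
                Runnable task = () -> {
                    try {
                        for (int i = 0; i < 25; i++) {
                            db.increment(0);
                            db.increment(1);
                        }
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                };
                Thread[] workers = new Thread[4];
                for (int t = 0; t < workers.length; t++) {
                    workers[t] = new Thread(task);
                    workers[t].start();
                }
                for (Thread w : workers) {
                    w.join();
                }
                return db.readByte(0) + " " + db.readByte(1);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "100 100"
    }
}
```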
[ July 27, 2003: Message edited by: Jim Yingst ]
 
Jim Yingst
Wanderer
Posts: 18671
Talking about partial reads and writes:
[Jim]: Taking the latter part first: I agree that it's possible read() isn't really susceptible to this problem at all. I haven't been able to demonstrate it in various tests I've run. But (a) I'm only testing on Windows 2000 Pro and XP Home at the moment, and (b) testing aside, the spec is still too vague.
[Max]: Do you mean the spec on the read(ByteBuffer, long), or other parts of the spec?

At this point, I was talking just about the spec for read(ByteBuffer, long). Though I do have gripes about other parts of the spec as detailed elsewhere...
[Max]: If the former, I'd have to agree by definition, since an experienced programmer like yourself finds it vague. However, I think I find it less vague than you do. That may simply be because I'm too stupid to be as afraid as I should be.
Well, I wouldn't have said "stupid". Overconfident maybe. And if it's a platform-specific thing, then maybe it's not usually an issue for server-side development. If you're working on a specific server on a specific platform, and you test the heck out of your solution and don't run into any problems with partial reads, then great - it's probably not a problem on that platform. Or it's rare enough that it doesn't matter for your purposes.

[Max]: Um, I don't think you read that last part of that, which says.
a file channel cannot read any more bytes than remain in the file. It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.

'Course I read it. (Whether I understood it as it was intended is another matter.) There are several problems I see with it. First, your quote omits the first part of the paragraph, which puts things in context a bit more:
[spec]: (1) A read operation might not fill the buffer, and in fact it might not read any bytes at all. (2) Whether or not it does so depends upon the nature and state of the channel. (3) A socket channel in non-blocking mode, for example, cannot read any more bytes than are immediately available from the socket's input buffer; similarly, a file channel cannot read any more bytes than remain in the file. (4) It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.
I inserted the numbers in front of each sentence. When we look at the full quote, it's clearer that the last part of sentence (3) is not so strongly related to sentence (4). They're giving a number of examples of specific guarantees made by different types of channels; one of those is that a FileChannel can't read more bytes than are available in a file. Which is common sense really, and doesn't tell us anything about the minimum number of bytes read. OK, on to sentence (4), which is not necessarily about FileChannel specifically, but is talking about channels in general. We have:
It is guaranteed, however, that
Sounds good so far; this is the sort of thing I'm looking for.
if a channel is in blocking mode
Eh? Blocking mode? This concept has been defined for SelectableChannels, which have nice isBlocking() and configureBlocking() methods to deal with this property. Unfortunately FileChannel isn't a SelectableChannel, so that's not relevant. The FileChannel docs do talk about certain methods blocking - except for the ones that don't. :roll: Maybe we're supposed to magically assume that a FileChannel is in "blocking mode" based on this, but I don't really think so. If they wanted to make any guarantees about whether FileChannels block or not, they could have done so (e.g. by making it extend SelectableChannel), but they didn't. I think they intentionally did not do this, probably because some existing native implementations do not block (just a guess, but seems reasonable) and they didn't want to make guarantees they couldn't back up. The issue of blocking generally isn't as important for a FileChannel as it is for a SocketChannel, as delays suffered waiting for a hard drive to read a file are much less significant than delays suffered while waiting for, say, a live user who's typing data to the other end of a socket. We really need nonblocking mode for some socket applications; for FileChannels it's just "nice to have". But since this rule only talks about channels in blocking mode, it's already irrelevant to FileChannel. Continuing anyway though for the sake of argument, we have...
and there is at least one byte remaining in the buffer
OK, obviously we need space in the buffer to do a proper read. Let's say there are n bytes remaining in the file, and we've got a buffer with space for r bytes, where r >= n. Cool, that means we definitely have space for all the bytes we want...
then this method will block until at least one byte is read.
Um... OK. At least one byte will be read. Great. That gives us a lower limit. We also have two upper limits, r and n, and if r >= n we can say that no more than r bytes will be read. Great. So? Where is there any guarantee that r bytes actually are read?
Compare this to the API for WritableByteChannel's write(ByteBuffer) method, which says ever-so-clearly: "Unless otherwise specified, a write operation will return only after writing all of the r requested bytes." That is a useful guarantee, demonstrating that the java.nio folks do know how to write them if they feel like it. It's instructive to compare the text of the ReadableByteChannel and WritableByteChannel APIs. Many of the same sorts of statements are made, with parallel structures. It seems evident that either one was used as the basis for the other, or the two texts were developed concurrently. Either way, at least one Sun engineer seems to have looked at the text of both, and decided that it was appropriate for WritableByteChannel to have an explicit guarantee against partial writes (unless otherwise stated by a subclass) but not for ReadableByteChannel to have such a guarantee against partial reads. Unlike my gripes about vagueness in some other parts of the docs (blockage of FileChannel methods for example) this one feels to me like it's intentional - like they really meant to not give a guarantee against partial reads, perhaps because they knew it was possible on some platforms and they didn't have a good way to prevent it. If so, I wish they'd said it a bit more explicitly, rather than simply omitting a definitive statement about FileChannels. Clearly, it's too easy for people to misinterpret what they meant.
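For anyone who wants to be safe regardless of how this argument turns out, here is a minimal sketch of a defensive read loop around read(ByteBuffer, long). It relies only on the "at least one byte" wording discussed above. The class name, the readFully helper, the 8-byte record layout and the temp file are all made up for illustration, and it uses the later java.nio.file API purely for brevity.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class PositionalReadFully {

    // Hypothetical helper: keeps calling the explicit-position read until the
    // buffer is full, since the documentation only promises "at least one byte".
    static void readFully(FileChannel channel, ByteBuffer dst, long position) throws IOException {
        long pos = position;
        while (dst.hasRemaining()) {
            int n = channel.read(dst, pos);
            if (n < 0) {
                throw new EOFException("end of file with " + dst.remaining() + " bytes still wanted");
            }
            pos += n; // advance past whatever this call managed to read
        }
    }

    // Writes three 8-byte "records" to a temp file and reads the second one back.
    static String demo() throws IOException {
        Path file = Files.createTempFile("records", ".db");
        try {
            Files.write(file, "AAAAAAAABBBBBBBBCCCCCCCC".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                ByteBuffer record = ByteBuffer.allocate(8);
                readFully(channel, record, 8L);
                record.flip();
                return StandardCharsets.US_ASCII.decode(record).toString();
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

If the read never comes back short on a given platform, the loop simply runs once, so the extra code costs next to nothing.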
[ July 27, 2003: Message edited by: Jim Yingst ]
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Jim Yingst:
[Jim]: It seems that if Sun really wanted to make a method that just does a read ASAP, they did a poor job of it.
[Max]: I'm not sure how you're drawing this conclusion. To say that it's not well documented is one thing: to say that it's badly implemented would require intimate knowledge of the code, which I lack.

Well I'm assuming that the implementation does obey the existing documentation. The documentation they provide needlessly prevents mutually concurrent reads, but allows reads concurrent with writes.


It's not needless, and that's the point. If a read didn't block, then you could have the possibility that you would get an inconsistent read, in the same sense that updating a long (without synchronization) can lead to inconsistent reads.


Why one but not the other, if they were trying to make a fast read? Dunno what the implementation does here - if it obeys the doc, then it's the doc's fault the implementation is not as good as it could've been. Or if it ignores the docs here - well, I kinda approve in this case, because I think these particular docs need to be rewritten anyway. <smile> But in principle this is Wrong™; an implementation should obey its public API once that API has been officially released. As long as the API is at least somewhat usable - gotta live with the warts. <frown>

[Max]: 7/8 of the methods are safe. The last is at your own risk.
[Jim]: Maybe. And, if we're trying to write code to access a FileChannel from multiple threads, then 6/8 of those methods are pretty much useless anyway. (Unless you synchronize in which case you provide your own "thread safety".) Because all the implicit-position methods suffer from the fact that they have no control over whether another thread may interrupt in between setting the position() and doing the read() or write(). As shown in my second post above. (Not counting the intrusion into S Bala's post.)


Again, I think your post missed a major point, which is that reads and writes are atomic, and thus threadsafe, in exactly the same way that Vectors are threadsafe: no less so, and no more so. That is, the relevant operations (read/ write) are atomic.


[Max]: I think you mean slice in, not interrupt, unless I'm not following you @ all, which is possible.

Slice in is what I meant. Good catch.
[Max]: As for thread safety, I think it depends on your needs. Remember, most of Java isn't thread safe: Java assumes that you don't want thread safety unless you explicitly ask for it.
Right - and if they'd just not mentioned thread safety at all in the FileChannel API, I'd be perfectly happy, because I'd know that meant I had to take care of it myself if I wanted it.


But you would be misinformed: again, because reads and writes are atomic (excepting the one read method), you don't need to synchronize during the actual operation. That covers, for example, the case where you just want to open a FileChannel and read the entire content of the file.


What annoys me is when they claim to offer thread safety, and then leave these poorly-documented holes lying around.
[Max]: FileChannels are thread safe enough, in that you don't get data in an inconsistent state when you read or write.
The holes I've been discussing certainly do allow inconsistent data to be read. At least, as far as guarantees offered. If you read a record at the same time as you write it, without explicit synchronization, there's no guarantee that you won't read a mixture of new and old data. Maybe this will never actually happen - but I have no way of knowing that from the API.


Can you contrive an example, but using waits? I'd like to see it. I'm not sure it's possible, without exceptional conditions.


[Max]: If you're sharing an Object across several threads and you need thread safety, then you have to provide it yourself. If you don't need that safety, then you don't provide it.
The important point with FileChannels is that reads and writes are atomic: thus, you wet call integrity between invocations to these methods. I thought that was your concern?

Umm, "wet call"? Is that a typo, or just a term I haven't heard before?


typo. My Editor's on break


The problems I've been citing are directly involved with individual reads and writes, being performed by different threads on the same FileObject. That's been my concern all along. And one of my points has been that explicit-position reads and writes are not atomic - not when you mix reads and writes at least. And implicit-position methods may be atomic, but so what, if you can't supply the position in the same atomic method call, the atomicity doesn't extend far enough to really be useful. You get to atomically read or write to/from an unknown position. Whee, sounds like fun. Into the unknown we go... <rolleyes>


I think we're on different pages here. Yes, if you're sharing the same object across multiple threads, then you need to synchronize it. However, if an object is defined as 'threadsafe', then you don't have to worry that it will be mucked with by one thread while another has called a threadsafe method on it. This is what it means to have a ThreadSafe object, and this is all it means.


[Jim]: I suppose that there might be some ways to use the implicit-position methods in a multi-thread context without synchronization, if we never use position() at all, and we have multiple threads doing reads on "whatever is at the current position" and we don't care which thread reads what. There are some applications I can imagine where this might be useful. So OK, replace "pretty much useless" above with "useless for random access to a file".
[Max]: Make it 'useless if you require 100% clean reads', and it will be a fair statement. Otherwise, the implication is that only clean reads are useful, which is obviously (IMO), false.

Talking about the problems with implicit-position methods, the problem is for both reads and writes, because at the instant you invoke write() you don't know if some other thread has just repositioned the channel at the start of a completely different record. (Unless, again, you either synchronize or don't share FileChannel instances, which makes the talk of FileChannel's "thread safety" irrelevant - you provide it yourself.)
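To make the implicit-versus-explicit position point concrete, here is a rough sketch of the two ways several threads can share one FileChannel for random-access reads. The class name, the 16-byte record size and both helper methods are assumptions for illustration only. The first helper wraps position() plus read() in a lock that every caller must agree to use; the second passes the offset with the call and never touches the channel's shared position.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicBoolean;

final class SharedChannelReads {
    static final int RECORD_SIZE = 16; // assumed fixed record length

    // Implicit position: position() and read() are separate calls, so another
    // thread could slice in between them unless both sit inside one lock.
    static byte[] readWithLock(FileChannel channel, int recNo) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_SIZE);
        synchronized (channel) {
            channel.position((long) recNo * RECORD_SIZE);
            while (buf.hasRemaining()) {
                if (channel.read(buf) < 0) break; // unexpected end of file
            }
        }
        return buf.array();
    }

    // Explicit position: the offset travels with each call, so no shared state is moved.
    static byte[] readAt(FileChannel channel, int recNo) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_SIZE);
        long start = (long) recNo * RECORD_SIZE;
        while (buf.hasRemaining()) {
            if (channel.read(buf, start + buf.position()) < 0) break;
        }
        return buf.array();
    }

    static boolean holds(byte[] record, int recNo) {
        for (byte b : record) if (b != (byte) recNo) return false;
        return true;
    }

    // Record i is filled with byte value i; eight threads each re-read "their" record.
    static boolean demo() throws IOException, InterruptedException {
        int records = 8;
        Path file = Files.createTempFile("shared", ".db");
        AtomicBoolean ok = new AtomicBoolean(true);
        try {
            byte[] all = new byte[records * RECORD_SIZE];
            for (int i = 0; i < all.length; i++) all[i] = (byte) (i / RECORD_SIZE);
            Files.write(file, all);
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                Thread[] threads = new Thread[records];
                for (int t = 0; t < records; t++) {
                    final int recNo = t;
                    threads[t] = new Thread(() -> {
                        try {
                            for (int i = 0; i < 200; i++) {
                                if (!holds(readWithLock(channel, recNo), recNo)
                                        || !holds(readAt(channel, recNo), recNo)) {
                                    ok.set(false);
                                }
                            }
                        } catch (IOException e) {
                            ok.set(false);
                        }
                    });
                    threads[t].start();
                }
                for (Thread th : threads) th.join();
            }
            return ok.get();
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("every read returned the expected record: " + demo());
    }
}
```

Note that readWithLock is only safe if every implicit-position caller uses the same lock, which is exactly the "you provide it yourself" point made above.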


Right, you can certainly find ways to have threadsafe access without requiring the sort of explicit synchronization you're talking about. Using a private FileChannel in a JIT paradigm is one good way to do that.


I acknowledge that occasional dirty reads may well be perfectly acceptable. (As long as you're aware of the possibility and don't ever use a potentially dirty read as the basis for an update.) Dirty writes would be a big problem however, and they're a distinct possibility if you do unsynchronized writes using implicit-position methods while other threads may be changing the position.


Yes. If you're sharing an object across multiple threads, then you need to synchronize access to it. This is true with any and all objects, including other 'safe for threading' objects like Vectors. Remember, the alternative (that is, the thing you seem to be asking for), is basically a singleton FileChannel that's shared by every thread at all times: that's the only way to get the behavior you seem to be looking for. I don't think that's really what you want?


[Jim]: If you want to do a random-access read on a file without requiring synchronization, then read(ByteBuffer, long) is the one method that you want to be atomic. The other three read methods are no good anyway.
[Max]: Why? This only applies if you're doing random access read on a file, and keeping the connection open after the method's needs, and you're sharing the FileChannel across various threads, and you're demanding clean reads at all times. Relatively speaking, it's rare that I've found myself in this particular set of circumstances.
It was pretty common among people in this thread, except maybe for the "demanding clean reads at all times". That one may be just me. <smile>


It usually is


And the idea of "sharing the FileChannel across multiple threads" was implicit in the API when they said "File channels are safe for use by multiple concurrent threads". If sharing FileChannels across threads is some rare exotic idea we're not supposed to consider, why did they bring it up? <confused>


Again, I think there's some murky water here. The Java API, when it refers to an Object that's 'threadsafe', like Vectors, etc., only means that method calls on that object, unless otherwise specified, behave atomically. It does not mean that the object can be used across multiple threads with impunity. That would be silly.


And isn't "keeping the connection open after the method's needs" redundant here? We're sharing the FileChannel among threads. So yes, it would probably be a good idea if the channel wasn't closed at the end of each method invocation. That seems implicit in the whole "sharing a FileChannel" idea that you seem to dislike so much.


I dislike it because in the context of the assignment, it offers no advantage that I can see (neither in speed nor memory usage), and introduces the sort of hornet's nest of coding pickles you've found yourself in.


[Max]: However, in fairness, I think they(the Sun guys) did a pretty good job of implementing and documenting FileChannels.
Well, I haven't found any reason to complain about implementation. It's the documentation that leaves something to be desired, IMO. Certain parts at least. I'm not saying it's all bad, by any means. Maybe I've been spoiled by a lot of the improvements in Java over the years; I expect very high quality nowadays.


True, I have to say, I really like the language.


[Max]: As I understood it, you're locking on the recno, then you're locking on the FileChannel monitor, correct?
Regardless of the size of the second locking block, you're still nesting locks: one for the record, and another for the FileChannel. This is what makes me tremble

Yes, I'm nesting locks (currently); I just make sure you never call an outer-level lock from within an inner lock. Are you suggesting that deadlock is still possible under those circumstances? Or do you tremble at what junior programmers might do to the code later without understanding?


Well, my opinion is that your design choice of sharing a FileChannel across multiple threads offers no tangible advantage, and forces you into positions where you have to nest locks.
I'm going to ramble a bit, so you may want to skip this part.
1. There's a story about a famous chess player (can't remember his name) being asked, before an extremely important game, "How many moves ahead do you think?" His reply was "10 moves". His opponent was asked the same question, and his reply was "1". The opponent's point was that, if you really listen to what the board is telling you, there's only one right move at any given position. If you make the wrong move, the board will tell you that quickly enough, because you'll find yourself in awkward positions, just trying to survive. I'll leave it to you to decide who won.
2. In Boxing, there's the idea that you never, ever, ever, want to drop your guard, or spread your legs 'too wide'. Now, because I wasn't a talented Boxer, I always, always, always, always kept my guard up, and I never spread my feet too wide. There were a lot of talented boxers @ my gym who disregarded this advice, and would drop their guard to rest a bit when there was no possible way for their opponent to reach them. Perhaps they were more talented than I, and simply didn't need to adhere to canon the way that I did. I can't say.



I can sympathize with the latter. <smile> If it's the former, could you elaborate on how deadlock is possible?
(Though to be fair, this part of the conversation is heading back towards SCJD, or Threads, so we may spawn yet another new topic if this is an involved issue in its own right.)
I'll also note that, given the fact that I sync on a Record object first, I don't really need the sync on the FileChannel as well, since I've already prevented concurrent access to the same record, and I'm not using implicit-position methods. So I may well do away with these nested locks as unnecessary. But they don't cause me to tremble, no. As you say elsewhere - that may simply be because I'm too stupid to be as afraid as I should be.
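Roughly what that simplification might look like, as a sketch only: one monitor per record number, explicit-position reads and writes inside it, and no second lock on the FileChannel. The RecordStore class, the 32-byte record size and the lock array are assumptions for illustration, not the actual assignment code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

final class RecordStore implements AutoCloseable {
    private static final int RECORD_SIZE = 32; // assumed fixed record length
    private final FileChannel channel;
    private final Object[] locks;              // one monitor per record number

    RecordStore(Path file, int recordCount) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE);
        locks = new Object[recordCount];
        for (int i = 0; i < recordCount; i++) locks[i] = new Object();
    }

    // Only the record lock is taken; the offset is passed explicitly on each call.
    void write(int recNo, byte[] data) throws IOException {
        ByteBuffer src = ByteBuffer.wrap(data, 0, RECORD_SIZE);
        long base = (long) recNo * RECORD_SIZE;
        synchronized (locks[recNo]) {
            while (src.hasRemaining()) channel.write(src, base + src.position());
        }
    }

    byte[] read(int recNo) throws IOException {
        ByteBuffer dst = ByteBuffer.allocate(RECORD_SIZE);
        long base = (long) recNo * RECORD_SIZE;
        synchronized (locks[recNo]) {
            while (dst.hasRemaining()) {
                if (channel.read(dst, base + dst.position()) < 0) break; // past end of file
            }
        }
        return dst.array();
    }

    @Override public void close() throws IOException { channel.close(); }

    // A writer flips record 1 between all 'A' and all 'B' while this thread re-reads it.
    static boolean demo() throws Exception {
        Path file = Files.createTempFile("recordstore", ".db");
        try (RecordStore store = new RecordStore(file, 4)) {
            byte[] a = new byte[RECORD_SIZE];
            byte[] b = new byte[RECORD_SIZE];
            Arrays.fill(a, (byte) 'A');
            Arrays.fill(b, (byte) 'B');
            store.write(1, a);
            Thread writer = new Thread(() -> {
                try {
                    for (int i = 0; i < 500; i++) store.write(1, i % 2 == 0 ? b : a);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            writer.start();
            boolean consistent = true;
            for (int i = 0; i < 500; i++) {
                byte[] rec = store.read(1);
                for (byte x : rec) if (x != rec[0]) consistent = false; // old/new mixture
            }
            writer.join();
            return consistent;
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("no mixed records seen: " + demo());
    }
}
```

Since no code path ever holds two monitors at once, there is no lock ordering to get wrong. The price is that nothing protects an operation that spans several records.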
[ July 27, 2003: Message edited by: Jim Yingst ]


You? Never. Maybe too talented
All best,
M
[ July 29, 2003: Message edited by: Max Habibi ]
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Jim Yingst:
Talking about partial reads and writes:
[Jim]: Taking the latter part first: I agree that it's possible read() isn't really susceptible to this problem at all. I haven't been able to demonstrate it in various tests I've run. But (a) I'm only testing on Windows 2000 Pro and XP Home at the moment, and (b) testing aside, the spec is still too vague.
[Max]: Do you mean the spec on the read(ByteBuffer, long), or other parts of the spec?

At this point, I was talking just about the spec for read(ByteBuffer, long). Though I do have gripes about other parts of the spec as detailed elsewhere... <smile>


Why am I not surprised


[Max]: If the former, I'd have to agree by definition, since an experienced programmer like yourself finds it vague. However, I think I find it less vague than you do. That may simply be because I'm too stupid to be as afraid as I should be
Well, I wouldn't have said "stupid". Overconfident maybe. And if it's a platform-specific thing, then maybe it's not usually an issue for server-side development. If you're working on a specific server on a specific platform, and you test the heck out of your solution and don't run into any problems with partial reads, then great - it's probably not a problem on that platform. Or it's rare enough that it doesn't matter for your purposes.


I think the difference is that I'm taking the API more literally than you are. You seem to be reading into it a bit, IMO.


[Max]: Um, I don't think you read that last part of that, which says.
a file channel cannot read any more bytes than remain in the file. It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.

'Course I read it. (Whether I understood it as it was intended is another matter.) There are several problems I see with it. First, your quote omits the first part of the paragraph, which puts things in context a bit more:
[spec]: (1) A read operation might not fill the buffer, and in fact it might not read any bytes at all. (2) Whether or not it does so depends upon the nature and state of the channel. (3) A socket channel in non-blocking mode, for example, cannot read any more bytes than are immediately available from the socket's input buffer; similarly, a file channel cannot read any more bytes than remain in the file. (4) It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.
I inserted the numbers in front of each sentence. When we look at the full quote, it's clearer that the last part of sentence (3) is not so strongly related to sentence (4). They're giving a number of examples of specific guarantees made by different types of channels; one of those is that a FileChannel can't read more bytes than are available in a file. Which is common sense really, and doesn't tell us anything about the minimum number of bytes read. OK, on to sentence (4), which is not necessarily about FileChannel specifically, but is talking about channels in general. We have:
It is guaranteed, however, that
Sounds good so far; this is the sort of thing I'm looking for.
if a channel is in blocking mode
Eh? Blocking mode?


ah, this may be part of the problem: I may be aware of a definition that you are not. All Channels are blocking, unless otherwise stated. Thus, by definition, a FileChannel is blocking, because it does not state otherwise.


This concept has been defined for SelectableChannels, which have nice isBlocking() and configureBlocking() methods to deal with this property. Unfortunately FileChannel isn't a SelectableChannel, so that's not relevant. The FileChannel docs do talk about certain methods blocking - except for the ones that don't. <rolleyes> Maybe we're supposed to magically assume that a FileChannel is in "blocking mode" based on this, but I don't really think so.


Again, a FileChannel is blocking, by definition. However, your objections are reasonable, given that you apparently were not aware of this.


If they wanted to make any guarantees about whether FileChannels block or not, they could have done so (e.g. by making it implement SelectableChannel), but they didn't. I think they intentionally did not do this, probably because some existing native implementations do not block (just a guess, but seems reasonable) and they didn't want to make guarantees they couldn't back up. The issue of blocking generally isn't as important for a FileChannel as it is for a SocketChannel, as delays suffered waiting for a hard drive to read a file are much less significant than delays suffered while waiting for, say, a live user who's typing data to the other end of a socket. We really need nonblocking mode for some socket applications; for FileChannels it's just "nice to have". But since this rule only talks about channels in blocking mode, it's already irrelevant to FileChannel. Continuing anyway though for the sake of argument, we have...
and there is at least one byte remaining in the buffer
OK, obviously we need space in the buffer to do a proper read. Let's say there are n bytes remaining in the file, and we've got a buffer with space for r bytes, where r >= n. Cool, that means we definitely have space for all the bytes we want...
then this method will block until at least one byte is read.
Um... OK. At least one byte will be read. Great. That gives us a lower limit. We also have two upper limits, r and n, and if r >= n we can say that no more than r bytes will be read. Great. So? Where is there any guarantee that r bytes actually are read? <confused>


I think your deconstruction is causing some of the confusion. Read it as a whole.
It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.

What they're saying is "even if there's only one byte left, until your thread reads that one byte, it will not return". Your concern is out of place; that is, your n is 1. If there is at least one byte then this method will block until at least one byte is read. It's a strict if-then relationship here. If there had been Q bytes, then the method would have blocked until at least Q bytes were read in.


Compare this to the API for WritableByteChannel's write(ByteBuffer) method, which says ever-so-clearly: "Unless otherwise specified, a write operation will return only after writing all of the r requested bytes." That is a useful guarantee, demonstrating that the java.nio folks do know how to write them if they feel like it. It's instructive to compare the text of the ReadableByteChannel and WritableByteChannel APIs. Many of the same sorts of statements are made, with parallel structures. It seems evident that either one was used as the basis for the other, or the two texts were developed concurrently. Either way, at least one Sun engineer seems to have looked at the text of both, and decided that it was appropriate for WritableByteChannel to have an explicit guarantee against partial writes (unless otherwise stated by a subclass) but not for ReadableByteChannel to have such a guarantee against partial reads. Unlike my gripes about vagueness in some other parts of the docs (blockage of FileChannel methods for example) this one feels to me like it's intentional - like they really meant to not give a guarantee against partial reads, perhaps because they knew it was possible on some platforms and they didn't have a good way to prevent it. If so, I wish they'd said it a bit more explicitly, rather than simply omitting a definitive statement about FileChannels. Clearly, it's too easy for people to misinterpret what they meant.
[ July 27, 2003: Message edited by: Jim Yingst ]


Writing is a different kettle of cats than reading. With writing, there's no question that you have x bytes to write, and by God, you'll write them. Reading requires more subtlety. If you want to read x + y bytes, but there are only x bytes available, then a decision must be made. Or if you want to read x bytes, but there are x + y to be read, then a different outcome occurs. So, of course, it makes sense that the two APIs would have distinct documentation: they work differently, and one case is more cut and dried than the other.
I suspect that in light of the assertion that FileChannels are by definition blocking, your objections by and large are addressed?
M
 
Jim Yingst
Wanderer
Posts: 18671
Hola, Max! We meet again.
[Max]: I think the difference is that I'm taking the API more literally than you are. You seem to be reading into it a bit, IMO.
<cough> I was thinking rather the opposite...
ah, this may be part of the problem: I may be aware of a definition that you are not. All Channels are blocking, unless otherwise stated. Thus, by definition, a FileChannel is blocking, because it does not state otherwise.
Well it's certainly true I'm not aware of that. Got a reference? The closest I can find is that the close() method of any Channel will block. But that's not the same as being a blocking Channel. For one thing, a nonblocking SelectableChannel will still have a close() that blocks. So close() has nothing to do with whether a Channel as a whole is considered "blocking" or not. Is there some other documentation that's relevant?
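For what it's worth, the type relationship itself is easy to check. This small sketch (the class and method names are mine) just asks whether a channel type is a SelectableChannel, which is the only place isBlocking() and configureBlocking() are declared. It says nothing about whether FileChannel reads block in practice, only that the API gives FileChannel no "blocking mode" property to query.

```java
import java.nio.channels.FileChannel;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SocketChannel;

final class BlockingModeCheck {

    // True when the channel type inherits isBlocking()/configureBlocking()
    // from SelectableChannel, i.e. when "blocking mode" is a defined property.
    static boolean hasBlockingMode(Class<?> channelType) {
        return SelectableChannel.class.isAssignableFrom(channelType);
    }

    public static void main(String[] args) {
        System.out.println("FileChannel has a blocking mode:   " + hasBlockingMode(FileChannel.class));
        System.out.println("SocketChannel has a blocking mode: " + hasBlockingMode(SocketChannel.class));
    }
}
```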

[Max]: I think your deconstruction is causing some of the confusion.
Amusing considering I first reconstructed a fuller version of the quote to repair your own deconstruction. But OK, we'll look again...
[spec]: It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.
[Max]: What they're saying is "even if there's only one byte left, until your thread reads that one byte, it will not return".
Well you added the "even if" yourself, and it (vaguely) implies more than was actually said here. But until the thread reads one byte, it won't return. Right.
If there is at least one byte then this method will block until at least one byte is read. It's a strict if-then relationship here.
Fine, OK.
If there had been Q bytes, then the method would have blocked until at least Q bytes were read
No, it simply does not say that. You may want to replace "at least one" with "Q", but it's not justified. Rewording it does not make it say something it didn't. Your rewordings may help me understand what you're thinking, but they don't change what the API says. It would have been very easy (in terms of choosing different words, not necessarily implementing in code) for Sun's engineers to make the guarantee you claim they did, but they didn't. They made such a guarantee for write(), they did not for read().
Writing is a different kettle of cats than reading. With writing, there's no question that you have x bytes to write, and by God, you'll write them. Reading requires more subtlety. If you want to read x + y bytes, but there are only x bytes available, then a decision must be made. Or if you want to read x bytes, but there are x + y to be read, then a different outcome occurs. So, of course, it makes sense that the two APIs would have distinct documentation: they work differently, and one case is more cut and dried than the other.
All right, I acknowledge that the read version would have to be a bit more complex. Not much though. The write() version was:
"Unless otherwise specified, a write operation will return only after writing all of the r requested bytes."
A comparable read() version to give the guarantee we both wish were there, would be:
In ReadableByteChannel:
"Unless otherwise specified, a read operation will return only after reading all of the r requested bytes."
In FileChannel:
"A read operation will return only after reading all of the r requested bytes, unless fewer than r bytes remain in the channel. In the latter case the read will block until all remaining bytes in the channel have been read."
It wouldn't have been that difficult to say. No harder than what they did say about "at least one byte".
Incidentally, consider this quote from InputStream's read(byte[]) method:
"If len is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at end of file, the value -1 is returned; otherwise, at least one byte is read and stored into b."
Would you argue that this language implies that if Q bytes remain in the stream prior to end of file, and Q <= len, then Q bytes are stored into b? This seems like the same sort of situation as we've been discussing for FileChannel. Except that it's well-known (or if not, easily verifiable) that InputStreams, including FileInputStreams, do not necessarily fill a buffer with all bytes that may occur prior to end-of-stream. (Do we agree on that? I've definitely observed incomplete reads from FileInputStreams, though not (yet) from FileChannel.) How does the FileChannel API somehow present a more convincing guarantee? I agree that in practice, for the limited platforms I've been able to test on, FileChannel seems much more immune to partial reads than FileInputStream is. And maybe it's always immune. But the API makes no such guarantee.
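To illustrate the InputStream side of that comparison, here is a deliberately stingy stream, invented for this post, that hands back exactly one byte per read(byte[], int, int) call. As far as I can tell it satisfies the quoted contract to the letter ("at least one byte is read and stored into b"), yet it never fills the buffer in a single call, so the caller has to loop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stream that returns one byte per bulk read, which the
// "at least one byte" wording of the InputStream contract permits.
final class TrickleInputStream extends InputStream {
    private final byte[] data;
    private int next = 0;

    TrickleInputStream(byte[] data) { this.data = data; }

    @Override public int read() {
        return next < data.length ? (data[next++] & 0xFF) : -1;
    }

    @Override public int read(byte[] b, int off, int len) {
        if (len == 0) return 0;              // nothing requested
        if (next >= data.length) return -1;  // end of stream
        b[off] = data[next++];               // store exactly one byte
        return 1;
    }

    // Returns { bytes delivered by the first call, calls needed to fill the buffer }.
    static int[] demo() throws IOException {
        TrickleInputStream in = new TrickleInputStream("hello".getBytes(StandardCharsets.US_ASCII));
        byte[] b = new byte[5];
        int first = in.read(b);              // inherited read(byte[]) delegates to read(b, 0, 5)
        int filled = first;
        int calls = 1;
        while (filled < b.length) {
            int n = in.read(b, filled, b.length - filled);
            if (n < 0) break;
            filled += n;
            calls++;
        }
        return new int[] { first, calls };
    }

    public static void main(String[] args) throws IOException {
        int[] r = demo();
        System.out.println("first call read " + r[0] + " byte(s); buffer full after " + r[1] + " calls");
    }
}
```

Nothing in the ReadableByteChannel wording quoted earlier seems to rule out a channel that behaves the same way, which is the whole of my objection.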
[Max]: I suspect that in light of the assertion that FileChannels are by definition blocking, your objections by and large are addressed?
Not yet, because (a) I haven't seen any documentation to back up your assertion, and (b) my latter objection about "at least one byte" is completely independent of that argument.
Cheers...
[ July 30, 2003: Message edited by: Jim Yingst ]
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Jim Yingst:
Hola, Max! We meet again.


Ahoy Jim


[Max]: I think the difference is that I'm taking the API more literally than you are. You seem to be reading into it a bit, IMO.
<cough>


lol


I was thinking rather the opposite...

ah, this may be part of the problem: I may be aware of a definition that you are not. All Channels are blocking, unless otherwise stated. Thus, by definition, a FileChannel is blocking, because it does not state otherwise.
Well it's certainly true I'm not aware of that. Got a reference?


I'll look. I ran across it somewhere in the Sun docs as I was doing my initial research. I'll be darned if I can find it now.


The closest I can find is that the close() method of any Channel will block. But that's not the same as being a blocking Channel. For one thing, a nonblocking SelectableChannel will still have a close() that blocks. So close() has nothing to do with whether a Channel as a whole is considered "blocking" or not. Is there some other documentation that's relevant?

[Max]: I think your deconstruction is causing some of the confusion.
Amusing considering I first reconstructed a fuller version of the quote to repair your own deconstruction. But OK, we'll look again...
[spec]: It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.
[Max]: What they're saying is "even if there's only one byte left, until your thread reads that one byte, it will not return".
Well you added the "even if" yourself, and it (vaguely) implies more than was actually said here. But until the thread reads one byte, it won't return. Right.


I added the whole thing: It's my phrase, not theirs.


If there is at least one byte then this method will block until at least one byte is read. It's a strict if-then relationship here.
Fine, OK.
If there had been Q bytes, then the method would have blocked until at least Q bytes were read
No, it simply does not say that. You may want to replace "at least one" with "Q", but it's not justified.


Of course it is. If I say.
"I can juggle at least seven cats ", then I'm saying "I can juggle one cat" OR "I can juggle two cats"...OR "I can juggle seven cats".
the last statement "I can juggle seven cats" is a direct logical corollary of "I can juggle at least seven cats".
Correspondingly, if there is one byte then this method will block until at least one byte is read
is direct logical corollary of
if there is at least one byte then this method will block until at least one byte is read
'At least' implies everything up to and including that point.


A comparable read() version to give the guarantee we both wish were there, would be:
In ReadableByteChannel:
"Unless otherwise specified, a read operation will return only after reading all of the r requested bytes."
In FileChannel:
"A read operation will return only after reading all of the r requested bytes, unless fewer than r bytes remain in the channel. In the latter case the read will block until all remaining bytes in the channel have been read."
It wouldn't have been that difficult to say. No harder than what they did say about "at least one byte".
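Whichever reading of the spec is right, the stronger guarantee worded above can be had in user code by looping. The following is only a sketch: the class name ChannelReadFully and the helper readFully are made up for illustration and are not part of java.nio. It uses the positional read(ByteBuffer, long), which the API documents as not modifying the channel's position, so it also avoids sharing a file position between threads.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChannelReadFully {

    /**
     * Hypothetical helper: keeps reading until dst is full or end-of-file
     * is reached, starting at the given file position.
     * Returns the total number of bytes placed in dst.
     */
    static int readFully(FileChannel channel, ByteBuffer dst, long position)
            throws IOException {
        int total = 0;
        while (dst.hasRemaining()) {
            // Positional read: does not move the channel's own position.
            int n = channel.read(dst, position + total);
            if (n < 0) {
                break; // end of file before the buffer was filled
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("readfully", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4, 5});
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                ByteBuffer buf = ByteBuffer.allocate(8); // larger than the file
                int n = readFully(ch, buf, 0);
                System.out.println("bytes read: " + n);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

With a loop like this the caller no longer depends on a single read() returning everything, so the disagreement over "at least one byte" stops mattering in practice.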


Well, we all like our own way of saying things best: I know I do. But in this case, I like Sun's too.


Incidentally, consider this quote from InputStream's read(byte[]) method:
"If len is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at end of file, the value -1 is returned; otherwise, at least one byte is read and stored into b."
Would you argue that this language implies that if Q bytes remain in the stream prior to end of file, and Q <= len, then Q bytes are stored into b? This seems like the same sort of situation as we've been discussing for FileChannel. Except that it's well-known (or if not, easily verifiable) that InputStreams, including FileInputStreams, do not necessarily fill a buffer with all bytes that may occur prior to end-of-stream. (Do we agree on that? I've definitely observed incomplete reads from FileInputStreams, though not (yet) from FileChannel.) How does the FileChannel API somehow present a more convincing guarantee? I agree that in practice, for the limited platforms I've been able to test on, FileChannel seems much more immune to partial reads than FileInputStream is. And maybe it's always immune. But the API makes no such guarantee.
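To make the InputStream point concrete, here is a small sketch. TrickleStream and PartialReadDemo are hypothetical names invented for this example; the stream deliberately returns at most a few bytes per call, which the "at least one byte" wording of the InputStream contract allows. It does not show what FileInputStream or FileChannel do on any particular platform, only that a conforming stream may return fewer bytes than remain, and that a loop copes with it.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

final class PartialReadDemo {

    /** Hypothetical stream that hands back at most `chunk` bytes per read call. */
    static final class TrickleStream extends FilterInputStream {
        private final int chunk;

        TrickleStream(InputStream in, int chunk) {
            super(in);
            this.chunk = chunk;
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Still reads "at least one byte" when data remains, but never more than chunk.
            return super.read(b, off, Math.min(len, chunk));
        }
    }

    /** Loops until b is full or end of stream; returns the number of bytes stored. */
    static int readFully(InputStream in, byte[] b) throws IOException {
        int total = 0;
        while (total < b.length) {
            int n = in.read(b, total, b.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new TrickleStream(new ByteArrayInputStream(new byte[10]), 3);
        byte[] buf = new byte[10];
        System.out.println("single read(): " + in.read(buf));
        System.out.println("readFully on the rest: " + readFully(in, new byte[7]));
    }
}
```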


<cough> I tend to think that it does, given the docos as interpreted above.


[Max]: I suspect that in light of the assertion that FileChannels are by definition blocking, your objections by and large are addressed?
Not yet, because (a) I haven't seen any documentation to back up your assertion, and (b) my latter objection about "at least one byte" is completely independent of that argument.
Cheers...
[ July 30, 2003: Message edited by: Jim Yingst ]


All best,
M
[ July 30, 2003: Message edited by: Max Habibi ]
 