FileChannel and thread safety

 
Wanderer
Posts: 18671
[Jim]: Well you added the "even if" yourself...
[Max]: I added the whole thing: It's my phrase, not theirs.

True; I was focusing on the one part that struck me as most significantly different from the original quote, trying to identify the point of disagreement in our interpretations.
[Max]: If I say, "I can juggle at least seven cats", then I'm saying "I can juggle one cat" OR "I can juggle two cats"... OR "I can juggle seven cats".
The last statement, "I can juggle seven cats", is a direct logical corollary of "I can juggle at least seven cats".

Yup. No problem.
[Max]: Correspondingly, if there is one byte then this method will block until at least one byte is read
is a direct logical corollary of
if there is at least one byte then this method will block until at least one byte is read
'At least' implies everything up to and including that point.

Also agreed. But in the API, "that point" is 1, not 7, or anything greater than 1. (That is, the values read may certainly exceed 1, but the number guaranteed does not.) There are two "at least"'s (how's that for punctuation?) in the API quote above; either or both may be greater than one, but they're never implied to necessarily be equal to each other. They just share the same minimum value, 1.
[Jim]: Incidentally, consider this quote from InputStream's read(byte[]) method:
"If len is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at end of file, the value -1 is returned; otherwise, at least one byte is read and stored into b."

[Note - I forgot to add "emphasis mine" the first time I quoted this.]
[Jim]: Would you argue that this language implies that if Q bytes remain in the stream prior to end of file, and Q <= len, then Q bytes are stored into b? This seems like the same sort of situation as we've been discussing for FileChannel. Except that it's well-known (or if not, easily verifiable) that InputStreams, including FileInputStreams, do not necessarily fill a buffer with all bytes that may occur prior to end-of-stream. (Do we agree on that? I've definitely observed incomplete reads from FileInputStreams, though not (yet) from FileChannel.) How does the FileChannel API somehow present a more convincing guarantee? I agree that in practice, for the limited platforms I've been able to test on, FileChannel seems much more immune to partial reads than FileInputStream is. And maybe it's always immune. But the API makes no such guarantee.
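To make the partial-read point concrete, here is a minimal sketch. TrickleStream is a hypothetical wrapper (not a JDK class) that deliberately returns at most one byte per call, which the InputStream contract permits; readFully is likewise just an illustrative helper name. It shows why code that needs all Q bytes usually loops rather than trusting a single read.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadFullyDemo {

    /** Hypothetical stream that hands back at most one byte per call, imitating partial reads. */
    static final class TrickleStream extends InputStream {
        private final InputStream in;

        TrickleStream(InputStream in) { this.in = in; }

        @Override
        public int read() throws IOException { return in.read(); }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (len == 0) return 0;          // matches the "len is zero" clause of the API
            return in.read(b, off, 1);       // "at least one byte", and here never more
        }
    }

    /** Keeps reading until buf is full or end of stream; returns the total bytes stored. */
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n == -1) break;              // end of stream before the buffer filled
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5};
        int single = new TrickleStream(new ByteArrayInputStream(data)).read(new byte[8]);
        int looped = readFully(new TrickleStream(new ByteArrayInputStream(data)), new byte[8]);
        System.out.println("single read: " + single + ", looped: " + looped);
    }
}
```

A single read here stores 1 byte even though 5 remain, while the loop collects all 5. Whether a real FileInputStream ever behaves this way depends on the platform; the sketch only shows that the documented contract allows it.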
[Max]: <cough> I tend to think that it does, given the docos as interpreted above.

Eh, sounds like you too have caught that bug that's been going around. My sympathies; I know it can be frustrating.
What is your opinion then of the InputStream docs cited above? Specifically:
  • Does the InputStream read(byte[]) API imply that if Q bytes remain in a file, a FileInputStream read(byte[]) is guaranteed to block until all Q bytes are read? (Assuming the buffer is large enough?)
  • Do you think that a FileInputStream actually does block (always, that is) until all Q bytes are read?

My apologies if this seems irrelevant to the FileChannel discussion, but the "at least" parts of the InputStream doc seem comparable to the point of our earlier discussion. Certainly other parts of the InputStream docs are different, and I don't plan to get into an involved discussion of the InputStream docs as well - but if your answer to question 1 above is "no", I'd like to get at least an idea why you think the situation for FileInputStream is different from FileChannel in this respect. Referring to documentation, that is - I know that the implementations are significantly different under the hood.
    Cheers...
     
    town drunk
    ( and author)
    Posts: 4118

    Originally posted by Jim Yingst:
    [Max]: Correspondingly, if there is one byte then this method will block until at least one byte is read
    is direct logical corollary of
    if there is at least one byte then this method will block until at least one byte is read
    'At least' implies everything up to and including that point.

    Also agreed. But in the API, "that point" is 1, not 7, or anything greater than 1. (That is, the values read may certainly exceed 1, but the number guaranteed does not.) There are two "at least"'s (how's that for punctuation?) in the API quote above; either or both may be greater than one, but they're never implied to necessarily be equal to each other. They just share the same minimum value, 1.


    Think of it this way:

    It is guaranteed, however, that if a channel is in blocking mode and there is phrase remaining in the buffer then this method will block until phrase is read.

    That phrase indicates a number. In this case, any number greater than or equal to one. That includes 1, 7, 99, or a billion.
    Or let's try proof by induction.
    Base Case:
    If there is one byte remaining, then there is also at least one byte remaining, logically speaking (of course). Now, we know that if there is at least one byte remaining, then the method will block until that one byte is read. This is per the definition in the spec. This is the trivial case.
    Next Case:
    If there are two bytes remaining, then the method will block (minimally) until one byte is read in. However, at that point, there is still one byte remaining: therefore, the method will block until that byte is read in.
    N+1 Case:
    If there are N+1 bytes remaining, then one byte will be read in, reducing the number of remaining bytes to N. However, at this point, there are still N bytes remaining, and N qualifies as >=1, so the method will continue to block until at least one byte is read in. Minimally, say only single bytes are read in. That still leaves N-1 bytes, and so on, until we reach the Base Case.
    Of course, this is a very literal interpretation of the spec: it may not be what Sun meant, but it's the way I read them.
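Whichever reading of the spec is right, calling code can sidestep the question by looping itself. The following is a minimal sketch, assuming Java 7+ for java.nio.file (the thread predates it); fill and ChannelFillDemo are illustrative names, not JDK APIs. It repeats read until the buffer has no space left or the channel reports end-of-stream, so it gives the same result under either interpretation.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelFillDemo {

    /** Reads until dst has no space left or the channel reports end-of-stream (-1). */
    static int fill(FileChannel channel, ByteBuffer dst) throws IOException {
        int total = 0;
        while (dst.hasRemaining()) {
            int n = channel.read(dst);
            if (n == -1) break;              // nothing left in the file
            total += n;                      // n may be smaller than dst.remaining()
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("fill-demo", ".bin");
        try {
            Files.write(tmp, new byte[] {10, 20, 30});
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                ByteBuffer buf = ByteBuffer.allocate(16);
                System.out.println("bytes read: " + fill(ch, buf));
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The cost is one extra read call that returns -1 when the file is shorter than the buffer, which seems a small price for not depending on a disputed sentence in the docs.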


    [Jim]: Incidentally, consider this quote from InputStream's read(byte[]) method:
    "If len is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at end of file, the value -1 is returned; otherwise, at least one byte is read and stored into b."

    [Note - I forgot to add "emphasis mine" the first time I quoted this.]
    [Jim]: Would you argue that this language implies that if Q bytes remain in the stream prior to end of file, and Q <= len, then Q bytes are stored into b? This seems like the same sort of situation as we've been discussing for FileChannel.


    I don't think so, because the phrasing of the above does not establish any relationship between the number of bytes being read in and the number of bytes remaining in the buffer: they are just both greater than or equal to one, and have no relationship to each other. That's a critical difference. In my mind, anyway.


    [Max]: <cough> I tend to think that it does, given the docos as interpreted above.
    Eh, sounds like you too have caught that bug that's been going around. My sympathies; I know it can be frustrating.


    I laughed for about 10 minutes on this: people seemed concerned. We really need to get together and drink.


    What is your opinion then of the InputStream docs cited above? Specifically:

  • Does the InputStream read(byte[]) API imply that if Q bytes remain in a file, a FileInputStream read(byte[]) is guaranteed to block until all Q bytes are read? (Assuming the buffer is large enough?)
  • Do you think that a FileInputStream actually does block (always, that is) until all Q bytes are read?


    No, I don't. If anything, I would say that the FileInputStream could use better documentation.


    My apologies if this seems irrelevant to the FileChannel discussion, but the "at least" parts of the InputStream doc seem comparable to the point of our earlier discussion.


    Not at all: it's an important litmus test, and valid point of contention. However, I don't believe that the two are really saying the same thing, though I can certainly see where that's a reasonable interpretation.


    Certainly other parts of the InputStream docs are different, and I don't plan to get into an involved discussion of the InputStream docs as well - but if your answer to question 1 above is "no", I'd like to get at least an idea why you think the situation for FileInputStream is different from FileChannel in this respect. Referring to documentation, that is - I know that the implementations are significantly different under the hood.
    Cheers...


    Did I explain my perspective clearly above?
    All best,
    M
    [ July 31, 2003: Message edited by: Max Habibi ]
     
    Jim Yingst
    Wanderer
    Posts: 18671
    Ciao Max!
    [Max]: Did I explain my perspective clearly above?
    Well we're making progress, thanks. And I'm no longer doubting your sanity, as I was starting to. But I've still got some questions and comments of course...

    [Jim]: What is your opinion then of the InputStream docs cited above? Specifically:
  • Does the InputStream read(byte[]) API imply that if Q bytes remain in a file, a FileInputStream read(byte[]) is guaranteed to block until all Q bytes are read? (Assuming the buffer is large enough?)
  • Do you think that a FileInputStream actually does block (always, that is) until all Q bytes are read?

    [Max]: No, I don't. If anything, I would say that the FileInputStream could use better documentation.

    I'm shocked that you would suggest such a thing and sully the reputation of Sun's engineers, Max. So what's your answer to (1) specifically? Yes, and the documentation is wrong? No? Maybe? I still think the "at least one" parts of the InputStream API are just as eligible to be replaced with "Q" as the "at least one" parts of the ReadableByteChannel API. The relevant sentences are:
    ReadableByteChannel: "It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read."
    InputStream: "If len is zero, then no bytes are read and 0 is returned; otherwise, there is an attempt to read at least one byte. If no byte is available because the stream is at end of file, the value -1 is returned; otherwise, at least one byte is read and stored into b." (Emphasis mine).
    Asked if these are equivalent, you say:
    [Max]: I don't think so, because the phrasing of the above does not establish any relationship between number of bytes that are being read in to the number of bytes remaining in the buffer: they are just both greater or equal to one: they have no relationship to each other. That's a critical difference. In my mind, anyway.
    What about ReadableByteChannel's phrasing implies a link? Why doesn't that apply to InputStream? Physical proximity of the two "at least" phrases maybe? That's the only difference I see, and to me it seems insufficient to justify interpreting them differently. Am I missing something else?
    [Max]: Or let's try proof by induction.
    OK cool, this does lead us somewhere new...
    [Max]: Base Case:
    If there is one byte remaining, then there is also at least one byte remaining, logically speaking (of course). Now, we know that if there is at least one byte remaining, then the method will block until that one byte is read. This is per the definition in the spec. This is the trivial case.

    Agreed.
    [Max]: Next Case:
    If there are two bytes remaining, then the method will block (minimally) until one byte is read in.

    OK...
    [Max]: However, at that point, there is still one byte remaining: therefore, the method will block until that byte is read in.
    That's our critical point of divergence. I'll concede that if this point were true, the rest of your argument would follow. (Well, except that I still don't think a FileChannel is a blocking channel, but that's a separate point.) I believe the difference in our perspective here is this: when we look at
    "if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read."
    I see that as
    "if a channel is in blocking mode and there is at least one byte remaining in the buffer at the time this method is first invoked then this method will block until at least one byte is read."
    and you see it as
    "if a channel is in blocking mode and there is at least one byte remaining in the buffer at any point during the execution of this method then this method will block until at least one byte is read."
    Is that correct? The original quote doesn't explicitly tell us which is intended; I suppose either is possible. To me my assumption seems "obvious" but for you it's probably the other way around. OK. I think that the problem with your interpretation is that if we insert that sort of assumption into other places in Sun's APIs, we get nonsense. E.g. look at the doc for ArrayList's removeRange(int, int) method. After the method removes some of the elements, but before it returns, aren't there now other elements which have moved into the range that is to be removed? Shouldn't these be removed too? Doesn't this mean removeRange() should just remove everything after the fromIndex? My answer is of course no - the API only intends to refer to elements whose indices at the time the method was first invoked were within the range [fromIndex, toIndex). I think this is probably a general principle throughout the APIs - they talk about conditions at the beginning of a method call, and at the end, but not necessarily in between, unless they specifically say otherwise. If we start reapplying conditional statements during method executions, as values of instance variables change, we can probably find all sorts of strange "loopholes" in the APIs; I think that down this road lies madness.
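The removeRange example can be checked directly. A minimal sketch: since removeRange is protected in ArrayList, the hypothetical OpenList subclass below exists only to expose it (removeBetween is my name, not a JDK method).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveRangeDemo {

    /** removeRange is protected in ArrayList, so this hypothetical subclass exposes it. */
    static final class OpenList<E> extends ArrayList<E> {
        OpenList(List<E> items) { super(items); }

        void removeBetween(int fromIndex, int toIndex) {
            removeRange(fromIndex, toIndex);
        }
    }

    /** Removes indices 1 and 2 as they stood when the call began. */
    static List<String> demo() {
        OpenList<String> list = new OpenList<>(Arrays.asList("a", "b", "c", "d", "e"));
        list.removeBetween(1, 3);
        return list;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a, d, e]
    }
}
```

Only "b" and "c" go. "d" and "e" shift down into indices 1 and 2 but survive, which is consistent with the range being evaluated once, at the moment the method is invoked.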
    And I reiterate that I think that if Sun had wanted us to interpret the phrases the way you are doing, they could have found much less ambiguous ways to do it. I think saying "at least one", twice, was done with the specific intent of establishing that those are (or can be) two different quantities. If they were supposed to be the same, then why not "if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until that number of bytes are read."? This is exactly equivalent if we accept your proof by induction. But wait, they can't do that because this doesn't allow for the possibility that the file might not have enough bytes to fill the buffer? Well, yeah. That's further evidence that this interpretation is flawed, because it contradicts what we already know - that if a file does not have enough bytes before the end of the file, then the two "at least one"'s cannot be equal.
    Actually though, a trace of this last problem would remain even under my interpretation. Consider a FileChannel to a zero-length file. What does read(ByteBuffer) do (assuming the buffer has at least one byte available)? Well, common sense is that it will return immediately, with a return value indicating that no bytes were read (-1, the documented end-of-stream value, rather than a positive count). However, look at the spec again:
    [ReadableByteChannel]: A read operation might not fill the buffer, and in fact it might not read any bytes at all. Whether or not it does so depends upon the nature and state of the channel. A socket channel in non-blocking mode, for example, cannot read any more bytes than are immediately available from the socket's input buffer; similarly, a file channel cannot read any more bytes than remain in the file. It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.
    Now sure, they do explicitly address the fact that a FileChannel will not try to read bytes that aren't there. But the phrase "it is guaranteed, however" seems to imply that the subsequent rule is supposed to be true, guaranteed, even in the face of the situations just discussed. So, if there's at least one byte in the buffer, how can the FileChannel avoid blocking until at least one byte is read? Without violating the spec? (Regardless of whether the two "at least"'s refer to the same number or not.) My answer is that the only way to avoid logical inconsistency here is if, as I previously asserted, FileChannel is not considered a blocking channel. The final guarantee is not relevant to FileChannel; there's no obligation to block at all. There, problem solved.
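The zero-length-file case is easy to try out. A minimal sketch, assuming Java 7+ for java.nio.file (EmptyFileReadDemo and readFromEmptyFile are illustrative names): it performs one read on an empty file with space in the buffer. The FileChannel.read docs list -1 as the return value at end-of-stream, so that is what I would expect back, without any blocking. This shows only what one implementation does, not what the spec guarantees.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class EmptyFileReadDemo {

    /** Performs a single read from a freshly created zero-length file and returns the result. */
    static int readFromEmptyFile() throws IOException {
        Path tmp = Files.createTempFile("empty-demo", ".bin");
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(8);   // at least one byte of space remaining
            return ch.read(buf);                       // returns right away; nothing to wait for
        } finally {
            Files.deleteIfExists(tmp);                 // runs after the channel is closed
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read returned: " + readFromEmptyFile());
    }
}
```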
    [Max]: we really need to get together and drink
    Agreed. That's the problem with virtual friendships - it's not so easy to go drinking.
    [ August 02, 2003: Message edited by: Jim Yingst ]
     