NX: URLyBird 1.1.3

 
S Bala
Ranch Hand
I recently downloaded the exam and became a member of this site to get some tips on getting started. Here are a couple of my questions; any comments are appreciated.
1) Is there a particular reason why the FileChannel class needs to be used for reading and writing from/to the db file? I saw some postings on this site using it.
2) Do we need to use a prime number algorithm to generate a lockCookie of type long?
thanks.
 
Bob Reeves
Ranch Hand
Hi S.B.:
I'm also doing the URLyBird assignment. I used FileChannel because (1) it's (sort of) thread safe, (2) it uses ByteBuffer for data transfer, and (3) it's supposed to have performance advantages.
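For illustration, a record write with a FileChannel and a ByteBuffer looks roughly like this (just a sketch; channel, recordBytes, recNo, HEADER_LENGTH and RECORD_LENGTH are names I'm making up here):

// Write one record's bytes at an explicit file position.
ByteBuffer out = ByteBuffer.wrap(recordBytes);
long pos = HEADER_LENGTH + recNo * RECORD_LENGTH;
while (out.hasRemaining()) {
    pos += channel.write(out, pos);   // advance by the number of bytes actually written
}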
For the cookie, I didn't use a prime generator. For the most part, I don't think we're being tested on security issues. Just document your considerations and your design choices. I treated the cookie like a password; and passwords don't have to be unique or necessarily follow any other guidelines except size.
Opinions may vary ...
Tx
 
S Bala
Ranch Hand
Thanks, Bob.
 
Jim Yingst
Wanderer
I just used java.util.Random and the nextLong() method to generate each key as needed.
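Something like this (field and method names are mine, just for illustration):

// One shared generator for the whole server.
private static final java.util.Random COOKIE_GENERATOR = new java.util.Random();

private long generateCookie() {
    return COOKIE_GENERATOR.nextLong();
}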
 
Philippe Maquet
Bartender
Hi Bob and Jim,
Here is what the URLyBird instructions say about unlocking:

// Releases the lock on a record. Cookie must be the cookie
// returned when the record was locked; otherwise throws SecurityException.
public void unlock(long recNo, long cookie)
throws SecurityException;


It means that if your lock method does not guarantee that the cookies it serves are unique, you take the risk that unlock will unduly throw a SecurityException.
Jim:

I just used java.util.Random and the nextLong() method to generate each key as needed.


In another thread, I asked you:

How do you make sure that two clients do not receive the same cookie?


and you replied:

I don't. If someone wants to use their cookie to try to unlock some other record that was locked by another client, well, there's a 1 in 2^64 chance that it will work. That's low enough for me. And consider - if someone is writing client code to try to unlock other clients' locked records, why not just use a cookie value of 0, or 1, or 124356789L, or just generate a random long of their own? You get the same 1 in 2^64 chance. Eliminating cookies that have already been used doesn't change this much, IMO.


And you added:

but realistically, security isn't a big concern for this assignment


Security, no, but bugs, yes! And this is one, IMHO. I don't agree with your reasoning there: each time you generate such a random cookie, you take the risk (BTW it's not 1 in 2^64 but current_number_of_locks in 2^64) that a well-behaved (I mean not hostile) client will get into trouble when unlocking. I only agree with you as far as hypothetical hostile clients are concerned.
The same comment applies to Bob's quote:

For the cookie, I didn't use a prime generator. For the most part, I don't think we're being tested on security issues. Just document your considerations and your design choices. I treated the cookie like a password; and passwords don't have to be unique or necessarily follow any other guidelines except size.


OK, it's not a big issue, but why take the risk when it's so simple (and more efficient) to make them unique?
I now generate them in sequence from Long.MIN_VALUE, and in the very improbable case that the system serves more than 2^64 locks before the server restarts, I wrap the sequence back to Long.MIN_VALUE.
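In code, that's just (a sketch with my own names):

// A simple synchronized counter; overflow past Long.MAX_VALUE wraps back
// to Long.MIN_VALUE automatically, so the "wrap" costs nothing.
private long nextCookie = Long.MIN_VALUE;

private synchronized long generateCookie() {
    return nextCookie++;
}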
Cheers,
Philippe.
[ July 16, 2003: Message edited by: Philippe Maquet ]
 
Philippe Maquet
Bartender
Hi Jim,
Sorry, there is no bug in your solution: it's just 1 in 2^64 less safe, as you mention. I only understood that just now (probably thanks to some background thread still thinking about it).
Anyway, incrementing a long by 1 should still be faster than asking a Random for a value. But as performance is not an issue in this assignment...
Cheers,
Phil.
 
S Bala
Ranch Hand
Hi,
I was reading up on FileChannel and its ability to support thread-safe operations.
URLyBird 1.1.3 says that we may assume there is only one program accessing the database at any time.
Does that mean we can do away with (ignore) synchronized read and write operations, and worry only about concurrency on the server?
thanks
 
Bob Reeves
Ranch Hand
Hi Philippe:
Let me explain why I disagree with you. It's important to be clear that the Lock Manager won't dispense multiple cookies for the same record -- the Lock Manager blocks any lock request that arrives while the record is already locked. So there is no cookie contention for a record unless a client is malicious. At any time only one cookie is valid, and the owner is guaranteed to be able to use the cookie to update or unlock the record (ignoring lock timeouts). (The cookie assigned after a cookie timeout should be different, since the old owner still thinks it has a valid cookie, so I do use a random generator.)
The cookie is used as a record-reservation feature, so the client can lock, read for edit, then update the record in a kind of transaction. A non-unique cookie permits that.
The only time (guaranteed) unique cookies are important is if you're protecting against attackers or handling timeouts.
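To make the idea concrete, here's a hypothetical sketch of such a lock manager (the class, names and structure are mine, not the assignment's required interface); lock() blocks while a record is already locked and hands out one cookie at a time:

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class LockManager {

    private final Map lockedRecords = new HashMap(); // recNo -> cookie
    private final Random random = new Random();

    // Blocks while the record is already locked, so at most one cookie
    // per record is ever outstanding.
    public synchronized long lock(long recNo) throws InterruptedException {
        Long key = new Long(recNo);
        while (lockedRecords.containsKey(key)) {
            wait();
        }
        long cookie = random.nextLong();
        lockedRecords.put(key, new Long(cookie));
        return cookie;
    }

    // Throws SecurityException if the caller presents the wrong cookie.
    public synchronized void unlock(long recNo, long cookie) throws SecurityException {
        Long key = new Long(recNo);
        Long current = (Long) lockedRecords.get(key);
        if (current == null || current.longValue() != cookie) {
            throw new SecurityException("Wrong cookie for record " + recNo);
        }
        lockedRecords.remove(key);
        notifyAll();
    }
}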
Anyway, that's how I see things ...
Tx
 
Philippe Maquet
Bartender
Hi Bob,
There are some days when I shouldn't write anything and should just stay asleep...
So this is the second time I disagree with myself in this thread, but you are right, 200% right!
Thank you for pointing that out.
Regards,
Phil. (a bit ashamed )
 
Philippe Maquet
Bartender
Hi S, (?)
About FileChannels:

Does that mean that we can do away with synchronized read and write operations (ignore),


Reading this in the FileChannel doc:

Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.


I would be cautious (as I will now be in this thread with anything I write). In fact, I am not sure about the correct interpretation of that doc excerpt.
As all my write operations are queued and handled by a dedicated thread, my design is not affected by such a potential thread-safety issue, so I didn't look into it deeply.

and worry about only the concurrency on the server.


Right.
Cheers,
Phil.
 
S Bala
Ranch Hand
Hi Philippe,
I was reading up on FileChannel to see what it has to offer regarding thread-safe file operations.
But my concern (3rd posting) is whether I can ignore threading issues with the database altogether, since my assignment states that "I can assume that only one program is accessing the database at any time". So the client checks the record lock in memory for update and delete operations, obtains it, and performs a transaction with the database. Hence there is no need to synchronize the read/write operations.
thanks,
SB
 
Philippe Maquet
Bartender
Hi S,

As my assignment states that "I can assume that only one program is accessing the database at any time". So the client checks the record lock in memory for update and delete operations, obtains it and does a transaction with the database. Hence no need to synchronize the read/write operations.


IMO, "only one program" means "only your application". But how can you assume that two of your clients will not write to your database concurrently (on different records)? Of course, if FileChannel handles that concurrency well (using an explicit position in the read/write methods), it seems correct.
Cheers,
Philippe.
 
Jim Yingst
Wanderer
Greetings, everyone. Once more into the breach...
Sorry, there is no bug in your solution: it's just 1 in 2^64 less safe, as you mention it.
Consider also: if we were guarding against a malicious client (running his own code and connecting to the server), he'd actually have a much easier time guessing other people's cookies if they're generated by incrementing a long by one. When he finds a record he wants to unlock, all he needs to do is lock any other record, get the cookie, and then start trying cookie values, counting down from the value he just received. In comparison, the 1 in 2^64 chance is much more secure.
To be fair, if we had to guard against malicious clients, there are a number of other things they could do to screw up the system far worse. E.g. edit/delete records that aren't locked. Or lock all available records indefinitely (unless a timeout has been implemented). So perhaps it's not practical to worry about malicious clients here unless we make a number of other changes to the system.
Practically speaking, I think the only real use of these cookies is for bug detection. Specifically, if a programmer writes code that accidentally has a client accessing a record already locked by another client, then the invalid cookie will detect the error. Well, 99.9999999999999999946% of the time in my case. Good enough for me. Correctly-written non-malicious code should never get a SecurityException, I think. So for me, cookies are useless except as an extra test of correctness. And except for the fact that the API requires them. :roll:
was reading on FileChannel and its ability to support thread safe operations.
URLyBird 1.1.3 says that we may assume there is only one program accessing the database at any time. Does that mean that we can do away with synchronized read and write operations (ignore), and worry about only the concurrency on the server.

No, if you use synchronization at all, it's as a defense against other threads in the same JVM - meaning, other clients of your DB server. So depending on your design you may or may not synchronize reads and writes - but this decision has nothing to do with other programs; just this program.
Philippe is correct to be wary of FileChannel's thread safety, IMO - the guarantees it offers are not absolute. Two threads could write to the same section of file at the same time, or one could read while another is writing, and these could lead to some strange results. However, your locking mechanism should protect against most of this, without additional synchronization. It should be impossible (thanks to record locking) for two threads to update the same record simultaneously. So the only possible problem is dirty reads - what if one thread reads a record while another writes? The record which is read may be corrupted, a mix of old data with new. The chance of this is really very small, and it's probably not a big deal since it only affects the record displayed on the search screen - the real record remains intact in the DB file. However, if you do wish to eliminate this possibility, you would need to synchronize both read and write methods, IMO. Or perhaps create some alternate scheme. I think many people ignore the possibility of dirty reads and are not penalized for it.
 
Philippe Maquet
Bartender
Jim,

When he finds a record he wants to unlock, all he needs to do is lock any other record, get the cookie, and then start trying cookie values, counting down from the value he just received. In comparison the 1 in 2^64 chance is much more secure.


Right one more time ...

Correctly-written non-malicious code should never get a SecurityException, I think. So for me, cookies are useless except as an extra test of correctness. And except for the fact that the API requires them.


Yes !

Philippe is correct to be wary of FileChannel's thread safety, IMO


Phew !
Regards,
Phil.

 
Max Habibi
town drunk (and author)

Originally posted by Jim Yingst:
Greetings, everyone. Once more into the breach... Philippe is correct to be wary of FileChannel's thread safety, IMO - the guarantees it offers are not absolute.


Actually, you can trust the guarantees, but you have to be careful about reading too much into them. For example, a write is atomic if you don't move the position explicitly, and allow the FileChannel to write to its next natural position. Jim, this is all on pages 284-285 of my book, so you should know this.
But seriously, I suggest that you use FileChannel at every turn: it's the cat's meow as far as Java file I/O is concerned. And what better time to start digging into them than now?
M
 
S Bala
Ranch Hand
Hi Jim, Philippe.
There is a sentence in the assignment which may help put a different perspective on this.
-- quote---
because the data must continue to be manipulated for reports using another custom-written application, the new system must reimplement the database code from scratch without altering the data file format
---- end quote---

Even if we implement a synchronized write operation for our clients, there is no guarantee that the existing custom reporting application will not try to read/write at the same time. We have no way of dealing with that situation.
Does this validate my earlier assumption?
 
Jim Yingst
Wanderer
Howdy, Max.
Actually, you can trust the guarantees, but you have to be careful about reading too much into them.
I agree with this. When I said "the guarantees it offers are not absolute" I meant that the guarantees do not promise everything we might think or hope that they promise, and we must be careful with them. I do believe that the guarantees are obeyed (probably, excepting some possible bugs which will be fixed soon if not already).
For example, a write is atomic if you don't move the position explicitly, and allow the FileChannel to write to its next natural position. Jim, this is all on pages 284-285 of my book, so you should know this
I read this, but I don't believe it offers the level of security you think it does. We need to provide random access to any record in the file, not just the next record. So either we have to use read/write methods that take an explicit position, or we set the position and then use a read/write that implicitly uses that position. You seem to advocate the latter approach. And it's true that the atomicity of the read/write is guaranteed by the API. But what is not guaranteed is that there will be no interruption between setting the position and performing the read/write. That is, if we do
channel.position(100);
channel.read(buffer1);
while another thread does
channel.position(200);
channel.read(buffer2);
we may end up with something like
channel.position(100);
channel.position(200);
channel.read(buffer2);
channel.read(buffer1);
Here buffer2 gets what buffer1 was expecting, and buffer1 gets whatever is after the record at 200. This is no good; we'd need explicit synchronization to prevent interruption between the position() and read() methods.
Alternately, we can use the read/write methods that take an explicit position:
channel.read(buffer1, 100);
channel.read(buffer2, 200);
This works great as far as the reads are concerned. According to the API these methods may even process concurrently, but they're still guaranteed to each read from the appropriate place.
However, what if we have two threads doing:
channel.read(buffer1, 100);
channel.write(buffer2, 100);
According to the API these may also proceed concurrently. That's a potential problem. The read() will read from the correct position, but what it's reading may change underneath it. You may get a read that starts out with data from before the update, and ends with data from after the update. Unless, again, you protect your methods with synchronization.
I do believe that this type of dirty read is pretty unlikely, especially considering the record length is just 183 in my assignment. The system will probably finish each read/write atomically. And it's quite possible that on some particular implementations of FileChannel, this atomicity will always occur. However, it's not guaranteed by the FileChannel spec. So I advocate explicit synchronization if you want to ensure that dirty reads do not occur.
Incidentally, since I used read(ByteBuffer) above - I also believe that FileChannel offers no guarantee that the ByteBuffer will be filled, even if the file has more bytes. If we had a SelectableChannel in blocking mode we'd have a guarantee, but FileChannel does not offer such a guarantee. So to ensure that the ByteBuffer is filled, we really must use a loop for each read():
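Something along these lines (a sketch; channel, recordStart and RECORD_LENGTH are names I'm assuming here):

long pos = recordStart;
ByteBuffer bytes = ByteBuffer.allocate(RECORD_LENGTH);
while (bytes.hasRemaining()) {
    int count = channel.read(bytes, pos);
    if (count < 0) {
        break;        // hit end of file before the buffer was filled
    }
    pos += count;     // advance past the bytes actually read
}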

This is annoying because, again, it's very unlikely that we'll have a problem for records as small as ours. And some implementations may well never have this problem. But the FileChannel API makes no guarantee. This isn't new really - all the InputStream and Reader classes had the same issue, with the exception of a few specific methods like BufferedReader's readLine() and RAF's readFully(). But annoying nonetheless, and frequently overlooked. Even by famous book authors.
But seriously, I suggest that you use FileChannel at every turn: it's the cat's meow as far as Java file I/O is concerned. And what better time to start digging into them than now?
I agree. Except that the header for the Contractor assignment has a format that is such an exact match for the RAF spec (or more generally, the DataInput interface) that it seems a shame not to take advantage of that. Using FileChannel for the header would've required writing more code, with no real benefit, IMO. The header is short, and there's no big performance concern, so why not use RAF here? But I agree that FileChannel is the preferred way to go for accessing everything past the header. Though those who use RAF will be fine as far as assignment grading is concerned, I'm sure - provided they synchronize for safety. But as I've shown, you probably need to do that with FileChannel too, so this isn't a big difference between the two.
[ July 16, 2003: Message edited by: Jim Yingst ]
 
Jim Yingst
Wanderer
Hi, S Bala.
Even if we implement a synchronized write operation for our clients, there is no guarantee that the existing custom reporting application will not try to read/write at the same time. We have no way of dealing with that situation.
In my instructions we do have a guarantee, under "Locking":

You may assume that at any moment, at most one program is accessing the database file; therefore your locking system only needs to be concerned with multiple concurrent clients of your server.


Combining this with the section you quoted, it seems that there is another application which may manipulate the files, but it won't be manipulating them at the same time our program is running. So we're safe, according to the instructions.
 
Bob Reeves
Ranch Hand
Hi Jim:
In regard to your statement:

Incidentally, since I used read(ByteBuffer) above - I also believe that FileChannel offers no guarantee that the ByteByffer will be filled, even if the file has more bytes.


I also had this concern. FileChannel's read returns the number of bytes "actually" read. Unfortunately, this depends on what one's definition of actually, actually is. I noticed that none of SUN's examples for FileChannel's read method enclose the call in a while loop, so I didn't use a while loop either. But I'm not sure if looking at the implementation code is justified if the method contract doesn't guarantee the behavior.
I think SUN needs to put a Contract section in their Javadoc that explicitly states what behavior is guaranteed by a method. Fat chance?
Tx
 
Max Habibi
town drunk (and author)

Originally posted by Jim Yingst:
Howdy, Max.

Actually, you can trust the guarantees, but you have to be careful about reading too much into them.
I agree with this. When I said "the guarantees it offers are not absolute" I meant that the guarantees do not promise everything we might think or hope that they promise, and we must be careful with them. I do believe that the guarantees are obeyed (probably, excepting some possible bugs which will be fixed soon if not already).
For example, a write is atomic if you don't move the position explicitly, and allow the FileChannel to write to its next natural position. Jim, this is all on pages 284-285 of my book, so you should know this
I read this, but I don't believe it offers the level of security you think it does. We need to provide random access to any record in the file, not just the next record. So either we have to use read/write methods that take an explicit position, or we set the position and then use a read/write that implicitly uses that position. You seem to advocate the latter approach. And it's true that the atomicity of the read/write is guaranteed by the API. But what is not guaranteed is that there will be no interruption between setting the position and performing the read/write. That is, if we do
channel.position(100);
channel.read(buffer1);
while another thread does
channel.position(200);
channel.read(buffer2);
we may end up with something like
channel.position(100);
channel.position(200);
channel.read(buffer2);
channel.read(buffer1);


Well, let's think about this a bit. If a given FileChannel is being opened on a just-in-time basis (that is, at the method level), then there's no opportunity whatsoever for this sort of conflict. Two threads cannot have access to the same FileChannel, thus the above situation is impossible.



Here buffer2 gets what buffer1 was expecting, and buffer1 gets whatever is after the record at 200. This is no good; we'd need explicit synchronization to prevent interruption between the position() and read() methods.
Alternately, we can use the read/write methods that take an explicit position:
channel.read(buffer1, 100);
channel.read(buffer2, 200);
This works great as far as the reads are concerned. According to the API these methods may even process concurrently, but they're still guaranteed to each read from the appropriate place.
However, what if we have two threads doing:
channel.read(buffer1, 100);
channel.write(buffer2, 100);
According to the API these may also proceed concurrently. That's a potential problem. The read() will read from the correct position, but what it's reading may change underneath it. You may get a read that starts out with data from before the update, and ends with data from after the update. Unless, again, you protect your methods with synchronization.


I think there's a misunderstanding here. Reading changes the position of the FileChannel: thus, it's covered under the part of the documentation that says
Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.

Thus, the situation, as described, cannot occur.



I do believe that this type of dirty read is pretty unlikely, especially considering the record length is just 183 in my assignment. The system will probably finish each read/write atomically. And it's quite possible that on some particular implementations of FileChannel, this atomicity will always occur. However, it's not guaranteed by the FileChannel spec. So I advocate explicit synchronization if you want to ensure that dirty reads do not occur.
Incidentally, since I used read(ByteBuffer) above - I also believe that FileChannel offers no guarantee that the ByteBuffer will be filled, even if the file has more bytes.


Again, this is tricky, because there is such a guarantee, though it's somewhat indirectly stated. As noted above, because other operations block until your FileChannel operation is finished (remember, your FileChannel is changing position), you're OK here. The trick is to use FileChannels throughout, and not to explicitly change the position.
All best,
M
 
Max Habibi
town drunk (and author)

Originally posted by Bob Reeves:
Hi Jim:
In regard to your statement:

I also had this concern. FileChannel's read returns the number of bytes "actually" read. Unfortunately, this depends on what one's definition of actually, actually is. I noticed none of SUN's examples for FileChannel's read method enclose the call in a while loop, so I didn't use a while loop either. But, I'm not sure if looking at the implementation code is justified if the method contract doesn't guarantee the behavior.
I think SUN needs to put a Contract section in their Javadoc that explicitly states what behavior is guaranteed by a method. Fat chance?
Tx


Actually, it is in the documentation.

Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.

Because a read changes position, it's covered by the above, so you're ok. Really.
M
 
Jim Yingst
Wanderer
Hi, Max!
Well, let's think about this a bit. If a given FileChannel is being opened on a just-in-time basis (that is, at the method level), then there's no opportunity whatsoever for this sort of conflict. Two threads cannot have access to the same FileChannel, thus the above situation is impossible.
That's fine, I agree - if we're creating separate FileChannels for each thread. But many of us are using a single FileChannel shared between threads - that's the situation I was addressing.
(Actually I'm not sure I know what is or is not guaranteed if two threads open different FileChannels on the same file concurrently. But at least I won't argue the point, for now.)
I think there's a misunderstanding here. Reading changes the position of the FileChannel: thus, it's covered under the part of the documentation that says
"Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified."
Thus, the situation, as described, cannot occur.

I still disagree; we've got a difference of interpretation for that spec. Which is understandable, IMO, as the wording seems less than clear to me. But look at the last sentence: "Other operations, in particular those that take an explicit position, may proceed concurrently..." What do you think that means? What's an example of an operation which takes an explicit position? What does it mean to proceed concurrently?
Here's my interpretation. FileChannel has a private member variable, position. This can be set with position(long) and queried with position(). When the API refers to "the FileChannel's position", it means the value of this instance variable. There are a number of methods which do not include any position parameter, but whose APIs specify that they implicitly use the member variable:
read(ByteBuffer), read(ByteBuffer[]), read(ByteBuffer[], int, int), write(ByteBuffer), write(ByteBuffer[]), and write(ByteBuffer[], int, int)
I will call these implicit-position methods. The APIs for these methods say things like "bytes are read starting at this channel's current file position". The channel's current file position is explicitly mentioned in each comment. (It's explicit in the comment, but merely implicit when you just look at the method parameters.)
In contrast, there are two other methods that do contain an explicit position parameter:
read(ByteBuffer dst, long position) and write(ByteBuffer src, long position)
I'll call these explicit-position methods. The API comments for these methods specify that "bytes are read [or written] starting at the given file position rather than at the channel's current position. This method does not modify this channel's position." Thus, I believe these two methods fall under the category of "other operations, in particular those that take an explicit position". And thus, they "may proceed concurrently." Which I take to mean they may be concurrent with each other, or with the other read/write operations that implicitly use the channel's (member variable) position. No two implicit-position methods can proceed concurrently in the same FileChannel, but explicit-position methods can happen anytime. At least, as I interpret the spec.
Because a read changes position, it's covered by the above, so you're ok. Really
That's true for three out of four read methods recently surveyed. But you'll note that the code sample I provided (which Bob was responding to) uses
pos += channel.read(bytes, pos);
which does not change the FileChannel's position.
So, why not just use the implicit-position methods? Because they always rely on the assumption that the "current position" is the correct one, and that it hasn't just been reset by some other thread immediately beforehand. If multiple threads have access to a given FileChannel then the only way I see to prevent them from possibly interrupting each other like this is by using synchronization. And if you're going to sync critical operations anyway, then it really doesn't matter much if the operation was atomic or not, does it? So really, I'm at a loss as to why the FileChannel API was written the way it is; the guarantees it does offer are insufficient to be really useful, IMO. I'm perfectly happy doing synchronization myself - I just wish the spec had been clearer.
[ July 21, 2003: Message edited by: Jim Yingst ]
 
Vlad Rabkin
Ranch Hand
Hi Jim,
Hi Max,
Jim said:

However, what if we have two threads doing:
channel.read(buffer1, 100);
channel.write(buffer2, 100);
According to the API these may also proceed concurrently. That's a potential problem. The read() will read from the correct position, but what it's reading may change underneath it. You may get a read that starts out with data from before the update, and ends with data from after the update. Unless, again, you protect your methods with synchronization.


Max said:

I think there's a misunderstanding here. Reading changes the position of the FileChannel: thus, it's covered under the part of the documentation that says
Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.
Thus, the situation, as described, cannot occur.


The specification of this write/read method says:


public abstract int write(ByteBuffer src, long position) throws IOException
Writes a sequence of bytes to this channel from the given buffer, starting at the given file position.
This method works in the same manner as the write(ByteBuffer) method, except that bytes are written starting at the given file position rather than at the channel's current position.
This method does not modify this channel's position.


So, I believe Jim is right. The read/write methods that take a position argument DO NOT CHANGE the FileChannel's position, so they CAN proceed concurrently!
Vlad
 
Bob Reeves
Ranch Hand
Hi Max:
This responds to your comment:

Actually, it is in the documentation.


Respectfully, I disagree. My concern is that the method just might not return all the bytes requested in one read, whether another thread is involved or not. I've seen this behavior on Linux (in a single-threaded application) where a getBytes call didn't return all the bytes known to be in the receive buffer. However, if I looped on a byte-count test condition, then I obtained all the bytes.
So, I have this worry that if SUN says something like a method returns the number of bytes "actually" read, maybe the programmer should watch out.
Tx
 
Bob Reeves
Ranch Hand
Hi All:
It all comes down to what does

Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes. Other operations, in particular those that take an explicit position, may proceed concurrently; whether they in fact do so is dependent upon the underlying implementation and is therefore unspecified.

mean?
I vote with Jim on this (that's a first!). I think the only way the statements are consistent is if "involves the channel's position" means the position(long) method. Then operations that "take an explicit position" don't change where the FileChannel is pointed for operations that don't take an explicit position, and so explicit-position methods proceed unperturbed. Exactly how isn't all that clear. Now, unless the FileChannel's read method is internally synchronized, a dirty or inconsistent read is possible.
This leads me to conclude that

File channels are safe for use by multiple concurrent threads

means that you can use a FileChannel from multiple threads without worrying about disturbing its internal state unexpectedly. It does not guarantee anything about the resource the FileChannel operates on.
Opinions may vary ...
Tx
 
Max Habibi
town drunk (and author)

Originally posted by Bob Reeves:
Hi Max:
This responds to your comment:

Respectfully, I disagree. My concern is that the method just might not return all the bytes requested in one read, whether another thread is involved or not. I've seen this behavior on Linux (in a single-threaded application) where a getBytes call didn't return all the bytes known to be in the receive buffer. However, if I looped on a byte-count test condition, then I obtained all the bytes.
So, I have this worry that if SUN says something like a method returns the number of bytes "actually" read, maybe the programmer should watch out.
Tx


Hi Bob,
I agree that FileChannel methods that take an explicit position can return fewer bytes than requested. However (and I just want to make sure I'm hearing you correctly), are you saying that a FileChannel.read operation that didn't take a position failed to read in all the available bytes, when the buffer was large enough to hold them and there were more bytes to read? If so, it sounds like a bug. The documentation is clear:

a file channel cannot read any more bytes than remain in the file. It is guaranteed, however, that if a channel is in blocking mode and there is at least one byte remaining in the buffer then this method will block until at least one byte is read.

Can I see the code?
M
[ July 22, 2003: Message edited by: Max Habibi ]
 
Max Habibi
town drunk (and author)

Originally posted by Jim Yingst:
Hi, Max!
That's true for three out of four read methods recently surveyed. But you'll note that the code sample I provided (which Bob was responding to) uses
pos += channel.read(bytes, pos);


Actually, I missed this: I had assumed that we were still talking about methods that use the implicit position. You're right, as I noted, that methods that take an explicit position may proceed concurrently, if they cannot change the file's size.

Only one operation that involves the channel's position or can change its file's size may be in progress at any given time;

You'll notice that FileChannel.write(byte[],position) can change file size.

If the given position is greater than the file's current size then the file will be grown to accommodate the new bytes; the values of any bytes between the previous end-of-file and the newly-written bytes are unspecified.

Therefore, it's exempt.


which does not change the FileChannel's position.
So, why not just use the implicit-position methods? Because they always rely on the assumption that the "current position" is the correct one, and that it hasn't just been reset by some other thread immediately beforehand. If multiple threads have access to a given FileChannel then the only way I see to prevent them from possibly interrupting each other like this is by using synchronization.


Or by doing a just-in-time read, per my suggestion (there's a sketch of the idea after the list below). Remember, dragging that FileChannel around with you even when you're not using it isn't a much prettier picture than opening it on an as-needed basis. The only argument I can see against it is that it's more efficient not to open/close a connection. My answer there is:
1. Performance is not a consideration in this assignment.
2. You would be hard pressed, I think, to find a way to even measure the differential, either in processing or memory usage, of opening a FileChannel on a JIT basis.
3. Opening a FileChannel as needed is simpler code, as it avoids synchronization issues.
4. Not maintaining a FileChannel connection when one isn't necessary is more efficient than carrying one around just in case.
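Roughly like this (a sketch only; the file name, HEADER_LENGTH and RECORD_LENGTH are invented for illustration):

// Just-in-time access: each call opens its own channel, so no other thread
// can ever touch this channel's position.
public byte[] readRecord(long recNo) throws IOException {
    RandomAccessFile raf = new RandomAccessFile("suncertify.db", "r");
    try {
        FileChannel channel = raf.getChannel();
        ByteBuffer buffer = ByteBuffer.allocate(RECORD_LENGTH);
        channel.position(HEADER_LENGTH + recNo * RECORD_LENGTH);
        while (buffer.hasRemaining() && channel.read(buffer) >= 0) {
            // keep reading until the record buffer is full or EOF is hit
        }
        return buffer.array();
    } finally {
        raf.close();   // closing the file closes its channel as well
    }
}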



And if you're going to sync critical operations anyway, then it really doesn't matter much if the operation was atomic or not, does it? So really, I'm at a loss as to why the FileChannel API was written the way it is; the guarantees it does offer are insufficient to be really useful, IMO. I'm perfectly happy doing synchronization myself - I just wish the spec had been clearer



I disagree. I think they're very useful.
M
[ July 22, 2003: Message edited by: Max Habibi ]
 
Max Habibi
town drunk (and author)

Originally posted by Vlad Rabkin:
Hi Jim,
Hi Max,
Jim said:

So, I believe Jim is right. The read/write methods that take a position argument DO NOT CHANGE the FileChannel's position, so they CAN proceed concurrently!
Vlad


Not exactly. The write method, even the explicit-position one, can change the file's size. Therefore, it blocks, because the documentation says:

Only one operation that involves the channel's position or can change its file's size may be in progress at any given time; attempts to initiate a second such operation while the first is still in progress will block until the first operation completes.


Further, the documentation for the
int write(ByteBuffer src, long position)
method says that "If the given position is greater than the file's current size then the file will be grown to accommodate the new bytes".
Thus, the file's size can change with a write, so writes, all of them, block.
Reads, all of them, block too. However, the read(byte[],position) method only blocks for other reads. All of the other FileChannel.read methods block for both reads and writes.
The point here is that there is only one way to get unstable data, and that is to use the FileChannel.read(byte[],position) method in an environment that allows multiple threads to access the same FileChannel. As I recall, you're not doing this, correct?
M
 
Jim Yingst
Wanderer
Hello again, Max.
You'll notice that FileChannel.write(byte[],position) can change file size.
Good catch - I had overlooked that. So OK, among the read/write methods only read(ByteBuffer, long) seems to be eligible to proceed concurrently. Which is still enough to create problems if multiple threads access a FileChannel using that method, while some sort of write() is being performed on the section of the file being read. If the write is changing XXXXXXXXXX to OOOOOOOOOO, the read could theoretically see something like XXXOOOOOOO. It's fairly unlikely, and maybe impossible on many/all platforms for all I know. But it's exceedingly annoying to me that the spec leaves this hole in its guarantees.
Reads, all of them, block too. However, the read(byte[],position) method only blocks for other reads.
Eh? Where did that come from? I don't see why read(ByteBuffer, long) (assuming that's what you meant) would need to block for anything. Could be missing something again...
The point here is that there is only one way to get unstable data, and that is to use the FileChannel.read(byte[],position) method in an environment that allows multiple threads to access the same FileChannel.
Well, there are other problems if you use a shared FileChannel using implicit-position methods, because there can be interruptions between setting the position and reading or writing. I gave an example of this earlier in this thread. This won't lead to reading XXXOOOOOOO when XXXXXXXXXX or OOOOOOOOOO is expected - but it might lead to reading YYYYYYYYYY (a different record entirely) instead.
Unless, of course, we use synchronization. Which is actually pretty simple. Or, unless you use separate FileChannels as you prefer.
As I recall, you're not doing this [sharing FileChannels], correct?
Who - Vlad? Dunno, I've lost track. But there are a number of different people in this thread, and I know that some (like myself) aren't currently doing the just-in-time thing. The FileChannel question was first raised in this thread by S Bala, who seemed to be talking about multiple threads accessing a FileChannel, as are a number of the other people in this thread. If you want to talk about how FileChannel can be used safely in the context of just-in-time FileChannel creation, that's fine - but I don't think this context has been clearly established in this thread for a lot of these comments. Many of the people reading this thread will not assume just-in-time creation unless it's explicitly stated. For those people, I say don't think of FileChannel's operations as "atomic", because while many are, some are not, and there are just enough holes in the system to screw you up if you're not careful. In contrast, synchronization is a simple and easy option, IMO:
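Something along these lines (my own sketch, assuming a single shared FileChannel field named channel):

// One instance of this class per file; all threads share it, so the
// synchronized methods are effectively atomic with respect to the file.
public synchronized byte[] read(long position, int length) throws IOException {
    ByteBuffer buffer = ByteBuffer.allocate(length);
    while (buffer.hasRemaining()) {
        if (channel.read(buffer, position + buffer.position()) < 0) {
            throw new EOFException("Unexpected end of file at " + position);
        }
    }
    return buffer.array();
}

public synchronized void write(long position, byte[] data) throws IOException {
    ByteBuffer buffer = ByteBuffer.wrap(data);
    while (buffer.hasRemaining()) {
        channel.write(buffer, position + buffer.position());
    }
}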

For the class containing these methods, only one instance is created for a given file, and all threads use that instance. So "synchronized" in the method declarations makes those methods effectively atomic in accessing the file.
Note that I'm still putting those reads and writes in loops. That's because of the other problem with FileChannel, mentioned in several other threads above. Even if reads/writes are atomic, there's no actual guarantee (that I can find) that a read or write will actually use all the bytes we expect. That is, if there's still space in a buffer, and the file still has unread bytes, a read() may nonetheless return prematurely, without filling the buffer. At least, the API seems to imply this, and offers no guarantee otherwise. Now, I haven't been able to actually observe this problem in testing. Maybe it never actually comes up. I know it comes up with FileInputStreams fairly often, so I tend to assume it's possible here unless guaranteed otherwise. And I would argue that if partial reads/writes are possible, then they actually destroy the safety of just-in-time FileChannels. It doesn't matter if each read() or write() is atomic, if it's possible for the method to complete without delivering all the intended bytes. You can put the method in a loop as I do above - but without synchronization, there's no way to prevent interruption between loop iterations. And even with synchronization, if you're using just-in-time FileChannels then each FileChannel is a separate instance. You'd need to sync on some shared monitor. A static variable or some such.
In contrast, if you have one FileChannel shared by many threads, you can simply use synchronization as I do above to guarantee that each read or write will be uninterrupted. Synchronization is not the complex beast it's often made out to be, if you just use it. It's true that in some situations synchronization can be a performance bottleneck. But as has been frequently observed here, performance need not be a significant concern for this assignment.
I acknowledge that most of my concerns here are very unlikely to manifest as observable problems in this assignment. And if someone wants to bypass synchronization because they consider the risks to be acceptably low for their purposes, that's fine, as an informed decision. But if people believe that there are actual guarantees that these problems cannot occur, that's what I disagree with.
I suspect that some of the omissions in the spec (esp. for partial reads/writes) are just sloppy writing. And so maybe we really can all breathe easy without synchronization, if only they'd fix the specs.
[ July 22, 2003: Message edited by: Jim Yingst ]
 
Philippe Maquet
Bartender
Hi Max,

The point here is that there is only one way to get unstable data, and that is to use the FileChannel.read(byte[],position) method in an environment that allows multiple threads to access the same FileChannel. As I recall, you're not doing this, correct?


Does that mean that if I have only one thread which performs all writes (they should be blocking, right?) and multiple threads which use FileChannel.read(byte[], position), I am OK?
Thanks,
Phil.
 
Vlad Rabkin
Ranch Hand
Hi Max,
Hi Jim,
[Max] You'll notice that FileChannel.write(byte[],position) can change file size.
Well, I have thought about that 100 times, but what exactly does it mean?
Let's say I write with FileChannel.write(byteBuffer, position), where byteBuffer holds only 3 bytes, at the beginning of the file (so I am sure that this operation will not change the size of the file). Theoretically each write can change the size of the file, but in practice this one does NOT. So, is atomicity guaranteed or not?

Max, I believe you that FileChannel is a much more powerful tool than the "old" streams, but as Jim said, the specification is so unclear that I feel much more comfortable explicitly synchronizing read/write.
Here is a very interesting link from IBM:
http://www-106.ibm.com/developerworks/xml/library/x-wxxm10.html

NIO offers a less abstract API. For example, with Java IO, you need not worry about buffer management but you have no control over it either. NIO gives you more control over buffer management -- by letting you run it! Arguably, it is more efficient but it is also more complex.

Vlad
 
Bob Reeves
Ranch Hand
Hi All:
Not another post on the FileChannel! Sorry, but I want to add one detail.
Vlad's comment, that writing three bytes to the start of a long file might not cause a block, seems reasonable. Not to soapbox, but perhaps it's not good to micro-read text that's not professionally written. The javadoc for FileChannel says something like "any operation that can change the file length will block". It doesn't say "any operation that changes the file" will block. Maybe this is the emphasis the author intended. The purpose might be to increase the performance of write operations by permitting concurrency.
To seek an answer, I decompiled FileChannelImpl (then IOUtil, then FileDispatcher). What I found was that all the position methods in FileChannelImpl have a call to the JVM monitor, but none of the write methods do. Interesting.
Checking IOUtil and then FileDispatcher, neither of these embeds calls to the JVM monitor either. Interesting.
My conclusion is that the blocking for file length change is in the native code. I'd guess that the native code would block to protect itself, not the user.
Sorry to belabor this point, but I just couldn't resist.
Tx
 
Max Habibi
town drunk (and author)

Originally posted by Philippe Maquet:
Hi Max,

Does it mean that if I have only one thread which performs all writes (they should be blocking, right ?) and multiple threads which use the FileChannel.read(byte[],position), I am OK ?
Thanks,
Phil.



As far as the write operation is concerned, yes.
M
 
Max Habibi
town drunk (and author)

Originally posted by Jim Yingst:
Hello again, Max.
You'll notice that FileChannel.write(byte[],position) can change file size.
Good catch - I had overlooked that. So OK, among the read/write methods only read(ByteBuffer, long) seems to be eligible to proceed concurrently. Which is still enough to create problems if multiple threads access a FileChannel using that method, while some sort of write() is being performed on the section of the file being read. If the write is changing XXXXXXXXXX to OOOOOOOOOO, the read could theoretically see something like XXXOOOOOOO. It's fairly unlikely, and maybe impossible on many/all platforms for all I know. But it's exceedingly annoying to me that the spec leaves this hole in its guarantees.


It depends on how you look at it. It could be a hole, or it could be the one way in which you can get right-now-dammit checking of the state of the file. There are times when that's exactly what you want. I think the wise folks at Sun wanted to give you a way of checking under the hood at your own peril, if you wanted to. Think of it this way: wouldn't you be ticked off if there was no way to do this at all? 7/8 of the methods are safe. The last is at your own risk.


Reads, all of them, block too. However, the read(byte[],position) method only blocks for other reads.
Eh? Where did that come from? I don't see why read(ByteBuffer, long) (assuming that's what you meant) would need to block for anything. Could be missing something again...


if you follow the documentation into its obscure depths, you'll end up at http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/ReadableByteChannel.html


The point here is that there is only one way to get unstable data, and that is to use the FileChannel.read(byte[],position) method in an environment that allows multiple threads to access the same FileChannel.
Well, there are other problems if you use a shared FileChannel using implicit-position methods, because there can be interruptions between setting the position and reading or writing. I gave an example of this earlier in this thread. This won't lead to reading XXXOOOOOOO when XXXXXXXXXX or OOOOOOOOOO is expected - but it might lead to reading YYYYYYYYYY (a different record entirely) instead.


True, but the XXXOOOOOOO point was the one I was clearing up.


Unless, of course, we use synchronization. Which is actually pretty simple. Or, unless you use separate FileChannels as you prefer.
As I recall, you're not doing this [sharing FileChannels], correct?
Who - Vlad? Dunno, I've lost track. But there are a number of different people in this thread, and I know that some (like myself) aren't currently doing the just-in-time thing. The FileChannel question was first raised in this thread by S Bala, who seemed to be talking about multiple threads accessing a FileChannel, as are a number of the other people in this thread. If you want to talk about how FileChannel can be used safely in the contect of just-in-time FileChannel creation, that's fine - but I don't think this context has been clearly established in this thread for a lot of these comments. Many of the people reading this thread will not assume just-in-time creation unless it's explicitly stated.


I thought I did explicitly state it?


For those people, I say don't think of FileChannel's operations as "atomic" because while many are, some are not, and there are just enough holes in the system to screw you up if you're not careful.


I think what's more insidious is that you're creating multiple nested locks by synchronizing on various objects. While efficiency is not a consideration on this project, complexity is huge. I'm trying to help people see some of the cost of their decisions. OK, you want to keep a FileChannel around in case you need it: but are you aware of the fact that you'll have to synchronize on it in order to use it safely? And did you know that this nests locks, thus increasing complexity and the opportunity for deadlock? If yes, then OK; but if not, then there are other ways.


In contrast, synchronization is a simple and easy option, IMO:
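Something along these lines, for example - the class name, record length, and record layout are placeholders, not actual assignment code:

import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class SharedChannelAccess {

    private static final int RECORD_LENGTH = 160; // hypothetical record size
    private final FileChannel channel;            // one channel shared by all threads

    SharedChannelAccess(FileChannel channel) {
        this.channel = channel;
    }

    byte[] readRecord(long recNo) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(RECORD_LENGTH);
        long offset = recNo * RECORD_LENGTH;
        synchronized (channel) {                 // one reader or writer at a time
            while (buffer.hasRemaining()) {      // loop in case of a partial read
                if (channel.read(buffer, offset + buffer.position()) < 0) {
                    throw new EOFException("record " + recNo + " is truncated");
                }
            }
        }
        return buffer.array();
    }

    void writeRecord(long recNo, byte[] data) throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        long offset = recNo * RECORD_LENGTH;
        synchronized (channel) {
            while (buffer.hasRemaining()) {      // loop in case of a partial write
                channel.write(buffer, offset + buffer.position());
            }
        }
    }
}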


It's simple and easy, but it adds the spectre of nested locks. If you're a safe-threading advocate, as I am, you tremble at the mere thought of nested locks.


Note that I'm still putting those reads and writes in loops. That's because of the other problem with FileChannel, mentioned in several other threads above. Even if reads / writes are atomic, there's no actual guarantee (that I can find) that a read or write will actually use all the bytes we expect.
That is, if there's still space in a buffer, and the file still has unread bytes, a read() may nonetheless return prematurely, without filling the buffer. At least, the API seems to imply this, and offers no guarantee otherwise.


This is an incorrect statement, as shown at http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/ReadableByteChannel.html. Only one of the read methods is susceptible to this (potentially; it's arguable that it's not so susceptible).


Now, I haven't been able to actually observe this problem in testing. Maybe it never actually comes up. I know it comes up with FileInputStreams fairly often, so I tend to assume it's possible here unless guaranteed otherwise. And I would argue that if partial reads/writes are possible, then they actually destroy the safety of just-in-time FileChannels. It doesn't matter if each read() or write() is atomic, if it's possible for the method to complete without delivering all the intended bytes. You can put the method in a loop as I do above - but without synchronization, there's no way to prevent interruption between loop iterations.


Partial writes are not possible, and only one of the read methods fails to explicitly guarantee safety.


And even with synchronization, if you're using just-in-time FileChannels then each FileChannel is a separate instance. You'd need to sync on some shared monitor - a static variable or some such.
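For instance, something like this - the class and field names are invented for the example:

import java.io.EOFException;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class JustInTimeChannelAccess {

    // The only thing the short-lived channels all have in common,
    // so it's the only thing left to synchronize on.
    private static final Object FILE_LOCK = new Object();

    static byte[] readRecord(File dbFile, long offset, int length) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(length);
        synchronized (FILE_LOCK) {
            RandomAccessFile raf = new RandomAccessFile(dbFile, "r");
            try {
                FileChannel channel = raf.getChannel(); // created just for this call
                while (buffer.hasRemaining()) {         // loop in case of a partial read
                    if (channel.read(buffer, offset + buffer.position()) < 0) {
                        throw new EOFException("unexpected end of file");
                    }
                }
            } finally {
                raf.close(); // closes the channel as well
            }
        }
        return buffer.array();
    }
}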


Again, I believe this is incorrect and misleading. Since writes block until complete, there's no need for an extraneous lock.



In contrast, if you have one FileChannel shared by many threads, you can simply use synchronization as I do above to guarantee that each read or write will be uninterrupted. Synchronization is not the complex beast it's often made out to be if you just use it. It's true that in some situations synchronization can be a performance bottleneck. But as has been frequently observed here, performance need not be a significant concern for this assignment.
I acknowledge that most of my concerns here are very unlikely to manifest as observable problems in this assignment. And if someone wants to bypass synchronization because they consider the risks to be acceptably low for their purposes, that's fine, as an informed decision.


Well, let's make sure we've considered all the details here. All writes, and 3/4 of the reads, act atomically. So if you know your API, there's no reason to extraneously synchronize.
Remember, synchronization is not a magic bullet. It can cause more problems than it solves. If you're nesting locks, and one gets swallowed, then you're bankrupt.


But if people believe that there are actual guarantees that these problems cannot occur, that's what I disagree with.


There are guarantees, Jim, you just have to read the fine, fine print
All best,
M
[ July 24, 2003: Message edited by: Max Habibi ]
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Bob Reeves:
Hi All:
Not another post on the FileChannel! Sorry, but I want to add one detail.
Vlad's comment about whether writing three bytes to the start of a long file might not cause a block seems reasonable. Not to soapbox, but perhaps it's not good to microread text that's not professionally written. The javadoc for FileChannel says something like any operation that can change the file length will block. It doesn't say "any operation that changes the file" will block. Maybe this is the emphasis the author intended. The purpose might be to increase the performance of write operations by permitting concurrency.
Tx


But the problem with Vlad's point is that the compiler would have to examine your code, and know that you're only doing 3 bytes. Since it can't reasonably be expected to do this, it has to adhere to the specification. That doesn't mean that all methods always do so, but it's important to be clear that the method can't pick and choose when to adhere to the spec and when not to.
Also, remember that the Javadoc was written by engineers (good ones) for engineers. I'm guessing that when they say that something functions a certain way, they really believe that it does.
M
[ July 24, 2003: Message edited by: Max Habibi ]
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Vlad Rabkin:
Hi Max,
Hi Jim,
[Max] You'll notice that FileChannel.write(byte[],position) can change file size.
Well, I thought about that 100 times, but what does it exactly mean???
Let's say I write with FileChannel.write(bytebuffer, position), where bytebuffer has only 3 bytes, at the beginning of the file (so I am sure that this operation will not change the size of the file). Theoretically each write can change the size of the file, but in practice this one will NOT. So, is atomicity guaranteed or not???
Vlad


Hi Vlad,
I'm glad you're getting so into this. The atomicity is guaranteed, because the spec says that any operation that may change file size is atomic. The compiler can't examine your code and figure out that you're not really going to be changing file size. When you ask for a writable FileChannel, and you call write on it, it blocks. That's what the language specifications say, anyway. And if you trust the language specs written by Sun, then Sun can't very well fault you if the specs go wrong.
Or think of it this way: if you've never seen this fail (and I'll bet that you haven't), then there's no reason to assume that the spec is anything other than correct. So then, the only discussion point is: does the spec say that it's not atomic? I don't believe so. However, if you see a place where it seems to say so, then you shouldn't use it. Does that seem reasonable?
btw - I tend to take a hyper-mathematical/engineering view of these sorts of things. My father is a Ph.D. in math, so logical consistency is really important to me (typical dinner conversation around the Habibi household was proving that 1 was not equal to 0). Thus, I'm somewhat overzealous in my logical premisemanship.
M
 
Bob Reeves
Ranch Hand
Posts: 64
Hi Max:
This responds to your comment:

But the problem with Vlad's point is that the compiler would have to examine your code, and know that you're only doing 3 bytes. Since it can't reasonably be expected to do this, it has to adhere to the specification. That doesn't mean that all methods always do so, but it's important to be clear that the method can't pick and choose when to adhere to the spec and when not to.


The motivation for my post was a decompile of FileChannelImpl, which is the type of the object returned from RandomAccessFile's getChannel.
The decompile of FileChannelImpl shows no embedded JVM monitor-enter/monitor-exit calls in any of the write methods. But all the position methods have them. Now FileChannelImpl calls IOUtil and FileDispatcher. Again, neither of these contains monitor calls in their write methods.
So, I think you can see why I conclude that the block on write must be in the native code. I'd also opine that native code protects itself on block (i.e. so that its file-on-disk information remains valid), not the user. Thus my conclusion.
Notice there is no customization of the compiled code for short buffers. Of course, the compiler couldn't do that! I think the native code looks at the buffer length, and branches according to file length. (Actually, the only time it must block is if the truncate method is called after the write. I'm guessing truncate isn't called frequently, so it might make sense to optimize this way.)
Tx
 
Vlad Rabkin
Ranch Hand
Posts: 555
Hi,
I've decided to dig into nio, but I synchronize read and write explicitly. I hope not to be penalized for it.
Could anyone comment on my code? It is my first attempt to do it - it works, but I am not sure
the code is good (e.g. I don't know if I should use the force() method):
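Roughly, the shape of it is like this - a simplified sketch, where the record length and all the names are made up for the post rather than copied from my real class:

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class Data {

    private static final int RECORD_LENGTH = 160; // illustrative only
    private final FileChannel channel;

    public Data(String dbPath) throws IOException {
        channel = new RandomAccessFile(dbPath, "rw").getChannel();
    }

    public synchronized byte[] readRecord(long recNo) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(RECORD_LENGTH);
        long offset = recNo * RECORD_LENGTH;
        while (buffer.hasRemaining()) {
            if (channel.read(buffer, offset + buffer.position()) < 0) {
                break; // end of file reached
            }
        }
        return buffer.array();
    }

    public synchronized void writeRecord(long recNo, byte[] record) throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(record);
        long offset = recNo * RECORD_LENGTH;
        while (buffer.hasRemaining()) {
            channel.write(buffer, offset + buffer.position());
        }
        channel.force(false); // the part I'm unsure about: flush to disk after every write?
    }
}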



Regards,
Vlad
 
Max Habibi
town drunk
( and author)
Posts: 4118

Originally posted by Bob Reeves:
Hi Max:
This responds to your comment:

The motivation for my post was a decompile of FileChannelImpl, which is the type of the object returned from RandomAccessFile's getChannel.
The decompile of FileChannelImpl shows no embedded JVM monitor-enter/monitor-exit calls in any of the write methods. But all the position methods have them. Now FileChannelImpl calls IOUtil and FileDispatcher. Again, neither of these contains monitor calls in their write methods.
So, I think you can see why I conclude that the block on write must be in the native code. I'd also opine that native code protects itself on block (i.e. so that its file-on-disk information remains valid), not the user. Thus my conclusion.
Notice there is no customization of the compiled code for short buffers. Of course, the compiler couldn't do that! I think the native code looks at the buffer length, and branches according to file length. (Actually, the only time it must block is if the truncate method is called after the write. I'm guessing truncate isn't called frequently, so it might make sense to optimize this way.)
Tx



It might very well be in the native code: however, the compiler, when dealing with a method like FileChannel.write(ByteBuffer src, long position), can't know beforehand what the size of the ByteBuffer will be, nor can it know, just because you allocated a ByteBuffer, that you'll be writing it. For that matter, in order to know what the actual size of the file was before it started writing, the FileChannel would have to synchronize on the file (in case another FileChannel is writing to or deleting from it), in order to make the sort of claim that the language specs make about writes being atomic: there is no other way for it to make that claim. Thus, the logical sequence must be either
1. lock down writes
2. read current file size
3. write content
4. release lock
or
1. lock down writes
2. read current file size
3. (since this particular write doesn't change the size of the file)
4. release lock
5. write content

Now, that latter example, while it might be optimized, still allows another thread to theoretically sneak in, write over the intended area, and sneak out (between steps 4 and 5). Thus, it allows a hole in which writes are not thread safe. I'm pretty sure that if I can pick this up, the smart fellows at Sun can too
Thus, the FileChannel is forced to lock on writes, since it guarantees exclusivity. Now, whether it does so in native code or not is immaterial. It's still obligated to do so, or be in violation of the spec.
All best,
M
 