
B&S: FileChannel, force(), dirty reads, synchronized with physical file

 
Michal Charemza
Ranch Hand
Posts: 86
Hi all,

I'm trying to design my locking/file classes, and I'm trying to avoid the possibility of dirty reads. I have looked through other threads and at the Java API, specifically at the [URL=http://java.sun.com/j2se/1.4.2/docs/api/java/nio/channels/FileChannel.html]FileChannel[/URL] class and the WritableByteChannel interface. However, I am unable to answer a few questions.


  • I can avoid the possibility of booking a changed record by checking, after locking the record, that it still matches what it was when the user chose to book it.


  • What about this scenario: one thread starts to read a record, stopping half-way through; another thread updates the record; and then the first thread completes reading the record, ending up with half the old record and half the new record, or possibly garbled data. Is this an example of a dirty read?

    I've looked at the FileChannel doc in the API. Because I'm multithreading, and plan to use one FileChannel object, I think I need to use absolute positioning in all my read/writes. This is because if I set the position of a channel before writing, another thread can cut in and change it to something else.

    However, since the API says absolutely positioned reads/writes can happen concurrently, the garbled-data situation above could still occur. Can this be avoided without a read-write-lock system, and without caching the whole database in memory?

    I've thought about this question, and I can come up with one answer that may work, though it seems a bit complicated for this project:

    Whenever a record is written, the original record is cached first. Whenever a client reads a record, it would never know whether it is actually coming straight from the hard drive or from a cached copy.

  • The existence of the force() method in FileChannel seems to suggest that somehow the "contents" of the FileChannel may not be synchronized with what is actually in the physical file on the hard drive.


  • When write() is called, at exactly which point is the written data available to read() calls in other threads (assuming absolute positioning)? When the write call returns? Or is it "bit by bit", i.e. a little bit is written, then a read call may get that little bit, a little bit more is written, and another read call may get that little bit more, and so on?

    Perhaps just a link to a thread/web page that explains these things would be very helpful.


    Any thoughts would be gratefully received.

    Phew! That's it. I realise this is a long post. If it is too long, please tell me and I will try to restrict the length (more) in future.

    Michal

    [ August 31, 2004: Message edited by: Michal Charemza ]
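The absolute-positioning idea from the post above can be sketched roughly like this (the class name, method names, and record length are assumptions for illustration, not from the assignment): every call passes an explicit file position to the channel, so no thread depends on, or disturbs, the channel's shared position.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Sketch only: record length and names are assumed, not from the spec.
// All I/O uses absolutely positioned read/write, so the channel's own
// position is never touched by these methods.
public class RecordIO {

    private static final int RECORD_LENGTH = 100; // assumed record size
    private final FileChannel channel;

    public RecordIO(FileChannel channel) {
        this.channel = channel;
    }

    public byte[] readRecord(int recNo) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_LENGTH);
        long start = (long) recNo * RECORD_LENGTH;
        while (buf.hasRemaining()) {
            // absolute read: position is passed explicitly each call
            if (channel.read(buf, start + buf.position()) < 0) {
                break; // hit end of file
            }
        }
        return buf.array();
    }

    public void writeRecord(int recNo, byte[] data) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(data);
        long start = (long) recNo * RECORD_LENGTH;
        while (buf.hasRemaining()) {
            // absolute write, for the same reason
            channel.write(buf, start + buf.position());
        }
    }
}
```

Note that, exactly as the post observes, this only protects the channel's position: two absolutely positioned calls touching the same record can still interleave, so by itself it does not prevent the half-old/half-new read described above.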
     
    Paul Bourdeaux
    Ranch Hand
    Posts: 783
    Hi Michal,

    What about this scenario: one thread starts to read a record, stopping half-way through; another thread updates the record; and then the first thread completes reading the record, ending up with half the old record and half the new record, or possibly garbled data. Is this an example of a dirty read?

    The general idea of logically locking a record is to ensure that it cannot be modified in the middle of an atomic operation, such as a read. Don't confuse the locking with synchronization - they serve two different purposes. (It's easy to do, trust me :roll: )

    Check out this link for some more about locking:
    http://www.coderanch.com/t/186059/java-developer-SCJD/certification/Locking
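    The "logical locking" idea described above might look roughly like the following sketch (all names are illustrative, not from the assignment): lock() blocks until no other client holds the record, and unlock() wakes the waiters.

```java
import java.util.HashMap;
import java.util.Map;

// Rough sketch of a logical record lock. The lock lives in a map in
// memory, not in the file, which is why it is a separate concern from
// the synchronization that protects the physical reads and writes.
public class LockManager {

    private final Map locks = new HashMap(); // recNo -> owning client

    public synchronized void lock(int recNo, Object owner)
            throws InterruptedException {
        Integer key = new Integer(recNo);
        while (locks.containsKey(key)) {
            wait(); // block until the current holder calls unlock()
        }
        locks.put(key, owner);
    }

    public synchronized void unlock(int recNo, Object owner) {
        Integer key = new Integer(recNo);
        if (owner.equals(locks.get(key))) {
            locks.remove(key);
            notifyAll(); // wake every thread waiting for a lock
        }
    }

    public synchronized boolean isLocked(int recNo) {
        return locks.containsKey(new Integer(recNo));
    }
}
```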

    [ August 31, 2004: Message edited by: Paul Bourdeaux ]

     
    Anton Golovin
    Ranch Hand
    Posts: 530
    I am not sure if I am posting too much code, but here is my read method:



    Here's the method which avoids reading outdated data:



    Here's the method which checks if a record is locked:



    When I read a record from cache, I make sure it is not locked. When I read a record from the data file before caching it, I always synchronize on the writer RandomAccessFile.
    [ August 31, 2004: Message edited by: Anton Golovin ]
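    Anton's code blocks did not survive in this thread, so the following is only a guess at the shape of the approach he describes: serve reads from a cache where possible, and synchronize on the writer RandomAccessFile when a record has to be read from disk before being cached. Every name and the record length are assumptions, not his actual code.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.HashMap;
import java.util.Map;

// Guessed reconstruction of the cache-plus-synchronized-file pattern.
public class CachingData {

    private static final int RECORD_LENGTH = 100; // assumed record size
    private final RandomAccessFile file;
    private final Map cache = new HashMap();      // recNo -> byte[]

    public CachingData(RandomAccessFile file) {
        this.file = file;
    }

    public byte[] read(int recNo) throws IOException {
        Integer key = new Integer(recNo);
        synchronized (cache) {
            byte[] cached = (byte[]) cache.get(key);
            if (cached != null) {
                return cached; // cache hit: no disk access needed
            }
        }
        byte[] data = new byte[RECORD_LENGTH];
        synchronized (file) { // no write can interleave with this read
            file.seek((long) recNo * RECORD_LENGTH);
            file.readFully(data);
        }
        synchronized (cache) {
            cache.put(key, data);
        }
        return data;
    }
}
```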
     
    Michal Charemza
    Ranch Hand
    Posts: 86
    Hi Paul

    Originally posted by Paul Bourdeaux:
    The general idea of logically locking a record is to ensure that it cannot be modified in the middle of an atomic operation, such as a read.


    I thought that a read is only an essentially atomic operation if read- (and write-) locking is implemented - am I mistaken? That is, a client can only read from a record if it owns the lock on that record. I don't think I wish to implement a read- (and write-) lock, only a write-lock. My spec doesn't ask for it, and I think it would greatly reduce concurrency (I think that's the word I mean) - clients would then not be able to read the same records at the same time.

    I would like to know if there are any ways to avoid the "garbled" data situation without implementing a read-lock, and without caching the whole database in memory.

    Michal
     
    peter wooster
    Ranch Hand
    Posts: 1033

    What about this scenario: one thread starts to read a record, stopping half-way through; another thread updates the record; and then the first thread completes reading the record, ending up with half the old record and half the new record, or possibly garbled data. Is this an example of a dirty read?


    The usual means of avoiding dirty reads is to synchronize all file access methods. This basically means that your file access is single threaded.
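    A minimal sketch of that suggestion, with assumed names and record size: every file access goes through a synchronized method on one shared object, so a read and a write can never interleave.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Sketch only: all file access is serialized through this one object,
// so no write can overlap a read (and vice versa). Single threaded at
// the file level, as the post says.
public class SyncData {

    private static final int RECORD_LENGTH = 100; // assumed record size
    private final RandomAccessFile file;

    public SyncData(RandomAccessFile file) {
        this.file = file;
    }

    public synchronized byte[] read(int recNo) throws IOException {
        file.seek((long) recNo * RECORD_LENGTH);
        byte[] data = new byte[RECORD_LENGTH];
        file.readFully(data);
        return data;
    }

    public synchronized void write(int recNo, byte[] data)
            throws IOException {
        file.seek((long) recNo * RECORD_LENGTH);
        file.write(data);
    }
}
```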
     
    Michal Charemza
    Ranch Hand
    Posts: 86
    Hi Anton,

    This is what I understand from your code:

    Your code doesn't guarantee that a write will not be taking place while a read is in progress; it tries to avoid it by making sure there is no write lock on a record before it moves into the main part of the read method (and by that I mean the "if" statement onwards). It's not guaranteed because once execution has moved into the main part of the read method, other threads are free to get the write lock on the record and write to it before your read method completes.

    Is this an accurate understanding?

    If it is, then I'm not sure if I would do it in my project. I don't think I like using behaviour that "tries" to do something, and is not guaranteed to do it. I don't know if it is good/bad Java programming practice, but it just doesn't sit right with me.


    Michal
     
    Michal Charemza
    Ranch Hand
    Posts: 86
    Hi Peter

    Originally posted by peter wooster:
    The usual means of avoiding dirty reads is to synchronize all file access methods. This basically means that your file access is single threaded.


    Do you know what? I didn't think of this. However... is this a good idea? Shouldn't concurrent reads be allowed? Also my spec says "your server must be capable of handling multiple concurrent requests". Does having synchronized file access violate that?

    Michal
     
    Michal Charemza
    Ranch Hand
    Posts: 86
    Hi Anton,

    I missed the "synchronized" keyword in your method. So my above post is very wrong, assuming that your write method is also synchronized.

    I think I do now see that your code does guarantee that a write will not happen in the middle of a read, or vice versa.

    However, this does seem to also say that concurrent reads will not take place... is there a way around this?

    Also... I have now forgotten the point of write-locking in the first place. Is it to verify data is what you expect it to be before you change it? No answer is fine to this question; I imagine I will find it in another thread. But I will do that tomorrow... I need sleep.

    Michal
     
    peter wooster
    Ranch Hand
    Posts: 1033
    Originally posted by Michal Charemza:
    Hi Anton,

    I missed the "synchronized" keyword in your method. So my above post is very wrong, assuming that your write method is also synchronized.

    I think I do now see that your code does guarantee that a write will not happen in the middle of a read, or vice versa.

    However, this does seem to also say that concurrent reads will not take place... is there a way around this?

    Also... I have now forgotten the point of write-locking in the first place. Is it to verify data is what you expect it to be before you change it? No answer is fine to this question; I imagine I will find it in another thread. But I will do that tomorrow... I need sleep.

    Michal


    Most reads will come out of file system buffers, so this is not a problem. The file system is likely to keep something like a LRU cache of buffers that probably contains the whole file on any system that isn't very busy. The reads will be very fast most of the time. If the system is busy the most popular stuff will be in the buffers.
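    As an aside, the LRU ("least recently used") policy mentioned above is easy to illustrate in Java itself: a LinkedHashMap in access order evicts whichever entry has gone unused longest. The capacity here is arbitrary, purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Tiny LRU cache: LinkedHashMap in access order keeps the most
// recently used entries and drops the eldest once over capacity.
public class LruCache extends LinkedHashMap {

    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // true = access order: gets count as "use"
        this.capacity = capacity;
    }

    protected boolean removeEldestEntry(Map.Entry eldest) {
        return size() > capacity; // evict the least recently used entry
    }
}
```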
     