
threads writing to same file

 
Marlene Miller
Ranch Hand
Posts: 1392
Suppose two threads write to the same file. What could go wrong? Is it possible for any "record" of 8 bytes to be a mixture of data from thread-1 and thread-2?

Since I have no state information, I don't need to synchronize on an object.

In my work-world, I have always assumed, perhaps incorrectly, that records written to a file are queued to a disk process. The disk process manages I/O requests from multiple processes in a safe way.
 
Stefan Wagner
Ranch Hand
Posts: 1923
Linux Postgres Database Scala
I don't know whether it is safe, but if you use a queue to buffer the records and write them out from the queue, it should be easy to synchronize the queue, shouldn't it?

An aggressive test could give you a hint too, though of course only for your platform.
 
Joe Ess
Bartender
Posts: 9361
Linux Mac OS X Windows
Originally posted by Marlene Miller:

In my work-world, I have always assumed, perhaps incorrectly, that records written to a file are queued to a disk process. The disk process manages I/O requests from multiple processes in a safe way.


Your assumption ignores the fact that you are working within the Java Virtual Machine. Even if disk access were serialized from the OS's point of view, the VM may suspend one of its threads partway through writing to the OS's interface and let another thread start writing. You are depending on a contract that does not exist in the Java API. If it did, one would think that the write methods on RandomAccessFile would be declared synchronized to indicate that they are thread safe.
There are very few operations that are guaranteed to be atomic (a single indivisible action within the VM) and therefore thread safe. Reading or writing a 64-bit long or double that is not declared volatile, for example, is not guaranteed to be atomic: the Java Language Specification allows a VM to split it into two 32-bit operations. Moving an array of bytes is almost certainly not atomic.
If you want to safely access a file from more than one thread, you would be wise to wrap the file access in synchronized methods. Stefan's suggestion makes sense as well, since using a queue frees your threads from waiting on each other and on the actual write overhead.
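
A minimal sketch of the synchronized-wrapper idea, offered as an illustration and not as anything posted in this thread. The class names SynchronizedRecordWriter and SynchronizedRecordWriterDemo are hypothetical, and the 8-byte record size is taken from the original question. The lock covers both the seek and the write, so a whole record goes out before another thread can start its own.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Arrays;

/** Hypothetical wrapper: serializes whole-record writes to one shared file. */
final class SynchronizedRecordWriter implements AutoCloseable {
    static final int RECORD_SIZE = 8; // assumption: record size from the question

    private final RandomAccessFile file;

    SynchronizedRecordWriter(File target) throws IOException {
        this.file = new RandomAccessFile(target, "rw");
    }

    /** Appends one complete record; the lock covers the seek and the write. */
    synchronized void writeRecord(byte[] record) throws IOException {
        if (record.length != RECORD_SIZE) {
            throw new IllegalArgumentException("record must be " + RECORD_SIZE + " bytes");
        }
        file.seek(file.length());
        file.write(record);
    }

    @Override
    public synchronized void close() throws IOException {
        file.close();
    }
}

final class SynchronizedRecordWriterDemo {
    /** Runs the writer threads, then counts records that mix bytes from different threads. */
    static int countMixedRecords(int threads, int recordsPerThread) {
        final int size = SynchronizedRecordWriter.RECORD_SIZE;
        try {
            File target = File.createTempFile("records", ".dat");
            target.deleteOnExit();
            try (SynchronizedRecordWriter writer = new SynchronizedRecordWriter(target)) {
                Thread[] workers = new Thread[threads];
                for (int t = 0; t < threads; t++) {
                    byte[] record = new byte[size];
                    Arrays.fill(record, (byte) (t + 1)); // every byte names its thread
                    workers[t] = new Thread(() -> {
                        try {
                            for (int i = 0; i < recordsPerThread; i++) {
                                writer.writeRecord(record);
                            }
                        } catch (IOException e) {
                            throw new UncheckedIOException(e);
                        }
                    });
                    workers[t].start();
                }
                for (Thread worker : workers) {
                    worker.join();
                }
            }
            byte[] data = Files.readAllBytes(target.toPath());
            if (data.length != threads * recordsPerThread * size) {
                throw new IllegalStateException("unexpected file length " + data.length);
            }
            int mixed = 0;
            for (int i = 0; i < data.length; i += size) {
                for (int j = 1; j < size; j++) {
                    if (data[i + j] != data[i]) {
                        mixed++;
                        break;
                    }
                }
            }
            return mixed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The cost of this approach is that every writing thread waits while another thread holds the lock, which is the trade-off the queue suggestion avoids.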
 
Marlene Miller
Ranch Hand
Posts: 1392
Thank you very much, Stefan and Joe. What you have said makes sense. The byte is the unit of atomicity. It's not reasonable to ask one thread to wait for another thread to write a whole record. A solution is the queue.

Thank you. Marlene
 
VijayRawat Rawat
Greenhorn
Posts: 9
Hi All,
I have the same problem: multiple threads will be writing to the same file. I don't want the data to be garbled; rather, I want each thread to write its data sequentially to the same file. I am creating wrapper classes, but I am running into some kind of problem with my approach. I saw the word QUEUE mentioned here. Can anyone shed some light on this approach? It would be great.

Thanks in advance
Vijay Rawat
 
David Harkness
Ranch Hand
Posts: 1646
Originally posted by VijayRawat Rawat:
I saw the word QUEUE mentioned here. Can anyone shed some light on this approach? It would be great.

A queue is a standard data structure that works exactly as it sounds. It is a list with the contract that items are placed on one end and taken from the other.

Imagine a single line of people waiting in a bank for 3 teller windows. People come in from multiple doors and walk to the back of the line. When a teller becomes free, the person at the front of the line moves to the teller. In Marlene's case above, there would be only one teller (a thread writing records sequentially to the file) and multiple doors (the threads writing records to the queue).
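
The single-teller arrangement can be sketched roughly as follows. This is an illustration under assumptions, not code from the thread: QueuedRecordWriter and QueuedRecordWriterDemo are hypothetical names, the queue is java.util.concurrent.LinkedBlockingQueue, an empty array is used as a stop signal, and a ByteArrayOutputStream stands in for the file so the result is easy to inspect.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical "single teller": one thread drains the queue and writes records in order. */
final class QueuedRecordWriter {
    private static final byte[] STOP = new byte[0]; // sentinel telling the writer thread to finish

    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
    private final Thread writerThread;

    QueuedRecordWriter(OutputStream out) {
        writerThread = new Thread(() -> {
            try {
                // Only this thread touches `out`, so each record is written whole.
                for (byte[] record = queue.take(); record != STOP; record = queue.take()) {
                    out.write(record);
                }
                out.flush();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "record-writer");
        writerThread.start();
    }

    /** Called by producer threads (the "doors"); the queue does the synchronization. */
    void submit(byte[] record) throws InterruptedException {
        queue.put(record.clone()); // copy so later changes by the caller cannot alter the record
    }

    /** Call after all producers are done: queues the sentinel and waits for the writer. */
    void shutdown() throws InterruptedException {
        queue.put(STOP);
        writerThread.join();
    }
}

final class QueuedRecordWriterDemo {
    /** Runs several producers and returns the bytes the single writer thread produced. */
    static byte[] run(int threads, int recordsPerThread) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the file
        QueuedRecordWriter writer = new QueuedRecordWriter(sink);
        try {
            Thread[] producers = new Thread[threads];
            for (int t = 0; t < threads; t++) {
                byte id = (byte) (t + 1);
                producers[t] = new Thread(() -> {
                    byte[] record = new byte[8];
                    Arrays.fill(record, id); // every byte names its thread
                    try {
                        for (int i = 0; i < recordsPerThread; i++) {
                            writer.submit(record);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                producers[t].start();
            }
            for (Thread producer : producers) {
                producer.join();
            }
            writer.shutdown();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sink.toByteArray();
    }
}
```

Records from different threads may appear in any order relative to each other, but each 8-byte record stays intact because only the writer thread touches the output.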

Note, however, that queues are not specific to multithreading; they simply turn out to be quite useful. For a good set of thread-safe and well-tested implementations of queues and other related classes (semaphores, locks, thread pools, etc.), see Doug Lea's Concurrent library. These classes are being included in J2SE 1.5 as java.util.concurrent and are quite handy. Doug is also the author of Concurrent Programming in Java.

One thing to consider, Marlene, is that the file will throttle your queue-processing thread, but the threads writing to the queue will not be bound by file I/O. This means that if your threads produce a large number of records very quickly, the queue will happily grow to consume available memory. You can use a BoundedLinkedQueue that will make writing threads wait when it is at capacity if this is a concern.
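
To illustrate the bounded behaviour, here is a small sketch that uses java.util.concurrent.ArrayBlockingQueue as a stand-in for BoundedLinkedQueue; that substitution is an assumption, as are the class name BoundedQueueDemo and the 1 ms sleep that simulates slow file I/O. The first method shows that a full queue refuses further records, and the second shows that put() makes the producer wait and drops nothing.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class BoundedQueueDemo {
    /** With no consumer running, returns how many of `attempts` records the queue accepts. */
    static int fillWithoutConsumer(int capacity, int attempts) {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            // offer() returns false when the queue is full instead of blocking;
            // put() would make the producer wait here until a slot is freed.
            if (queue.offer(new byte[8])) {
                accepted++;
            }
        }
        return accepted;
    }

    /** A producer using put() against a slow consumer; returns how many records were consumed. */
    static int produceWithSlowConsumer(int capacity, int records) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int[] consumed = {0};
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < records; i++) {
                    queue.take();
                    consumed[0]++;
                    Thread.sleep(1); // assumption: simulates slow file I/O
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            for (int i = 0; i < records; i++) {
                queue.put(i); // blocks while the queue is at capacity
            }
            consumer.join(); // join makes the consumer's count visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return consumed[0];
    }
}
```

With put(), memory use is capped at the queue capacity and the producing threads slow down to the speed of the writer, which is the throttling described above.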
 
Marlene Miller
Ranch Hand
Posts: 1392
Thank you David for the suggestion to use a BoundedLinkedQueue.

VijayRawat,

Here is an example of threads, a queue and writing to somewhere from The Java Programming Language. Doug Lea's utilities are more appropriate for real work.

[ August 19, 2004: Message edited by: Marlene Miller ]
 