
RandomAccessFile.length() causes problems when multithreading

 
Ronald Wouters
Ranch Hand
Posts: 190
Hi all,

I would like to share with you and get your opinion on a strange problem
that I encountered during multithreading tests.

In my Data.read method I HAD the following:


During my multithreading junit tests I discovered the following problem:

ConcurrentDataMTTest.setUp
Oct 12, 2005 6:51:55 AM suncertify.db.Data <init>
INFO: Database opened
MTT-2 error reading record name=Pandemonium location=Metropolis --> record 6 instead of 10
MTT-1 error reading record name=Grandview location=Digitopolis --> record 11 instead of 5
Testing ConcurrentDataMTTest (testRead)
Oct 12, 2005 6:52:05 AM suncertify.db.Data closeDatabase
INFO: Database closed
Thread [MTT-1] failed 1 out of 230696 runs in 10 seconds, averaging 23069 runs per second.
Thread [MTT-2] failed 1 out of 186537 runs in 10 seconds, averaging 18653 runs per second.
ConcurrentDataMTTest.tearDown

junit.framework.AssertionFailedError
at suncertify.db.ConcurrentDataMTTest.testRead(ConcurrentDataMTTest.java:127)

As you can see from the JUnit test results above, about once in every 200,000 reads
Data.read returned the wrong record!
The above results are from my Linux (IA32) workstation (RHEL 4 WS U2, java version "1.5.0_05").
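
For readers who have not seen this kind of test, here is a rough sketch of its shape using plain threads. It is not MTJUnit and not the test from this post: the class name, the 8-byte record size and the "every byte of record N equals N" file layout are assumptions made up for illustration. Each thread reads its own record over and over and counts a failure whenever the bytes that come back belong to a different record.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentReadCheck {
    static final int RECORD_LENGTH = 8; // hypothetical fixed record size

    /** Creates a temp file in which every byte of record N has the value N. */
    public static RandomAccessFile createTestFile(int records) throws IOException {
        File file = File.createTempFile("records", ".db");
        file.deleteOnExit();
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        for (int recNo = 0; recNo < records; recNo++) {
            byte[] record = new byte[RECORD_LENGTH];
            Arrays.fill(record, (byte) recNo);
            raf.write(record);
        }
        return raf;
    }

    /** Starts one thread per record; returns the total number of wrong or failed reads. */
    public static int run(final RandomAccessFile raf, int threads, final int runsPerThread)
            throws InterruptedException {
        final AtomicInteger failures = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int recNo = t;
            workers[t] = new Thread(() -> {
                byte[] buffer = new byte[RECORD_LENGTH];
                for (int i = 0; i < runsPerThread; i++) {
                    try {
                        // All access to the shared file happens under one lock.
                        synchronized (raf) {
                            raf.seek((long) recNo * RECORD_LENGTH);
                            raf.readFully(buffer);
                        }
                        if (buffer[0] != (byte) recNo) {
                            failures.incrementAndGet(); // got some other record
                        }
                    } catch (IOException e) {
                        failures.incrementAndGet();
                    }
                }
            }, "MTT-" + (t + 1));
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return failures.get();
    }
}
```

With every file access inside the lock, as here, the failure count should stay at zero; moving any call on raf outside the synchronized block is the kind of change such a test is meant to catch.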

Running the same test on a Windows XP workstation was a complete disaster!
I got multiple I/O errors and the MTJUnit tests simply crashed.

After some experimenting I found that by also putting the call to raf.length
inside the synchronized block (see below), all the errors disappeared on both Linux and WinXP.
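
A minimal sketch of that arrangement, not the original code from this post: the class name, the HEADER_LENGTH and RECORD_LENGTH values and the choice of exception are assumptions for illustration only. The one point it is meant to show is that raf.length(), raf.seek() and raf.readFully() all run while holding the same lock, so length() is treated as an operation that may disturb the shared file pointer, which is what the test results above suggest.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class SynchronizedRecordReader {
    // Hypothetical layout values; the real database file uses its own sizes.
    static final int HEADER_LENGTH = 0;
    static final int RECORD_LENGTH = 16;

    private final RandomAccessFile raf;

    public SynchronizedRecordReader(RandomAccessFile raf) {
        this.raf = raf;
    }

    /** Reads record recNo; throws IllegalArgumentException if it lies outside the file. */
    public byte[] read(int recNo) throws IOException {
        long offset = HEADER_LENGTH + (long) recNo * RECORD_LENGTH;
        byte[] buffer = new byte[RECORD_LENGTH];
        synchronized (raf) {
            // length() is called under the same lock as seek() and readFully(),
            // so no other thread can touch the file between these three calls.
            if (recNo < 0 || offset + RECORD_LENGTH > raf.length()) {
                throw new IllegalArgumentException("No such record: " + recNo);
            }
            raf.seek(offset);
            raf.readFully(buffer);
        }
        return buffer;
    }
}
```

This only helps if every other method that uses the same RandomAccessFile synchronizes on the same object.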



ConcurrentDataMTTest.setUp
Oct 13, 2005 5:00:15 AM suncertify.db.Data <init>
INFO: Database opened
Testing ConcurrentDataMTTest (testRead)
Thread [MTT-01] failed 0 out of 2143989 runs in 300 seconds, averaging 7146 runs per second.
Thread [MTT-02] failed 0 out of 236049 runs in 300 seconds, averaging 786 runs per second.
Thread [MTT-03] failed 0 out of 1127535 runs in 300 seconds, averaging 3758 runs per second.
Thread [MTT-04] failed 0 out of 466484 runs in 300 seconds, averaging 1554 runs per second.
Thread [MTT-05] failed 0 out of 900900 runs in 300 seconds, averaging 3003 runs per second.
Thread [MTT-06] failed 0 out of 1372098 runs in 300 seconds, averaging 4573 runs per second.
Thread [MTT-07] failed 0 out of 2068896 runs in 300 seconds, averaging 6896 runs per second.
Thread [MTT-08] failed 0 out of 1134094 runs in 300 seconds, averaging 3780 runs per second.
Oct 13, 2005 5:05:18 AM suncertify.db.Data closeDatabase
INFO: Database closed
Thread [MTT-09] failed 0 out of 3308340 runs in 300 seconds, averaging 11027 runs per second.
Thread [MTT-10] failed 0 out of 1027210 runs in 300 seconds, averaging 3424 runs per second.
ConcurrentDataMTTest.tearDown

The question I have is this:
In the API docs for RandomAccessFile.length there is no mention of the
file pointer. Looking at the source code for RandomAccessFile I can see that it is
a native method:



Does anyone know why calling the length method outside of a synchronized block
would cause the file pointer to get mixed up?

Bottom line and my suggestion to you all:
If you use raf.length, make sure the call is synchronized. Also check out
MTJUnit; it's great and saved me from potentially failing the SCJD.

Regards,
Ronald Wouters
 
Oricio Ocle
Ranch Hand
Posts: 284
Ronald,
thank you for sharing this interesting issue.

Given that raf.length() is doing more than one would assume, don't you think it would be better to keep a counter for the number of records instead of calling raf.length() each time?
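
A rough sketch of what such a counter could look like, under stated assumptions: the class name, the fixed RECORD_LENGTH of 16 bytes and the headerless file layout are hypothetical, and deleted records and space reuse are ignored entirely. raf.length() is called once when the file is opened; afterwards the count lives in memory and is only read or changed while holding the lock on raf.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class CountingRecordFile {
    static final int RECORD_LENGTH = 16; // hypothetical fixed record size

    private final RandomAccessFile raf;
    private int recordCount; // guarded by the lock on raf

    public CountingRecordFile(RandomAccessFile raf) throws IOException {
        this.raf = raf;
        synchronized (raf) {
            // The only call to length(): derive the initial count once.
            this.recordCount = (int) (raf.length() / RECORD_LENGTH);
        }
    }

    /** Answers "does this record exist?" from memory, without touching the file. */
    public boolean exists(int recNo) {
        synchronized (raf) {
            return recNo >= 0 && recNo < recordCount;
        }
    }

    /** Appends a record at the end and returns its record number. */
    public int append(byte[] record) throws IOException {
        if (record.length != RECORD_LENGTH) {
            throw new IllegalArgumentException("Record must be " + RECORD_LENGTH + " bytes");
        }
        synchronized (raf) {
            raf.seek((long) recordCount * RECORD_LENGTH);
            raf.write(record);
            return recordCount++; // count changes only after a successful write
        }
    }
}
```

The counter is only trustworthy if every code path that adds records goes through the same lock.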

Regards
 
Ronald Wouters
Ranch Hand
Posts: 190
Hi Oricio,

the suggestion of maintaining a record counter started me thinking, a lot ...
I finally decided not to implement it but instead did the following:

I made sure that the call to raf.length is made only in the Data.read method and not during searches (Data.find).
During a search it would add too much overhead and could become a performance issue once the database grows large and multiple users are searching at the same time.

Maintaining a record counter would not be simple in itself. It would mean accounting for deleted records and for space reclaimed when creating records, and it would introduce its own set of synchronization issues.

The multithreaded tests I did (see my earlier post) resulted in almost 46,000 invocations of the Data.read method per second, summed over all ten threads!
During normal use of the application this method would be called when a room is being booked, that is, during a lock/read/update/unlock sequence.
If almost 46,000 people were booking a room with URLyBird each second, the CEO of URLyBird would be a very rich man...


Of course, I document all this in my choices.txt.

Regards
 
Ronald Wouters
Ranch Hand
Posts: 190
Hi,

I did some more thinking and came up with a more elegant solution: get rid of the call to raf.length() completely!

I was looking at the method signature for the length method again,

public native long length() throws IOException;

when the oh-so-obvious thing that had been staring me in the face hit me:
the length method throws an IOException, meaning it performs an I/O operation, obviously!

My purpose for having the call to raf.length() in the first place was to avoid unnecessary read operations. The purpose should have been to avoid unnecessary I/O operations...
The obvious point is that the read and length methods are both I/O operations.
So, in an attempt to make things better, I actually made things worse.
When I tried to read a non-existing record, only one I/O operation happened: the call to the length method. Reading an existing record, however, took two I/O operations: the call to length and the call to read!
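
This post does not say how a non-existing record is detected once the length check is gone. One possibility, shown here purely as an assumption, is to let the read itself report it: RandomAccessFile.readFully throws EOFException when the file ends before the buffer is filled. The class name, the 16-byte RECORD_LENGTH and the "return null for a missing record" convention are hypothetical.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;

public class EofRecordReader {
    static final int RECORD_LENGTH = 16; // hypothetical fixed record size

    private final RandomAccessFile raf;

    public EofRecordReader(RandomAccessFile raf) {
        this.raf = raf;
    }

    /** Returns the record bytes, or null if no complete record exists at recNo. */
    public byte[] read(int recNo) throws IOException {
        if (recNo < 0) {
            return null; // seek() would reject a negative offset
        }
        byte[] buffer = new byte[RECORD_LENGTH];
        synchronized (raf) {
            // Seeking past the end of the file is allowed; the read then fails.
            raf.seek((long) recNo * RECORD_LENGTH);
            try {
                raf.readFully(buffer);
            } catch (EOFException e) {
                return null; // ran off the end of the file: no such record
            }
        }
        return buffer;
    }
}
```

Reading an existing record then costs one seek and one read, with no separate call to length().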

Thanks for suggesting the record counter. As you can see, it started quite a brainstorm.

Now I will have to update my choices.txt again, but hey, that's the easy part.

Regards.
 