Yesterday I've sat seriously with my SCJD assignment (URLyBird 1.1.1) and I've got some questions about the locking mechanism.
1. I want to use a Singleton pattern for the Data class,
2. I will create a recordCache (TreeMap) [ recNo => String ] which will read the records during startup of the server and will do CRUD operations on it. This structure will be written into the database flat file at the exit of the application. Perhaps it can be modified to do some kind of checkpoints in the background to prevent lost-edits when application crashes. TreeMap will let me select the first never-used record number that might be used as an insertion point, if no other record is available for reusing.
Data class will use some kind of DataAccessManager to load or commit changes to the flat file.
3. I will use a HashMap as a lockMap which will map RecNo => LockCookie [ Integer => Long ].
4. When I thought about the locking mechanism I draw a pseudocode which creates few other questions like:
4.a In my opinion there is no need to synchronize the read method (neither on the recordCache nor the lockMap) [read 5.b for more details],
4.b Create, delete and update need to be synchronized on both - recordCache and lockMap,
4.c Lock and Unlock methods are synchronized not only on the lockMap but also on the recordCache (to be sure that the record does exist when we create a lock on it; if it doesn't - RNFE is thrown)
4.d When I said that the synchronization is on both maps, then I mean this is a synchronized block inside synchronized block. I'm curious if it isn't a possible flaw in the design, as simple change in the order of synchronization blocks can lead to the deadlock. OK - I remember that I synchronize always first on lockMap and then on recordCache, but isn't it too dangerous for the successor "Junior Java Developer"?
5. I've got a problem when thinking about find method... Basicaly, my business layer consists of two methods - search(String criteria) and book(Integer recNo). The problem I address is the possible data change between getting the ID's (from find method in Data class) and executing read for each ID.
Between these steps, some thread can change the data and while fetching each record detailed data we can get some that doesn't match our criteria:
So we get a set in which one of the hotel's names (record no 5) is not 'MyHotel' which we wanted.
5.a Solution - locking the recordCache while executing Data.find(-) method? A big hit for parallel execution.
5.b Solution - doing second pass record filtering in business layer? OK - it might be necessary anyways, because the Data.find(-) returns "criterion*" matches, while the business layer should return only exact matches ("criterion").
Then again - we are facing the bigger problem here - some records could be changed during our find operation, so right now (when we filter the records in business layer) the database consists of some records that matches the criteria but wasn't found.
I guess that this non-repeatable read matter should not be taken under consideration (or possibly, this assumption should be noticed in project documentation) as I can only think that Solution 5.a would solve this problem.
6. In my locking mechanism I am not using any ReentrantLock, ConcurrentMap or any of the classes in java.util.concurrent, as I think that the mechanism based on synchronized blocks should work fine. Am I making a mistake, and should use rather the java.util.concurrent.* classes to satisfy my assessor? Is the using of Java API more valuable than using mechanisms built in the language itself?
7. When the lock cannot be acquired at the precise moment, I am using the Object.wait() method in a while loop checking if the lock can be acquired. The problem is that I am waiting on the whole lockMap object. The unlock method is notifying all waiting objects for this lockMap (when it is releasing ANY of the locks).
7.a This solution has a drawback, as each unlock operation is notyfing all waiting Threads, even if the released lock wasn't the one they were waiting for.
7.b I was hesitating to lock on a single lockMap's value, as I wasn't sure if it won't be a possible design flaw, when the cookieValue will be created using autoboxing and therefore two different cookies might exist in the lockTable with the same reference (because of the "Long pool"). But I guess if only the developer will remember to create the cookies with new Long(-) method it should be fine - buy then again, this is a potential risk...
7.c Another solution could be to create a LockManager class which will hold the pair - ReentrantLock and lockValue (cookieVal). The original lockMap will be changed from simple [recNo => lockValue] into [recNo => lockManagerEntity]. This will give the ability to notify only those Threads that are waiting for the specific lockedRecord and to change this pseudocode:
I'm not sure which one is more intuitive or if the notifying threads waiting only on the specific record is so important that it's worth of implementing ReentrantLock and complicating a bit the lockMap...
Gosh, that's a one hell of a big post - you should get a reward for getting to this place (of course if you didn't cheat and didn't go here at the first place! :-)
A lot of questions came into my mind but these are the most disturbing ones. Thanks in advance for all replies :-)
2/ I have no idea why you would opt for a TreeMap, no need to sort the records (and the first never-used record seems odd to me). I just used a HashMap.
3/ I used the same approach.
4/ All my methods are marked as synchronized, so no need to do some juggling with synchronized blocks. If you always synchronize in same order, there will be no problem. For the junior developer, just make sure your code is documented (very) well.
5/ I added a find-method to my own interface (which extended the given required one) which returns record numbers and the corresponding records (as a Map<Integer, String>)
6/ I also didn't use the new concurrency api, just simple wait and notifyAll
7/ I implemented my locking as you described, with the drawback you described in 7a
Pedro Kowalski wrote:you should get a reward for getting to this place
AD. 2 - Roel, if you are adding a record, do you reuse the deleted records? And if you don't find any deleted reusable record, are you creating a new record at the end of the file (recordCache) with the lowest possible number? If so, then how do you know what record number should be used? You keep some kind of static variable maxRecNo or do you iterate through whole recordCache finding the max? Perhaps TreeMap is too advanced structure for only the purpose of finding the firest available recordNumber, but that's the only benefit I was interested in.
AD. 4 - If you're methods are all synchronized, and the Data class is a Singleton, than isn't the data access really a serial one rather than parallel? The business layer is multi-threaded, but isn't the whole flow blocked a bit by the serial Data class?
AD. 5 - So when you added the synchronized modifier, you've got your atomic find-ids-fetch-records method, right? Quite nice, as it solves the problem of the data change. Though it is blocking other threads even from concurrent read-operations, right?
I reused deleted entries. When no deleted entries exist anymore, I just use the size of my Map for the next record number. My records are numbered as 0, 1, ... So when I have a map with 30 (non-deleted) records (numbers from 0 - 30), the next record that's created will have 30 as record number.
I agree, my solution is not the most performant one, but there is no requirement about performance. And don't forget: according to the instructions an easy, simple approach is preferred above a more complex but more performant one. And it can't get a lot easier than marking all methods synchronized.
When a thread is searching for records, other threads have to wait until this thread has finished searching for records. That's the same issue as the previous point. And as a side note: i'm using a record cache, so no (slow) file access is needed.