
Why should we prefer synchronization?

 
Sanj Sharma
Ranch Hand
Posts: 32
Why should we prefer synchronization?
 
Ivan Tamayo
Ranch Hand
Posts: 49
In what scenario?
 
Mark Spritzler
ranger
Sheriff
Posts: 17290
Donk, I don't believe your question is a Servlets question. I am going to move this thread to the Java Beginner's Forum.
Mark
 
Dirk Schreckmann
Sheriff
Posts: 7023
Welcome to JavaRanch!
I suppose you mean to ask why some things should be synchronized. Take a look at the Synchronizing Threads section of Sun's Java Tutorial for an introduction to synchronization in Java.
From that page:
[T]here are many interesting situations where separate, concurrently running threads do share data and must consider the state and activities of other threads. One such set of programming situations are known as producer/consumer scenarios where the producer generates a stream of data which then is consumed by a consumer.
For example, imagine a Java application where one thread (the producer) writes data to a file while a second thread (the consumer) reads data from the same file. Or, as you type characters on the keyboard, the producer thread places key events in an event queue and the consumer thread reads the events from the same queue. Both of these examples use concurrent threads that share a common resource: the first shares a file, the second shares an event queue. Because the threads share a common resource, they must be synchronized in some way.

Now, let me suggest that any further questions on synchronization be asked over in our Threads and Synchronization Forum.
 
sever oon
Ranch Hand
Posts: 268
The idea of synchronization only applies when you have multiple threads running and modifying the same data. When you have more than one thread, keep in mind that the operating system (or the JVM) decides when each thread is swapped in and run, and you as the developer have no direct control over it. For example, if you run one thread that accesses a database, and another one that displays a progress meter on the screen, they operate independently and simultaneously. This means the user can see the progress meter changing while the database update is happening.
On the other hand, they must communicate or share data in some way--this means they're going to be editing the same data...possibly at the same time. Let's see why this is a problem.
Let's say you write a President class that two threads are editing:
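A minimal sketch of such a class, assuming the givenName and surname fields and the setName() method that the rest of this post refers to (everything else here is a guess at the shape, not the original listing):

```java
// Hypothetical sketch: field and method names follow the discussion in this post.
class President {
    private String givenName;
    private String surname;

    // Not thread-safe: another thread can run between the two assignments.
    void setName(String givenName, String surname) {
        this.givenName = givenName;
        this.surname = surname;
    }

    String getGivenName() {
        return givenName;
    }

    String getSurname() {
        return surname;
    }
}
```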

Let's say this is part of an application that operates in a classroom. There are several computers running, one on each student's desk, and they are answering random questions about presidents. Whenever a student gets a question right, on the teacher's computer (the server), the lastPresident variable is updated to let the teacher know the president associated with the last correct answer. (A contrived example, but so what. You're learning.)
So, somewhere in the server code running on the teacher's computer, this variable lastPresident is created and each time a student answers a question about a president correctly, that student's computer communicates with the teacher's and updates lastPresident. If student 5 answers correctly about who cut down a cherry tree, the teacher's machine displays "George Washington". If a moment later, student 11 answers about which president wore a stovepipe top hat, the teacher sees "Abraham Lincoln". Let's see what happens if these two students answer at roughly the same time...
On the teacher's computer, two requests come in simultaneously and try to edit the lastPresident variable. Thread 1 (associated with student 5's request) sets the givenName variable to "George". Then it gets swapped out by the processor and thread 2 (associated with student 11's request) is swapped in. It runs to completion, setting givenName to "Abraham" and surname to "Lincoln" and terminating. After that, at some point thread 1 is swapped back in and completes its task, setting surname to "Washington". The problem, of course, is that in the space of a couple of microseconds the teacher's computer now reads: "Abraham Washington". In this example, it's not particularly important which one is considered last, as long as it's one or the other. But what's happened instead, of course, is that the data got corrupted.
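The interleaving described above can be replayed step by step on a single thread, which makes the outcome deterministic. This is only a sketch: LastPresident and InterleavingReplay are made-up names, and the two "threads" are simulated purely by the order of the statements.

```java
// Hypothetical holder for the shared state; fields are left open for the replay.
class LastPresident {
    String givenName;
    String surname;

    String fullName() {
        return givenName + " " + surname;
    }
}

class InterleavingReplay {
    // Performs the writes in the order the scheduler happened to pick.
    static String replay() {
        LastPresident lastPresident = new LastPresident();

        // Thread 1 (student 5) starts its update...
        lastPresident.givenName = "George";

        // ...is swapped out, and thread 2 (student 11) runs to completion.
        lastPresident.givenName = "Abraham";
        lastPresident.surname = "Lincoln";

        // Thread 1 is swapped back in and finishes.
        lastPresident.surname = "Washington";

        return lastPresident.fullName(); // "Abraham Washington"
    }
}
```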
So, the solution is to set up a block of code that is synchronized. Whenever a thread enters a sync block, it must acquire a lock on the object. All threads compete for this lock, and the early bird gets the worm. Once one has it, though, it keeps it until it exits the sync block (in this case, the sync block would cover setting both the givenName and the surname). While thread 1 has the lock, for example, thread 2 would have to acquire the same lock before it could enter the block. It would try, see that the lock was in use, and go to sleep. Once the lock is released by thread 1 (state of the object: "George Washington"), thread 2 picks it up and replaces the name with "Abraham Lincoln", during which time no other thread can acquire the lock. Data integrity is preserved.
This requires that the President class have a sync block as such:
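A sketch of what that might look like, assuming a single setName() method that sets both fields while holding the object's lock. The getFullName() method is my own addition so a reader can see both names under the same lock; it is not something the discussion requires.

```java
// Hypothetical sketch: both fields change inside one synchronized block.
class President {
    private String givenName = "";
    private String surname = "";

    void setName(String givenName, String surname) {
        synchronized (this) {
            // No other thread holding this lock can see a half-finished update.
            this.givenName = givenName;
            this.surname = surname;
        }
    }

    synchronized String getGivenName() {
        return givenName;
    }

    synchronized String getSurname() {
        return surname;
    }

    // Assumed helper: reads both names atomically with respect to setName().
    synchronized String getFullName() {
        return givenName + " " + surname;
    }
}
```

With this version, a thread calling getFullName() while others call setName() should only ever see one complete name or the other.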

This code represents one of the flaws of Java. When you get into learning Java, you'll quickly learn about "getters" and "setters", also called "accessors" and "mutators", respectively. This idea (advanced as part of the JavaBeans specification) is, unbelievably, *incorrect* by all general object-oriented programming methodologies! For example, if you study DbC (Design by Contract), you'll learn about the concept of classes defining their state, and this talks about them providing algorithmic access to data that defines the state of the class. Nowhere is there any discussion requiring that data be *set* in any specific way.
This is where the JavaBeans spec goes wrong. Classes are defined by the state they expose through getters--not at ALL by methods that allow setting data. Data can be set in any fashion at all, and the methods that do this should be no more special than any other operation on the class. In the example code above, you see where the JavaBeans idea of providing setters for each bit of data runs into difficulty--even if you provide a sync'd method for setGivenName() and setSurname(), that's not good enough...a scenario still exists for George Lincoln or Abraham Washington to appear. Only by setting both names in the *same* sync block can you write the class in a thread-safe manner. So, the JavaBeans request for setters is largely meaningless.
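To make the point about per-field setters concrete, here is a sketch with two real threads. The class names are made up, and the CountDownLatch objects are only there to force the unlucky scheduling on every run; in real code the scheduler would produce it occasionally on its own.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical bean-style class: each setter is synchronized on its own.
class BeanStylePresident {
    private String givenName = "";
    private String surname = "";

    synchronized void setGivenName(String givenName) { this.givenName = givenName; }
    synchronized void setSurname(String surname) { this.surname = surname; }
    synchronized String getFullName() { return givenName + " " + surname; }
}

class SetterInterleavingDemo {
    static String run() {
        BeanStylePresident p = new BeanStylePresident();
        CountDownLatch georgeSet = new CountDownLatch(1);
        CountDownLatch lincolnDone = new CountDownLatch(1);

        Thread t1 = new Thread(() -> {
            p.setGivenName("George");
            georgeSet.countDown();
            await(lincolnDone);          // stands in for "thread 1 is swapped out"
            p.setSurname("Washington");
        });
        Thread t2 = new Thread(() -> {
            await(georgeSet);
            p.setGivenName("Abraham");   // the lock is free between t1's two calls
            p.setSurname("Lincoln");
            lincolnDone.countDown();
        });

        t1.start();
        t2.start();
        join(t1);
        join(t2);
        return p.getFullName();          // "Abraham Washington"
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void join(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Each setter is individually safe, but the lock is released between the two calls, which is exactly the gap thread 2 slips through.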
I provide setters wherever they make sense because they are generally useful, but I only ever picture them in my mind's eye as simply another operation on a class. A getter, on the other hand, I treat as a "basis" for the class. That is, all other operations on the class can be defined in terms of getters. For example, the setName() method in the example code above could easily be defined as:
Upon completion of:
setName( x, y ),
( getGivenName().equals(x) && getSurname().equals(y) ) == true
This statement exactly defines the effect of calling setName() on a class in a way that can be captured in a unit test. For example, you could write a class that tests this President class by creating an instance, calling setName() with two Strings x and y, and then assert the exact statement that defines the behavior of setName() on the class.
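As a rough sketch of such a test in plain Java (no test framework assumed), the check below creates an instance, calls setName(), and evaluates the defining statement. President here is a minimal stand-in and SetNameContractCheck is a made-up name.

```java
// Minimal stand-in for the class under test.
class President {
    private String givenName = "";
    private String surname = "";

    synchronized void setName(String givenName, String surname) {
        this.givenName = givenName;
        this.surname = surname;
    }

    synchronized String getGivenName() { return givenName; }
    synchronized String getSurname() { return surname; }
}

class SetNameContractCheck {
    // Returns true when the postcondition of setName(x, y) holds,
    // expressed only in terms of the getters.
    static boolean holdsFor(String x, String y) {
        President p = new President();
        p.setName(x, y);
        return p.getGivenName().equals(x) && p.getSurname().equals(y);
    }
}
```

A test runner would then simply assert SetNameContractCheck.holdsFor("George", "Washington").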
This is highly useful. The JavaBeans idea of forcing you to create an individual setter for each bit of data (givenName and surname here) flies in the face of good OO design in many cases, and complicates the issue by not allowing one to define all methods in terms of a class's getters.
sev
 
sever oon
Ranch Hand
Posts: 268
Oh, one more thing. The introduction of sync blocks into code solves the problem of data corruption, but in doing so introduces a new problem: deadlock. Consider what happens if you design a program in which two separate objects need to be modified in the same sync block. An example is helpful...
I used to work for a company that made MRI scanners (an MRI is like an X-ray machine, but instead of using X-rays to image bone it uses magnetic fields to image soft tissues like muscle and cartilage). The scanner software had to control several pieces of hardware...the scanner itself (which was internally composed of several pieces of independent hardware), a data storage rack (essentially a big rack of extremely dumb hard drives for storing scan data), and a printer so images could be printed and analyzed or given to the patient to take home or whatever. The problem: all of these scanners were running UNIX and could be logged into remotely. In this way, an operator could be scanning a patient while a radiologist could be logged in remotely, print out a completely different patient's scan, and come down later to pick it up for analysis.
Each bit of hardware was represented by an object in the system. So for the printer, there was an instance of a Printer class, for the data storage rack, there was an instance of a Store class, etc. When the operator decided to print a scan, the software would acquire a lock on the printer instance, acquire a lock on the store instance, and that way no other thread (the remote radiologist's, for example) could interfere. Then the data would be transferred to the printer from the data store, it would be queued up, and print. Then the locks would release and the next thread would be able to acquire them and the radiologist could transfer a different scan.
The point is, each time a print occurred, a thread would have to acquire exclusive access to both the printer and the data store (otherwise, if two people printed scans at the same time, a double-size scan would be printed with data randomly intermingled between the two source scans). This is all well and good...but what happens if thread 1 acquires a lock on the printer object, and thread 2 acquires a lock on the data store? Then thread 1 goes to get the lock on the store, and goes to sleep, waiting for it to be released. Thread 2 does the same: it tries to get the lock on the printer, and goes to sleep waiting for it to become available. Both threads will wait forever, each waiting for the other one to release its lock...in other words, deadlock.
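Here is a sketch that reproduces that situation with two plain lock objects standing in for the printer and the store (all names are made up). The threads are daemon threads so the JVM can still exit, and the JDK's ThreadMXBean is used to confirm the deadlock rather than waiting forever.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockDemo {
    // Returns true once the JVM reports threads deadlocked on object monitors.
    static boolean deadlocks() {
        Object printer = new Object();
        Object store = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Thread 1 locks printer then store; thread 2 locks store then printer.
        Thread t1 = new Thread(() -> holdThenTake(printer, store, bothHoldFirst));
        Thread t2 = new Thread(() -> holdThenTake(store, printer, bothHoldFirst));
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (int i = 0; i < 250; i++) {          // poll for up to about 5 seconds
            if (threads.findMonitorDeadlockedThreads() != null) {
                return true;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    private static void holdThenTake(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await();           // wait until each thread holds its first lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Never reached: the other thread holds this lock and wants ours.
            }
        }
    }
}
```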
Now there are a lot of ways to ensure deadlock won't ever happen to you. I'll give you two and let you figure out the myriad other good design techniques you can also use. One is, never call a sync block from within another sync block. If thread 1 acquires a lock on printer, that means it's in a sync block...if it calls another method that acquires a lock on store, that means that method is a sync block on its own...a sync block calling another sync.
This rule is hard to follow because it doesn't have to be a direct call. If you call method a(), which calls b(), which calls c(), and c() calls a synchronized method d(), you've violated this rule even though you might not necessarily know this chain of events was going to unfold. So this is not the preferable approach.
Method 2 is to identify all of the sync blocks in the codebase, and make sure that no thread ever holds exclusive access to more than one object at a time. This way of doing things has a caveat, though--what if you must acquire exclusive access to two objects simultaneously, as in our example (the data store and the printer--after all, you can't transfer data from one to the other if you're only assured exclusive access to one, right?). In that case, the corollary is, force EVERY thread to acquire access to these objects in the same order.
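A small sketch of the same-order rule, using made-up names: every method that needs both objects takes the printer lock first and the store lock second, so no thread can ever hold the store while waiting for the printer.

```java
// Hypothetical sketch: two operations, one fixed lock order (printer, then store).
class PrintStation {
    private final Object printer = new Object();
    private final Object store = new Object();
    private int scansPrinted = 0;

    // Used by the operator's thread.
    void printCurrentScan() {
        synchronized (printer) {        // always first
            synchronized (store) {      // always second
                scansPrinted++;
            }
        }
    }

    // Used by the remote radiologist's thread; same order, so no deadlock.
    void printArchivedScan() {
        synchronized (printer) {
            synchronized (store) {
                scansPrinted++;
            }
        }
    }

    int scansPrinted() {
        synchronized (printer) {
            synchronized (store) {
                return scansPrinted;
            }
        }
    }
}
```

If printArchivedScan() took the store lock first instead, the two methods could deadlock in exactly the way described earlier.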
As it happens, in my application, the example I give is a specific, degenerate case of a far more complex problem. In the real situation, I had some 20 or so objects and I could not know if a thread might require access to all of them, some of them, one of them, or none of them simultaneously in order to accomplish a particular task. So, what I did was create an object I called ResourceManager (a Singleton, if you read Design Patterns by the Gang of Four, which you should). Every thread, when it wanted access to one of these resources, had to acquire it through the ResourceManager object. The ResourceManager could track which threads were trying to access which objects and force them to do so in a certain order. So, RM would assign to each resource a number from 1 to 20, and then all threads that wanted to get access to 4, for example, could only have previously locked 1 through 3. If some thread wanted 4 but already had locked 5, RM would put this thread to sleep, release resource 5's lock, give resource 4's lock to it, and then when 5 became available, grant it access to that one as well, and then wake it up. The end result, that thread would wake up and have both resources 4 and 5, and never know what happened in the interim--to that thread, it appeared that it got exactly what it requested (and it did).
By forcing them all to step through from 1 to 20, there was never a situation where one thread got 18 and was waiting for 4 while something else had 4 and was waiting for 18--in order to get 4, it had to release 18 and reacquire it...so everything steps through the list in the same order, never a deadlock problem.
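What follows is a much-reduced sketch of that idea, not the actual ResourceManager from that system. It assumes one ReentrantLock per numbered resource and a per-thread record of what is held; the Singleton plumbing is left out, and the method names (acquire, releaseAll, holds) are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantLock;

class ResourceManager {
    private final ReentrantLock[] locks;
    // Resource numbers currently held by the calling thread, in ascending order.
    private final ThreadLocal<TreeSet<Integer>> held = ThreadLocal.withInitial(TreeSet::new);

    ResourceManager(int resourceCount) {
        locks = new ReentrantLock[resourceCount + 1];   // resources are numbered 1..resourceCount
        for (int i = 1; i <= resourceCount; i++) {
            locks[i] = new ReentrantLock();
        }
    }

    // Blocks until the calling thread holds resource id plus everything it held before.
    void acquire(int id) {
        TreeSet<Integer> mine = held.get();
        if (mine.contains(id)) {
            return;
        }
        // Give back every higher-numbered resource this thread holds...
        List<Integer> higher = new ArrayList<>(mine.tailSet(id, false));
        for (int h : higher) {
            locks[h].unlock();
        }
        // ...take the requested one, then re-take the higher ones in ascending order.
        locks[id].lock();
        for (int h : higher) {
            locks[h].lock();
        }
        mine.add(id);
    }

    // Releases everything the calling thread holds.
    void releaseAll() {
        TreeSet<Integer> mine = held.get();
        for (int id : mine.descendingSet()) {
            locks[id].unlock();
        }
        mine.clear();
    }

    boolean holds(int id) {
        return locks[id].isHeldByCurrentThread();
    }
}
```

A thread that calls acquire(18) and then acquire(4) ends up holding both, but whenever it is blocked waiting for a lock it holds only lower-numbered ones, which is what rules out the circular wait.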
Thus data integrity is preserved and threads do not randomly hang.
sev
 
Howard Kushner
author
Ranch Hand
Posts: 361
IMHO this is NOT a beginner question!
 
Dirk Schreckmann
Sheriff
Posts: 7023
Agreed. Moving this on to the Threads and Synchronization forum...
 