
Transaction & Synchronization

 
Tarun Oohri
Ranch Hand
Hello Everyone,
I was just going through the 'Isolation' property of transactions (ACID). It states that no two transactions should occur at the same time, so as to avoid mixing up intermediate data; another transaction should start only when the former one has completed on that database. So my question is: should we keep the transaction code in a synchronized block, so that only one request can perform a transaction at a time?
Regards
 
Prasha Bhat
Greenhorn
If a user-created thread is managing the transactions, it has to be synchronized. Otherwise there is no need to synchronize the database code.
 
Campbell Ritchie
Sheriff
Moving to our databases forum.

I am not sure, but I think you can have two simultaneous transactions provided all the data they affect are disjoint. The DBMS will not permit two simultaneous transactions affecting the same data; the second will not be permitted to start until the first has committed.
 
Martin Vajsar
Sheriff
Transactions are supposed to be run concurrently. The isolation level specifies how much the transactions will influence each other.

Campbell is right that if two transactions decide to modify the same data (usually the same row in a database table), the second transaction will have to wait until the first one ends (commits or rolls back). This behavior isn't affected by the isolation level, and you generally don't have to take care of it yourself in your code; the database does the synchronization. You still need to design things carefully, though: a deadlock can occur in the database just as it can in code that uses the synchronized keyword.
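The write-write conflict described above can be sketched with Python's standard sqlite3 module standing in for a full DBMS (the table name and values here are made up for the illustration). One caveat: a server database would typically make the second transaction wait for the first; SQLite instead raises an error once its busy timeout expires, which makes the same conflict easy to observe in a few lines:

```python
import os
import sqlite3
import tempfile

# Two connections to one database file stand in for two concurrent
# transactions. isolation_level=None disables implicit transactions,
# so BEGIN/COMMIT are issued explicitly; the short timeout surfaces
# the conflict quickly instead of waiting the default five seconds.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
tx1 = sqlite3.connect(path, isolation_level=None, timeout=0.1)
tx2 = sqlite3.connect(path, isolation_level=None, timeout=0.1)

tx1.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER)")
tx1.execute("INSERT INTO t VALUES (1, 10)")

# Transaction 1 modifies the row but does not commit yet.
tx1.execute("BEGIN")
tx1.execute("UPDATE t SET a = 20 WHERE id = 1")

# Transaction 2 tries to modify the same data. A server database would
# make it wait until transaction 1 ends; SQLite raises an error after
# the busy timeout, demonstrating the same write-write conflict.
try:
    tx2.execute("BEGIN")
    tx2.execute("UPDATE t SET a = 30 WHERE id = 1")
    outcome = "second update succeeded"
except sqlite3.OperationalError as e:
    outcome = str(e)

tx1.execute("COMMIT")
print(outcome)   # database is locked
```

The point is only that the database itself serializes conflicting writes; no synchronized block in application code is involved.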

So, what does the isolation level do? Imagine two transactions running concurrently (at the same time). One of the transactions modifies a certain row in a table; say it changes the value of column A from 10 to 20. Now the other transaction tries to read that value. The isolation level determines what the other transaction will "see".

In the Read uncommitted level (the lowest one), the other transaction will read a value of 20. However, this is not very good, since the first transaction might not actually commit: it might roll back, and then the correct value would be 10. In a certain sense, the value 20 never existed in the database. (This is called a dirty read.)

In the Read committed level (second lowest), the other transaction will read a value of 10: the value that was there before the first transaction's uncommitted change.
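This "no dirty reads" behavior can be sketched with Python's sqlite3 module as a stand-in (its default mode already prevents dirty reads; the table and the 10/20 values mirror the column-A example above):

```python
import os
import sqlite3
import tempfile

# One writing and one reading connection to the same database file.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path, isolation_level=None)
reader = sqlite3.connect(path, isolation_level=None)

writer.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER)")
writer.execute("INSERT INTO t VALUES (1, 10)")

# The writing transaction changes A from 10 to 20 but has not committed.
writer.execute("BEGIN")
writer.execute("UPDATE t SET a = 20 WHERE id = 1")

# The reader does not see the uncommitted 20 (no dirty read).
before = reader.execute("SELECT a FROM t WHERE id = 1").fetchone()[0]

writer.execute("COMMIT")

# Once the change is committed, the reader sees the new value.
after = reader.execute("SELECT a FROM t WHERE id = 1").fetchone()[0]
print(before, after)   # 10 20
```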

Good! Clearly that's what we wanted. So, what are the higher isolation levels for?

There's the Repeatable reads isolation level. This ensures that data you have already read in your transaction doesn't change under you. For example, say that during the other transaction a query was executed which read the value of column A, and it was 10. Then the first transaction changes the value to 20 and commits, and now the other transaction, for some reason, reads the value of column A again. In the Read committed level, you would get 20. That's unfortunate, because now you have two values of the same data and don't know which one is right. But with Repeatable reads, you're guaranteed to read the value you obtained on your first query (in our case, 10) for the rest of your transaction.
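A Repeatable-reads-style sketch, again using sqlite3 as a stand-in (the schema is made up): in SQLite's WAL journal mode a transaction reads from a stable snapshot taken at its first query, which gives the repeatable-reads guarantee described above as a side effect.

```python
import os
import sqlite3
import tempfile

# WAL mode lets a writer commit while a reader is inside a transaction;
# the reader then keeps seeing the snapshot from its first query.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path, isolation_level=None)
reader = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")

writer.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER)")
writer.execute("INSERT INTO t VALUES (1, 10)")

# The reader opens a transaction; its first query fixes its snapshot.
reader.execute("BEGIN")
first = reader.execute("SELECT a FROM t WHERE id = 1").fetchone()[0]

# Meanwhile another transaction changes the value and commits.
writer.execute("BEGIN")
writer.execute("UPDATE t SET a = 20 WHERE id = 1")
writer.execute("COMMIT")

# Re-reading inside the same transaction still yields the first value.
second = reader.execute("SELECT a FROM t WHERE id = 1").fetchone()[0]
reader.execute("COMMIT")

# A fresh query after the transaction ends sees the committed change.
third = reader.execute("SELECT a FROM t WHERE id = 1").fetchone()[0]
print(first, second, third)   # 10 10 20
```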

The strongest isolation level is Serializable. This prevents one additional type of inaccuracy (the phantom read) from occurring in a transaction. Say that you've read all rows from a table in your transaction, and another transaction meanwhile inserts another row and commits. Then in your original transaction you re-read the contents of the table. In the Serializable isolation level, you'll get the same set of rows you got the first time. In lower isolation levels, the row inserted by the other transaction will be read on your second attempt. So, it is similar to Repeatable reads, but extended to newly inserted rows as well.
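The phantom-row scenario can be sketched the same way (table name and values made up): in WAL mode SQLite's snapshot also covers rows inserted by other transactions, so a re-read count stays stable inside the transaction, which is what Serializable requires.

```python
import os
import sqlite3
import tempfile

# Same setup as before: WAL mode gives the reader a stable snapshot.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path, isolation_level=None)
reader = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")

writer.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
writer.execute("INSERT INTO t VALUES (1)")

# The reader counts all rows inside a transaction.
reader.execute("BEGIN")
first = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]

# Another transaction inserts a new row and commits (autocommit here).
writer.execute("INSERT INTO t VALUES (2)")

# The re-read count is unchanged: no phantom row appears.
second = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
reader.execute("COMMIT")

# After the reader's transaction ends, the inserted row is visible.
third = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(first, second, third)   # 1 1 2
```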

Obviously, the Serializable level is the best. Your queries will always return the same data, regardless of what other transactions in the database do. Why doesn't everybody just use Serializable and forget about all the other levels?

The isolation of transactions (now you may also understand why it is called an "isolation level") comes at a cost. In some databases, it is ensured by locks. To ensure Read committed, for example, a transaction that wants to read a row modified by another transaction will wait until the other transaction ends. Higher isolation levels require more locks: typically, when a row is read by some transaction, it can't be modified until the reading transaction ends. If a reading transaction generates a report and spends many minutes reading data for it, you can easily see that concurrency in the database suffers.

Other databases use a different mechanism to comply with isolation levels: they are able to read the data as it looked in the past, before the modifications by other transactions were made. They use far fewer locks, but sometimes a transaction in them can fail because it isn't possible to ensure the requested isolation level.

If you want to design or program database applications, you should probably know about isolation levels. One of the best explanations I've encountered is here. It deals with Oracle database in the second half, but the first half, where the isolation levels are explained, is true for databases in general.
 
Dave Tolls
Ranch Hand
Martin Vajsar wrote:
<snipped rest of good stuff>
In some databases, it is ensured by locks. To ensure Read committed, for example, a transaction that wants to read a row modified by another transaction will wait until the other transaction ends.


That's not Read committed. Read committed should not have to wait for the commit; it will ignore the uncommitted data.
Unless I'm going doolally.
 
Martin Vajsar
Sheriff
Dave Tolls wrote:That's not read committed. Read committed would not have to wait for the commit, it will ignore the uncommitted data.
Unless I'm going doolally.

I must admit I'm more versed in multiversioning databases. Could you briefly explain how a lock-based database ignores uncommitted data? I had thought it was all about locks.

Edit: I've dug some more and found that MS SQL Server does use locks in Read committed transactions to keep the data being read from being modified concurrently. There's an option to use multiversioning instead.

MySQL's InnoDB engine always uses multiversioning to ensure Read committed, and uses locks for higher isolation levels (I didn't know that MySQL uses multiversioning).
 
Dave Tolls
Ranch Hand
Well, you're referring to how DBs handle it in real life, and not all of them do it that way.
But "read committed" is supposed to do exactly what it says: it allows you to read committed data. So if someone has locked a row because they are writing to it, you should be able to read the pre-commit value (that is, the original value).
That's how Oracle works and, I would hope, SQL Server and MySQL (possibly depending on engine).
Reads shouldn't block.
 
Martin Vajsar
Sheriff
Dave Tolls wrote:But "read committed" is supposed to do exactly what it says, allows you to read committed data. SO if someone has locked a row because they are writing to it you should be able to read the pre-commit value (that is, the original value).

The isolation levels are defined by the ANSI SQL standard, and they aren't defined by what they "do", but by specifying phenomena which can or cannot occur in transactions at that isolation level. In Read uncommitted, dirty reads are allowed. In Read committed, dirty reads must not happen. The ANSI standard certainly doesn't say that a reading transaction must not block when reading a row modified by another unfinished transaction.

Locking alone can't guarantee non-blocking access to "pre-commit" data. For that, a different mechanism, sometimes called multiversioning, is generally needed: it is used to obtain the data as it existed before the other transaction modified it.

Dave Tolls wrote:That's how Oracle works and, I would hope, SQL Server and MySQL (possibly depending on engine).
Reads shouldn't block.

Yes, reads shouldn't block, but at least in some databases they do. Namely, in MS SQL Server before the row multiversioning extension was introduced (or when it is switched off), they can block.

Oracle uses multiversioning for everything and only blocks on concurrent writes. It doesn't even block on Serializable transactions if there isn't a concurrent write, but it can fail with "ORA-08177: can't serialize access for this transaction" -- see this thread for details.

MySQL (the InnoDB engine) uses something in between: it doesn't block on reads, but probably will block in some situations where Oracle wouldn't (e.g. Serializable). MS SQL Server with multiversioning enabled probably (just my guess) works very similarly.
 