Nested Transactions

 
Praveen Ramakrishna
Greenhorn
Posts: 4
We are using Spring 2.5 with JPA (Hibernate implementation).

We have the following business scenario, in which a monthly template has to be created in tables according to the structure below.

ORG1
each ORG has Models [1-30]
each Model under an ORG has Options [15-30]
ORG2
ORG3
.
.
ORG20

The table hierarchy/relationship is ORG --> Model --> Options. To create this structure we refer to master tables and do some processing, which takes time...

Currently, for the above scenario, the transactions are handled programmatically as below (a rough code sketch follows the outline):

Open Transaction
ORG 1
Commit

Open Transaction
Model 1
Commit

Open Transaction
Option 1
Commit
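
In Spring terms, the shape is roughly this (a minimal sketch using TransactionTemplate; Org and OrgDao are illustrative names, not our real classes):

import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.TransactionStatus;
import org.springframework.transaction.support.TransactionCallbackWithoutResult;
import org.springframework.transaction.support.TransactionTemplate;

public class TemplateCreator {
    public static class Org { }                      // placeholder entity
    public interface OrgDao { void save(Org org); }  // hypothetical DAO

    private final TransactionTemplate txTemplate;
    private final OrgDao orgDao;

    public TemplateCreator(PlatformTransactionManager txManager, OrgDao orgDao) {
        this.txTemplate = new TransactionTemplate(txManager);
        this.orgDao = orgDao;
    }

    // One short transaction per step: it commits on return, so a failure in a
    // later step cannot roll this step back -- hence the manual-delete component.
    // The same pattern repeats for Model and Option.
    public void createOrg(final Org org) {
        txTemplate.execute(new TransactionCallbackWithoutResult() {
            protected void doInTransactionWithoutResult(TransactionStatus status) {
                orgDao.save(org);
            }
        });
    }
}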

The same is done for the entire hierarchy. The drawback here is that if an exception occurs in a particular operation, only that specific operation is rolled back; the other related data/operations are not. To work around this, we had to write a component that deletes the already-committed data whenever an exception occurs.

We also tried having only one transaction at the outermost layer. But since the template creation takes 20-30 minutes, the tables stay locked for the whole run, and other modules that access these tables stop responding.

Open Transaction
Org1
each Org has Models [1-30]
each Model under an Org has Options [15-30]
Org2
Org3
.
.
Org20
Commit


What I want to know is whether there is a way to achieve the above business scenario using nested transactions: avoid locking the tables for a long time, yet still treat the entire operation as one unit of work, so that if processing fails the whole run is rolled back, rather than relying on a component that deletes the data. The sketch below shows the kind of nested transaction support I have in mind.
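
For reference, what I had in mind is something like Spring's NESTED propagation, which is backed by JDBC savepoints (a sketch only; as far as I understand this needs savepoint support from the driver, and JPA transaction managers generally do not support it):

import java.util.Collections;
import java.util.List;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

public class NestedExample {
    public static class Org { }            // placeholder entity

    private List<Org> loadOrgs() {         // placeholder loader
        return Collections.emptyList();
    }

    @Transactional // one outer physical transaction, open for the whole run
    public void createMonthlyTemplate() {
        for (Org org : loadOrgs()) {
            // Note: with proxy-based AOP this call must go through the Spring
            // proxy for the NESTED attribute to apply.
            createOrg(org);
        }
    }

    @Transactional(propagation = Propagation.NESTED)
    public void createOrg(Org org) {
        // Fenced by a JDBC savepoint: a failure here rolls back only to the
        // savepoint, but the locks held by the outer transaction stay until
        // the outer commit -- so the tables would still be blocked for 20-30 min.
    }
}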

Suggestions are Welcome

Thanks in Advance
 
Jayesh A Lalwani
Rancher
Posts: 2759
Nested transactions, even if they were possible, wouldn't have helped you, because your problem isn't a lack of nested transactions. Your problem is that the transactions are too long. You need to reduce the scope of the transactions: take the time-intensive operations out of them. If template creation is taking 20-30 minutes, do it outside the transaction (see the sketch below).
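
The shape to aim for is roughly this (just a sketch; TemplateData, TemplateDao, and buildTemplate() are names I made up):

import org.springframework.transaction.TransactionStatus;
import org.springframework.transaction.support.TransactionCallbackWithoutResult;
import org.springframework.transaction.support.TransactionTemplate;

public class ReducedScopeCreator {
    public static class TemplateData { }                           // placeholder
    public interface TemplateDao { void saveAll(TemplateData d); } // hypothetical DAO

    private TransactionTemplate txTemplate; // wired with your transaction manager
    private TemplateDao templateDao;        // wired by Spring

    public void createMonthlyTemplate() {
        // 1. Heavy part: read master data, aggregate, transform in memory.
        //    No transaction is open here, so no rows stay locked while it runs.
        final TemplateData data = buildTemplate();

        // 2. Short transaction: only the inserts. Locks are held for the
        //    duration of the writes, not for the 20-30 minutes of processing.
        txTemplate.execute(new TransactionCallbackWithoutResult() {
            protected void doInTransactionWithoutResult(TransactionStatus status) {
                templateDao.saveAll(data);
            }
        });
    }

    private TemplateData buildTemplate() {
        return new TemplateData(); // stands in for the real 20-30 min processing
    }
}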


And why is creating a template taking 20-30 minutes? You can do millions of operations in 20 minutes. What are you doing in there? It might be worth looking at optimizing the template creation process.
 
Praveen Ramakrishna
Greenhorn
Posts: 4
Jayesh, thank you for the suggestion.

I agree with you about reducing the scope of the transactions.

On why template creation takes 20-30 minutes: yes, on average we do 9 million DB operations to create a template, and before those 9 million DB operations we do processing to build the templates. Let me give you more details: these templates are Sales Forecast templates created for a product hierarchy combination.

What we are trying to achieve now is:
- Make the scope of the transactions minimal [which we are currently working on]
- Next, figure out how to avoid having to delete the data manually if the process fails in between. Can't we achieve this with transactions, without locking the DB tables?

Thank you
Praveen
 
Jayesh A Lalwani
Rancher
Posts: 2759
Do the templates have a lot of data in them? Or do you aggregate a lot of data and spit out the results into the template?

You probably won't be able to use traditional database transactions to achieve what you want if the operations are this big. Normally, when you have a lot of intensive operations going against the database, you divide the work into parts and work on one part at a time. You also put some sort of "available" flag in the database that tells the component that is going to use the data that it is ready: you set the flag to false while you are working on the data, and set it to true when you are done. At the end of the day, you can clean up orphaned jobs during a maintenance window. A minimal sketch of the flag idea follows.
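
Something like this, assuming a plain JPA entity (the entity and column names are made up):

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

@Entity
public class MonthlyTemplate {
    @Id
    @GeneratedValue
    private Long id;

    private String month;              // e.g. "2013-07"

    private boolean available = false; // false while the job is still writing

    public boolean isAvailable() { return available; }
    public void setAvailable(boolean available) { this.available = available; }
}

// Job outline, each step in its own short transaction:
//   1. insert the MonthlyTemplate row with available = false
//   2. write the ORG/Model/Option rows in small chunked transactions
//   3. on success, flip available = true in one final tiny transaction
//   4. readers query only where available = true, so half-built templates
//      are invisible to them; a maintenance job deletes rows left with
//      available = false by failed runs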


 
Praveen Ramakrishna
Greenhorn
Posts: 4
Yes, we do have a lot of data, and we also aggregate the data from the bottom of the product hierarchy up to the top level. Apart from this, we have to process the historical data for the templates.

Currently the work is split into individual components. But what I am looking for is a way to avoid manually deleting data that has already been successfully processed by other components. For me, the unit of work should be: either the template is fully created, or it is not.

In the component-based approach I have divided the work, and each component does its job; but if any one of them fails, all of the processing/data persisted so far should be rolled back [without any manual step].
 
Jayesh A Lalwani
Rancher
Posts: 2759
It might make sense, from the point of view of the programming effort involved, to put a huge operation in one transaction. However, given how databases work, it's not a good idea to put long-running operations that update a lot of rows in a single transaction.

1) Most databases keep a redo log of the currently open transactions. Generally, the redo logs are kept in the more expensive storage tiers, since most operations write to them frequently; this could mean the redo log is in memory, or on one of the faster disks. It's never a good idea to force gobs of data into memory: for one thing, it will push other data out of the buffers, and the other transactions running at the same time will take a hit.

2) A transaction will lock records. Different databases have different locking rules, but there are locks of some kind, and locks introduce bottlenecks into the system. The database uses locks to keep the data from getting dirty; the side effect is that other transactions trying to update the same records are blocked. That is the bottleneck, and it means your system will not scale.


IMO, you should go back to having smaller transactions and take the hit of cleaning out dirty data afterwards. Transactions are meant for "small" operations where the consistency of the data is important enough that you are willing to pay the price of introducing bottlenecks into the system.

If I were you, I would also take a closer look at your design. What you have described here is a classic ETL job: you are running some huge operations to extract the data, transform it, and spit out a "template" (which I am assuming is some sort of file that either a human or an external system can read). Most ETL frameworks do not open transactions or put write locks on the system, for exactly the reasons mentioned above. In fact, most databases have utilities that can quickly extract data in CSV format, and a lot of ETL frameworks leverage that by taking a quick extract out of the database and then doing the transformations on the extracted data. In most cases, you should avoid locking records that other transactions are going to use for an extended period of time. A rough sketch of such an extract follows.
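
For example, a plain read-only JDBC extract with a streaming fetch can pull the data out without holding write locks (a sketch; the connection string and query are placeholders):

import java.io.FileWriter;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CsvExtract {
    public static void main(String[] args) throws Exception {
        Connection con = DriverManager.getConnection("jdbc:...", "user", "pass");
        con.setReadOnly(true);    // we only read; no write locks are taken
        Statement st = con.createStatement();
        st.setFetchSize(1000);    // stream rows instead of buffering everything
        ResultSet rs = st.executeQuery(
                "SELECT org, model, option_code FROM ..."); // placeholder query
        PrintWriter out = new PrintWriter(new FileWriter("extract.csv"));
        try {
            while (rs.next()) {
                out.println(rs.getString(1) + "," + rs.getString(2)
                        + "," + rs.getString(3));
            }
        } finally {
            out.close();
            rs.close();
            st.close();
            con.close();
        }
    }
}

You then transform the extracted file offline and load the results back in short transactions.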

I'm not sure why you need to update the database when you are trying to extract data out of it. It's rather unusual.
 
Praveen Ramakrishna
Greenhorn
Posts: 4
Jayesh, thank you for the detailed inputs.

Jayesh A Lalwani wrote: I'm not sure why you need to update the database when you are trying to extract data out of it. It's rather unusual.

As I mentioned earlier, it's a sales forecasting system, and we have to create templates every month. When we create a template for a month, we have to copy the previous month's template and add the new month's details on top of it. That's why we have to extract the data, do a small bit of transformation, and then use it to create the new template.

As you pointed out, we are currently in the process of re-designing the existing system. Before finalizing a solution, we wanted to identify the possible approaches and evaluate between them.

Jayesh A Lalwani wrote: Most databases keep a redo log of the currently open transactions. Generally, the redo logs are kept in the more expensive storage tiers, since most operations write to them frequently.

Do savepoints also fall into this category? That is, do the transactions/operations after a savepoint still become part of the redo logs? (A quick sketch of what I mean by savepoints is below.)
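
To be precise about the term, by savepoints I mean the plain JDBC kind, something like this (illustrative only; the SQL and connection string are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Savepoint;
import java.sql.Statement;

public class SavepointSketch {
    public static void main(String[] args) throws Exception {
        Connection con = DriverManager.getConnection("jdbc:...", "user", "pass");
        con.setAutoCommit(false);                          // one open transaction
        Statement st = con.createStatement();
        st.executeUpdate("INSERT INTO org_template ...");  // placeholder SQL
        Savepoint sp = con.setSavepoint("afterOrg");
        try {
            st.executeUpdate("INSERT INTO model_template ..."); // placeholder SQL
        } catch (Exception e) {
            // Rewinds only the work done after the savepoint; the transaction
            // itself stays open, and its bookkeeping keeps accumulating.
            con.rollback(sp);
        }
        con.commit();  // everything is made permanent (and locks released) here
        st.close();
        con.close();
    }
}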
 