Marco Ehrentreich wrote:The EmailAddress entities have an id field of type "Long" as a surrogate key for JPA. The natural key to determine if an email address already exists is simply the normalized email address itself as a String/VARCHAR.
Although I haven't worked on this issue any more I'm still surprised that there were no answers to my post. I guess the problem is either too easy to even think about it or I was doing something completely wrong. Maybe the simplest solution is to create a database index on the email address and just try to persist any new address. This should raise an exception if an address already exists and the email address can then be read from the database and reused in this case.
Anyway I'd still like to hear some best practices!
Marco
Programming is about thinking, NOT coding
a) Use Serializable Isolation level
If you use serializable isolation level, the transaction that first gets to the query will succeed. Any transaction that tries the insert while the first transaction is still active will fail. You will have to catch the exception in code and retry
b) Lock the table
We are using Oracle, and you can put an exclusive lock on the table. The transaction that gets to the lock first will succeed. Subsequent transactions will wait until the first transaction commits/rolls back.
The problem is when 2 transactions try to run the same query concurrently, there are dupes in OLAP DIM table. The inner select query doesn't return any matching records in both transactions, and both transactions insert records
Bill Gorder wrote:
a) Use Serializable Isolation level
If you use serializable isolation level, the transaction that first gets to the query will succeed. Any transaction that tries the insert while the first transaction is still active will fail. You will have to catch the exception in code and retry
I was aware of this but it has potential performance ramifications, so I did not mention it. It is one of those ask your DBA things I think.
b) Lock the table
We are using Oracle, and you can put an exclusive lock on the table. The transaction that gets to the lock first will succeed. Subsequent transactions will wait until the first transaction commits/rolls back.
This too can cause bottle necks. I prefer the work queue with a single threaded consumer to this approach.
The problem is when 2 transactions try to run the same query concurrently, there are dupes in OLAP DIM table. The inner select query doesn't return any matching records in both transactions, and both transactions insert records
Sounds like you need to set your constraints to prevent duplicates. While it may throw an exception catching and handling this is far better than duplicates in the database (assuming duplicates are not acceptable)
With JPA synchronization is a bit trickier because the same entity manger should not be accessed by multiple threads (they are not thread safe)
Unfortunately, there are no "good" solutions here.
permaculture is a more symbiotic relationship with nature so I can be even lazier. Read tiny ad:
a bit of art, as a gift, the permaculture playing cards
https://gardener-gift.com
|