This article introduces the idea of a transaction, a concept you will meet often in Hibernate and in software systems in general. You’ll see why transactions matter and learn to recognize the main problems that arise when concurrent transactions interfere with each other.
What are transactions and why do we need them?
Imagine you’re shopping for groceries online. The software system which supports this process needs to ensure that your products are reserved in the warehouse, that the payment is collected, and that a request is sent to a delivery unit so the shipment process can start. These operations need to succeed or fail together. If the payment fails, the company gives its products away for free; if the delivery service misses the request, you end up paying for nothing. If the products are not reserved in the warehouse, the company ends up selling products it doesn’t really have.
A set of operations which are executed together is called a unit of work. In a multithreaded application, several units of work can compete with each other and try to access shared resources at the same time. For example, they may try to reserve the same products at the same time, overwriting each other’s data and damaging its integrity.
A transaction is what keeps a unit of work atomic and isolated from other units of work. With the help of a transaction, a unit of work either completes as a whole or reverts as if it never existed. In other words, a transaction helps ensure that the result of competing units of work is predictable and does not break data integrity. That is why each unit of work which modifies data is executed within the context of a transaction.
Even though our update operations are backed by transactions, there are still a few things which can go wrong. Depending on how isolated we set the transactions to be, there are different ways in which they can interfere with each other.
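To make the "all or nothing" idea concrete, here is a minimal sketch of the grocery order as one unit of work. It uses plain Java and in-memory state only, so it is not Hibernate's API; the class `OrderUnitOfWork`, its fields and the `placeOrder` method are hypothetical names made up for this illustration. The rollback is imitated by restoring a snapshot taken before the unit of work started.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the grocery order; not a Hibernate API.
class OrderUnitOfWork {
    final Map<String, Integer> stock = new HashMap<>();
    int collectedPayments = 0;
    int deliveryRequests = 0;

    OrderUnitOfWork(String product, int quantity) {
        stock.put(product, quantity);
    }

    // Runs all three steps or none of them. Returns true if the work "committed".
    boolean placeOrder(String product, int quantity, int price, boolean paymentSucceeds) {
        // Remember the state before the unit of work starts.
        Map<String, Integer> stockBefore = new HashMap<>(stock);
        int paymentsBefore = collectedPayments;
        int deliveriesBefore = deliveryRequests;
        try {
            // Step 1: reserve the products in the warehouse.
            int available = stock.getOrDefault(product, 0);
            if (available < quantity) {
                throw new IllegalStateException("not enough stock");
            }
            stock.put(product, available - quantity);

            // Step 2: collect the payment.
            if (!paymentSucceeds) {
                throw new IllegalStateException("payment failed");
            }
            collectedPayments += price;

            // Step 3: send the request to the delivery unit.
            deliveryRequests++;
            return true;
        } catch (IllegalStateException e) {
            // "Rollback": restore everything as if the unit of work never ran.
            stock.clear();
            stock.putAll(stockBefore);
            collectedPayments = paymentsBefore;
            deliveryRequests = deliveriesBefore;
            return false;
        }
    }

    public static void main(String[] args) {
        OrderUnitOfWork shop = new OrderUnitOfWork("milk", 5);
        System.out.println(shop.placeOrder("milk", 2, 10, false)); // false
        System.out.println(shop.stock.get("milk"));                // 5
        System.out.println(shop.placeOrder("milk", 2, 10, true));  // true
        System.out.println(shop.stock.get("milk"));                // 3
    }
}
```

When the payment fails, the reservation made in step 1 is undone as well, so the warehouse never holds products for an order which was not paid for. A real transaction manager gives you the same guarantee without hand-written snapshots.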
Transaction isolation issues
Transaction isolation issues describe situations in which two or more transactions, running concurrently, access the same data item and disturb each other. To make this easier to understand, I’ll use a database row or table as the example of a shared resource, but bear in mind that with JTA transactions spanning multiple systems, the shared resource can be a database row, a file, a queue and so on. Here’s what can go wrong when two transactions compete for the same resource.
A lost update happens when two transactions update the same row one after another and the second transaction then reverts, taking the data back to the state it was in before either transaction started. The first transaction’s change is lost along the way, so from the data standpoint it seems like no transaction was ever executed.
A dirty read happens when transaction two reads an uncommitted update to a row made by transaction one. If transaction one then reverts, transaction two is left holding data which was never committed.
An unrepeatable read happens when transaction one reads the same database row twice and the row carries different data each time. It can be caused by transaction two committing an update between the two SELECTs in transaction one.
A phantom read occurs when transaction one reads a data set twice and the data set differs between the two reads. Imagine transaction one executing a query which counts the number of rows in a table twice, and transaction two adding rows to (or removing rows from) the same table in between the reads.
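The dirty read and the unrepeatable read can be replayed step by step. The sketch below assumes a toy, single-threaded model of one database row with no isolation at all; `UnisolatedRow` and its methods are hypothetical names for this illustration and do not come from Hibernate or JDBC. The interleaving of "transaction one" and "transaction two" is written out by hand in the comments.

```java
// Hypothetical single-row store with at most one pending (uncommitted) write.
// It offers no isolation: readers see pending writes made by other transactions.
class UnisolatedRow {
    private int committed;
    private Integer uncommitted; // null when there is no pending write

    UnisolatedRow(int initial) {
        committed = initial;
    }

    void write(int value) {
        uncommitted = value;
    }

    void commit() {
        if (uncommitted != null) {
            committed = uncommitted;
            uncommitted = null;
        }
    }

    void rollback() {
        uncommitted = null;
    }

    int read() {
        return uncommitted != null ? uncommitted : committed;
    }

    // Dirty read: transaction two sees a value which is later rolled back.
    static int dirtyRead() {
        UnisolatedRow row = new UnisolatedRow(100);
        row.write(50);         // transaction one updates the row, no commit yet
        int seen = row.read(); // transaction two reads the uncommitted value
        row.rollback();        // transaction one reverts, the row is 100 again
        return seen;
    }

    // Unrepeatable read: transaction one reads the same row twice
    // and gets two different values.
    static int[] unrepeatableRead() {
        UnisolatedRow row = new UnisolatedRow(100);
        int first = row.read();  // transaction one, first SELECT
        row.write(70);           // transaction two updates the row...
        row.commit();            // ...and commits
        int second = row.read(); // transaction one, second SELECT
        return new int[] {first, second};
    }

    public static void main(String[] args) {
        System.out.println(dirtyRead());                         // 50
        int[] reads = unrepeatableRead();
        System.out.println(reads[0] + " then " + reads[1]);      // 100 then 70
    }
}
```

In the dirty read, transaction two walks away with 50 although the row never officially held that value. In the unrepeatable read, both values were properly committed; the problem is only that transaction one saw the row change underneath it.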
Last update wins
The last update wins problem occurs when two transactions read a database row at the same time and transaction one updates and commits before transaction two does. When transaction two then commits its own update, it overwrites the one made by transaction one, which is lost without any warning. Transaction two may also have made its decision unaware of transaction one’s modifications.
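Here is the same scenario as a small, deterministic sketch. It assumes a single stock counter standing in for the database row, and the two "transactions" are just sequential statements in the order described above; `LastUpdateWins`, `replay` and the quantities are hypothetical and only serve the illustration.

```java
// Hypothetical replay of the "last update wins" interleaving on one counter.
class LastUpdateWins {
    static int replay(int initialStock, int soldByOne, int soldByTwo) {
        int row = initialStock;

        // Both transactions read the row at the same time.
        int readByOne = row;
        int readByTwo = row;

        // Transaction one updates and commits first.
        row = readByOne - soldByOne;

        // Transaction two commits a value based on its stale read,
        // silently overwriting transaction one's update.
        row = readByTwo - soldByTwo;

        return row;
    }

    public static void main(String[] args) {
        // 10 in stock, transaction one sells 3, transaction two sells 4.
        System.out.println(replay(10, 3, 4)); // 6, although 3 items remain
    }
}
```

Seven items were sold, so three should remain, yet the row ends up at six because transaction two computed its update from a value which was already out of date.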
To handle the above issues, transactions allow for setting various isolation levels. Hibernate also supports setting an isolation level for each transaction. This is what’s coming up next week.