Friday, July 14, 2006

Distributed Transactions

Transactions Series Part 2

A great article - http://www.subbu.org/articles/transactions/NutsAndBoltsOfTP.html

In the previous part, we saw the properties of a transaction. When an application deals with a single database, guaranteeing those properties is the responsibility of the database provider.

However, in a networked scenario, where the application may be composed of multiple distributed components and a unit of work may span several components that each deal with multiple data sources, the notion of a transaction gets complicated. In such cases, a group of operations on (distributed) resources may need to be treated as one unit of work. Such applications should maintain the integrity of data (as defined by the business rules of the application) under the following circumstances:
  • distributed access to a single resource of data
  • access to distributed resources from a single application component
The responsibility for tracking whether the operation as a whole succeeded or failed then falls on the application. Such systems are more easily realised with a transaction manager: the application talks to the transaction manager to start, commit and roll back an operation. The transaction manager itself deals with the multiple resource managers (each database could be a resource manager) that manage the resources. It drives a commit or rollback protocol to provide atomicity, and a recovery protocol to provide durability.
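To make the application's view concrete, here is a minimal Python sketch of starting, committing and rolling back one unit of work that spans two data sources. The names InMemoryResource, SimpleTransactionManager and transfer are hypothetical, invented for illustration, and do not correspond to any real product's API. The commit is a plain loop with no voting and no recovery log, so it only illustrates the begin/commit/rollback contract, not a real atomic commit protocol.

```python
class InMemoryResource:
    """Hypothetical resource manager: a dict plus staged (uncommitted) writes."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.pending = {}

    def read(self, key):
        # Reads inside the transaction see its own staged writes first.
        if key in self.pending:
            return self.pending[key]
        return self.data[key]

    def write(self, key, value):
        self.pending[key] = value  # staged until the transaction ends

    def commit(self):
        self.data.update(self.pending)
        self.pending.clear()

    def rollback(self):
        self.pending.clear()


class SimpleTransactionManager:
    """Hypothetical coordinator exposing begin/commit/rollback to the application."""

    def __init__(self, resources):
        self.resources = list(resources)
        self.active = False

    def begin(self):
        if self.active:
            raise RuntimeError("transaction already active")
        self.active = True

    def commit(self):
        self._end(lambda resource: resource.commit())

    def rollback(self):
        self._end(lambda resource: resource.rollback())

    def _end(self, action):
        if not self.active:
            raise RuntimeError("no active transaction")
        for resource in self.resources:
            action(resource)
        self.active = False


def transfer(tm, source, target, account, amount):
    """Move `amount` between two resources as one unit of work."""
    tm.begin()
    try:
        target.write(account, target.read(account) + amount)  # staged credit
        balance = source.read(account)
        if balance < amount:
            raise ValueError("insufficient funds")
        source.write(account, balance - amount)  # staged debit
    except ValueError:
        tm.rollback()  # discards the staged credit as well
        return False
    tm.commit()
    return True


bank_a = InMemoryResource("bank_a", {"alice": 100})
bank_b = InMemoryResource("bank_b", {"alice": 0})
tm = SimpleTransactionManager([bank_a, bank_b])

ok = transfer(tm, bank_a, bank_b, "alice", 30)
failed = transfer(tm, bank_a, bank_b, "alice", 500)
```

In this run the first transfer commits on both resources, while the second stages a credit on bank_b, fails the balance check on bank_a, and is rolled back on both, so neither resource keeps a partial result.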

Distributed Transaction Processing Architecture

It has the following components -
  • Application Components - Application components are clients for the transactional resources. These are the programs with which the application developer implements business transactions.
    With the help of the transaction manager, these components create global transactions, propagate the transaction context if necessary, and operate on the transactional resources within the scope of these transactions. These components are not responsible for implementing the semantics that preserve the ACID properties of transactions. However, as part of the application logic, they generally decide whether to commit or roll back a transaction. They have the following responsibilities - (1) Create and demarcate transactions (2) Propagate transaction context (3) Operate on data via resource managers
  • Resource Manager - A resource manager is a component that manages a persistent and stable data storage system, and participates in the two-phase commit and recovery protocols with the transaction manager.
    A resource manager is typically a driver or a wrapper over a stable storage system, with interfaces for operating on the data (for the application components), and for participating in the two-phase commit and recovery protocols coordinated by a transaction manager. This component may also, directly or indirectly, register resources with the transaction manager so that the transaction manager can keep track of all the resources participating in a transaction. This process is called resource enlistment. To implement the two-phase commit and recovery protocols, the resource manager should provide supplementary mechanisms that make recovery possible.
    Resource managers provide two sets of interfaces: one set for the application components to get connections and perform operations on the data, and the other set for the transaction manager, through which the resource manager participates in the two-phase commit and recovery protocols. It has the following responsibilities - (1) Enlist resources with the transaction manager (2) Participate in the two-phase commit and recovery protocols.
  • Transaction Manager - The transaction manager is the core component of a transaction processing environment. Its primary responsibilities are to create transactions when requested by application components, allow resource enlistment and delistment, and to conduct the two-phase commit or recovery protocol with the resource managers.
    A typical transactional application begins a transaction by issuing a request to a transaction manager to initiate a transaction. In response, the transaction manager starts a transaction and associates it with the calling thread. The transaction manager also establishes a transaction context. All application components and/or threads participating in the transaction share the transaction context. The thread that initially issued the request for beginning the transaction, or, if the transaction manager allows, any other thread may eventually terminate the transaction by issuing a commit or rollback request.
    Before a transaction is terminated, any number of components and/or threads may perform transactional operations on any number of transactional resources known to the transaction manager. If allowed by the transaction manager, a transaction may be suspended and resumed before it finally completes.
    Once the application issues the commit request, the transaction manager prepares all the resources for a commit operation (by conducting a vote), and based on whether all resources are ready for a commit or not, issues a commit or rollback request to all the resources. It has the following responsibilities - (1) Establish and maintain transaction context (2) Maintain the association between a transaction and the participating resources (3) Initiate and conduct the two-phase commit and recovery protocols with the resource managers (4) Make synchronization calls to the application components before the beginning and after the end of the two-phase commit and recovery process
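The thread association, resource enlistment and suspend/resume behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: TransactionContext, ThreadBoundTransactionManager and the resource names are hypothetical, resources are represented by plain strings, and commit processing is left out entirely.

```python
import itertools
import threading


class TransactionContext:
    """Hypothetical context: a transaction id plus the enlisted resources."""

    def __init__(self, tx_id):
        self.tx_id = tx_id
        self.resources = []


class ThreadBoundTransactionManager:
    """Hypothetical manager that associates a context with the calling thread."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._local = threading.local()  # one slot per thread

    def current(self):
        return getattr(self._local, "context", None)

    def _require(self):
        context = self.current()
        if context is None:
            raise RuntimeError("no transaction on this thread")
        return context

    def begin(self):
        if self.current() is not None:
            raise RuntimeError("this thread already has a transaction")
        self._local.context = TransactionContext(next(self._ids))
        return self._local.context

    def enlist(self, resource_name):
        context = self._require()
        if resource_name not in context.resources:
            context.resources.append(resource_name)

    def suspend(self):
        context = self._require()
        self._local.context = None  # detach from the calling thread
        return context

    def resume(self, context):
        if self.current() is not None:
            raise RuntimeError("this thread already has a transaction")
        self._local.context = context

    def end(self):
        """Finish the transaction and delist its resources."""
        context = self.suspend()
        enlisted = list(context.resources)
        context.resources.clear()
        return enlisted


tm = ThreadBoundTransactionManager()
context = tm.begin()
tm.enlist("orders_db")

seen_by_worker = []


def worker(propagated):
    seen_by_worker.append(tm.current())  # nothing is bound to this thread yet
    tm.resume(propagated)                # explicit propagation of the context
    tm.enlist("billing_db")
    tm.suspend()


suspended = tm.suspend()
thread = threading.Thread(target=worker, args=(suspended,))
thread.start()
thread.join()

tm.resume(suspended)
enlisted = tm.end()
```

The worker thread sees no transaction until the context is explicitly handed to it, and the resource it enlists ends up in the same context as the one enlisted by the thread that began the transaction.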
Transaction Processing Concepts
  • Transaction Demarcation - A transaction can be specified by what is known as transaction demarcation. Transaction demarcation enables work done by distributed components to be bound by a global transaction. It is a way of marking groups of operations to constitute a transaction.
    The most common approach to demarcation is to mark the thread executing the operations for transaction processing. This is called programmatic demarcation. The transaction so established can be suspended by unmarking the thread, and be resumed later by explicitly propagating the transaction context from the point of suspension to the point of resumption.
    The transaction demarcation ends after a commit or a rollback request to the transaction manager. The commit request directs all the participating resource managers to record the effects of the operations of the transaction permanently. The rollback request makes the resource managers undo the effects of all operations of the transaction.
    An alternative to programmatic demarcation is declarative demarcation. Component-based transaction processing systems such as Microsoft Transaction Server, and application servers based on the Enterprise JavaBeans specification, support declarative demarcation. In this technique, components are marked as transactional at deployment time. This has two implications. Firstly, the responsibility for demarcation shifts from the application to the container hosting the component. For this reason, this technique is also called container-managed demarcation. Secondly, demarcation is postponed from application build time (static) to component deployment time (dynamic).
  • Transaction Context and Propagation - Since multiple application components and resources participate in a transaction, it is necessary for the transaction manager to establish and maintain the state of the transaction as it progresses. This is usually done in the form of a transaction context.
    Transaction context is an association between the transactional operations on the resources, and the components invoking the operations. During the course of a transaction, all the threads participating in the transaction share the transaction context. Thus the transaction context logically envelops all the operations performed on transactional resources during a transaction. The transaction context is usually maintained transparently by the underlying transaction manager.
  • Resource Enlistment - Resource enlistment is the process by which resource managers inform the transaction manager of their participation in a transaction. This process enables the transaction manager to keep track of all the resources participating in a transaction. The transaction manager uses this information to coordinate transactional work performed by the resource managers and to drive the two-phase commit and recovery protocols.
    At the end of a transaction (after a commit or rollback) the transaction manager delists the resources. Thereafter, the association between the transaction and the resources no longer holds.
  • Two-Phase Commit - This protocol between the transaction manager and all the resources enlisted for a transaction ensures that either all the resource managers commit the transaction or they all abort. In this protocol, when the application requests that the transaction be committed, the transaction manager issues a prepare request to all the resource managers involved. Each resource manager in turn replies, indicating whether it is ready to commit or not. Only when all the resource managers are ready to commit does the transaction manager issue a commit request to all of them. Otherwise, the transaction manager issues a rollback request and the transaction is rolled back.
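The voting described for two-phase commit can be sketched as follows. This is a simplified illustration, not a production protocol: the Participant class and the two_phase_commit function are hypothetical names, the vote is hard-wired through a can_commit flag, and logging, timeouts and crash recovery are all omitted.

```python
class Participant:
    """Hypothetical resource manager taking part in two-phase commit."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: the vote is fixed up front
        self.state = "active"

    def prepare(self):
        # Phase 1: vote. A real resource manager would first force its
        # changes to stable storage so that a "yes" vote survives a crash.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        if self.state != "prepared":
            raise RuntimeError(f"{self.name} was asked to commit without preparing")
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit everywhere only if every participant votes yes in phase 1."""
    votes = [participant.prepare() for participant in participants]  # phase 1
    if all(votes):
        for participant in participants:  # phase 2: commit
            participant.commit()
        return "committed"
    for participant in participants:      # phase 2: rollback
        participant.rollback()
    return "rolled back"


happy = [Participant("orders_db"), Participant("billing_db")]
happy_outcome = two_phase_commit(happy)

mixed = [Participant("orders_db"), Participant("billing_db", can_commit=False)]
mixed_outcome = two_phase_commit(mixed)
```

In the second run a single "no" vote from billing_db is enough to roll back orders_db as well, even though orders_db had voted to commit.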