JBoss Community

TransactionMonitoringAndVisualization

Modified by Libor Krzyžanek in JBoss Transactions Development

Transaction Monitoring and Visualization

 

Overview of the Tool

 

The purpose of this project is to create a tool for monitoring the status of in-flight and completed transactions. The main aims of the tool are to help debug transaction-related issues and to provide insight into the running of the system. Essentially, the tool monitors a running system (or a log from a running system) and produces a detailed list of all transactions. A user may then select a particular transaction to see in more detail, including the participants involved in the transaction, the outcome of the transaction and the participants that influenced the outcome. The tool could be visual, showing a diagrammatic view of a transaction, or command-line based, producing textual output (or maybe both).

 

What Problems does this solve?

 

  • Transaction Timeout detection. We frequently get support requests (via the forums or elsewhere) from users who are experiencing intermittent rollbacks of transactions due to timeout. It is not a simple matter for the user to figure out that this is what's happening.

 

  • Transaction Profiling. It may be relatively easy to detect transactions that are taking longer than desired to complete. However, diagnosing which party in the transaction is to blame is non-trivial.

 

  • Loops and Diamonds detection. In a distributed transaction it is possible to introduce a loop or a diamond without realizing it. JTS may tolerate this, but distributed JTA and WS-AT bridged to JTA do not. Identifying this scenario is non-trivial and requires detailed internal knowledge of the TM.

 

  • Reasoning about the System Architecture. Producing an architectural diagram of a distributed transaction with many participants and servers is not an easy task. Furthermore, it can be error-prone, and the diagram could change based on the application inputs. The difficulty of this task is further compounded if multiple transaction types are involved (WS-AT, WS-BA, REST-AT, etc). This diagram is essential for reasoning about the current architecture and for considering improvements or changes. Even so, the diagram alone may not be enough without detailed profiling overlaid.

 

  • Analyzing Disk Syncs. One approach to improving the performance of a transaction is to reduce the number of disk syncs. The problem is that it is often difficult to calculate exactly how many you are performing and what delay each adds.

 

 

Who is the target audience?

  • End users, to diagnose their own issues
  • GSS, to diagnose customer issues
  • Architects, to analyze their system architecture
  • Those new to the field of transactions, learning what's going on.

 

 

How does it solve these problems?

 

The tool analyzes the server log and gathers data on every transaction run. The tool could also support live updates for a server that is still running. The tool assumes that it can gather all its information from log statements. If it can't, we take the view that some logging is missing and then add it. We may also want the tool to support different log levels. For example, it may need a log level of TRACE to acquire full knowledge of the system. This may introduce too much of a performance overhead, so we may want to allow the tool to tolerate the reduced information provided by the less verbose log levels. In this case only basic information would be provided.
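As a rough illustration of the log-driven approach, the sketch below folds log lines into one record per transaction. The line format (`<millis> <LEVEL> tx=<id> event=<name> [participant=<name>]`), the event names and the `TxRecord` type are all invented for this example; they are not the format the transaction manager actually logs. Note that a log captured at a less verbose level simply yields records with fewer details, such as an empty participant list.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical log line format, made up for this sketch:
#   <epoch-millis> <LEVEL> tx=<id> event=<name> [participant=<name>]
LINE = re.compile(
    r"^(?P<ts>\d+) (?P<level>[A-Z]+) tx=(?P<tx>\S+) event=(?P<event>\w+)"
    r"(?: participant=(?P<participant>\S+))?$"
)


@dataclass
class TxRecord:
    tx_id: str
    begin_ms: Optional[int] = None
    end_ms: Optional[int] = None
    outcome: str = "INFLIGHT"  # until a commit or rollback line is seen
    participants: list = field(default_factory=list)


def parse_log(lines):
    """Fold log lines into a dict of tx id -> TxRecord."""
    records = {}
    for line in lines:
        m = LINE.match(line.strip())
        if m is None:
            continue  # not a transaction line in the assumed format
        rec = records.setdefault(m["tx"], TxRecord(m["tx"]))
        ts, event = int(m["ts"]), m["event"]
        if event == "begin":
            rec.begin_ms = ts
        elif event == "enlist" and m["participant"]:
            rec.participants.append(m["participant"])
        elif event in ("commit", "rollback"):
            rec.end_ms = ts
            rec.outcome = "COMMITTED" if event == "commit" else "ROLLED_BACK"
    return records
```

Feeding the same function lines from a live log, as they arrive, would be one way to approach the live-update case; that is left out here to keep the sketch short.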

 

Querying feature

The data can be queried to find specific information, for example, "show me all the transactions that rolled back". Given a rolled-back transaction, the data should be available to diagnose exactly why it rolled back. Other things you may want to query by:

 

  • Outcome. Committed, rolled back, heuristic, etc.
  • Type. JTA, JTS, WS-AT, REST-AT, etc.
  • Duration. Find all fast or slow transactions.
  • In-flight. Show transactions currently in flight.
  • Stuck. In-flight transactions that have been running for longer than X ms.
  • Recovery. All transactions that needed to be recovered.
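The queries listed above could be expressed as simple filters over the gathered transaction records. The following is a minimal sketch only: the `Tx` type, its field names and the outcome strings are assumptions made for illustration, not part of any existing API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tx:
    # Hypothetical record shape, assumed for this sketch.
    tx_id: str
    tx_type: str   # e.g. "JTA", "JTS", "WS-AT", "REST-AT"
    outcome: str   # "COMMITTED", "ROLLED_BACK", "HEURISTIC" or "INFLIGHT"
    begin_ms: int
    end_ms: Optional[int] = None  # None while the transaction is in flight
    recovered: bool = False


def by_outcome(txs, outcome):
    return [t for t in txs if t.outcome == outcome]


def by_type(txs, tx_type):
    return [t for t in txs if t.tx_type == tx_type]


def slower_than(txs, threshold_ms):
    """Completed transactions that took longer than threshold_ms."""
    return [t for t in txs
            if t.end_ms is not None and t.end_ms - t.begin_ms > threshold_ms]


def stuck(txs, threshold_ms, now_ms):
    """In-flight transactions that have been running longer than threshold_ms."""
    return [t for t in txs
            if t.outcome == "INFLIGHT" and now_ms - t.begin_ms > threshold_ms]


def recovered(txs):
    return [t for t in txs if t.recovered]
```

For example, `stuck(txs, 1000, now_ms)` would return every in-flight transaction that began more than one second before `now_ms`.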

 

Diagnostics

The tool could also allow 'profiles' to be plugged in. A 'profile' is responsible for identifying a common problem that we see users having. Hopefully these would allow issues to be identified early, before they are passed further up the support chain. For example, we may create a 'profile' that searches over the data looking for loops or diamonds. Another could be responsible for identifying timeouts and maybe hinting at their cause. These 'profiles' would be created based on community demand.
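To make the loops-and-diamonds 'profile' more concrete, here is a minimal sketch of one way it might work. It assumes the tool has already reduced a transaction to a root node plus a list of (caller, callee) pairs; that input shape and the function name are hypothetical. For the purposes of the sketch, a loop is a call back into a node that is already on the current call path, and a diamond is a node entered from two or more different callers.

```python
def find_loops_and_diamonds(root, calls):
    """Return (loops, diamonds) for one transaction.

    root  -- name of the node that began the transaction
    calls -- iterable of (caller, callee) pairs (assumed input shape)
    """
    children = {}
    for caller, callee in calls:
        children.setdefault(caller, set()).add(callee)

    loops = set()       # (caller, callee) calls that close a loop
    entered_from = {}   # node -> callers, ignoring loop-closing calls

    def visit(node, path):
        for child in sorted(children.get(node, ())):
            if child in path:
                # The callee is already on the call path: a loop.
                loops.add((node, child))
            else:
                entered_from.setdefault(child, set()).add(node)
                visit(child, path | {child})

    visit(root, {root})
    diamonds = sorted(n for n, callers in entered_from.items()
                      if len(callers) > 1)
    return sorted(loops), diamonds
```

The search walks every call path from the root, which is fine for a sketch but could be slow on very large call graphs. A real profile would probably also report which transaction and which log lines the finding came from.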

 

Visualization

 

In order to help architects reason about the system architecture, the tool could produce a visualization of a transaction of interest. A few diagrams should give you an idea of what the tooling could produce:

 

https://community.jboss.org/servlet/JiveServlet/downloadImage/102-48255-2-20252/310-205/TXVisualCommitted%282%29.png

 

Here we can see that a transaction was begun on Server1. It enlisted a DB and JMS queue locally and invoked a remote service on Server2. A second resource (DB2) was enlisted on Server2.

 

https://community.jboss.org/servlet/JiveServlet/downloadImage/102-48255-2-20253/310-211/TXVisualRollback.png

 

In this example we can see that the transaction rolled back because DB2 voted rollback.
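For the command-line variant of the tool, the same information could be rendered as an indented text tree. The sketch below is only an illustration: the `Node` type and the `render` function are hypothetical, and the node names are taken from the two examples described above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical tree node: a server or an enlisted resource.
    name: str
    note: str = ""  # e.g. the vote or status of this participant
    children: list = field(default_factory=list)


def render(node, depth=0):
    """Return the tree as a list of lines, indented two spaces per level."""
    label = f"{node.name} [{node.note}]" if node.note else node.name
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# The rollback example: Server1 enlists DB and JMS, then calls Server2,
# where DB2 votes rollback.
tree = Node("Server1", children=[
    Node("DB"),
    Node("JMS"),
    Node("Server2", children=[Node("DB2", note="voted rollback")]),
])
print("\n".join(render(tree)))
```

Printing the tree puts the participant that caused the rollback on its own line, nested under the server on which it was enlisted.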

 

Other possible features:

 

  • Participant types. Display whether the participant is one-phase or two-phase aware for JTA/JTS. Display the completion type (Participant/Coordinator) for WS-BA.
  • Resource types. Whether it's a database or message queue, and maybe include the name and version.
  • Multiple Transaction Types. Somehow depict the type of transaction (JTS, WS-AT, REST-AT, etc).
  • Display Bridges. Display when a transaction is bridged from one type to another.
  • In-flight Transactions. More useful for long-running transactions; display the current status of each participant and update the display in real time as the status changes.
  • Recovery Status. Show which resources crashed and the status of those resources already committed or recovered.
