[jboss-dev-forums] [JBoss Transactions Development] - TransactionMonitoringAndVisualization
Alex Creasy
do-not-reply at jboss.com
Thu May 9 12:36:41 EDT 2013
Alex Creasy [https://community.jboss.org/people/arcreasy] modified the document:
"TransactionMonitoringAndVisualization"
To view the document, visit: https://community.jboss.org/docs/DOC-48255
--------------------------------------------------------------
h1. Transaction Monitoring and Visualization
h3. Overview of the Tool
The purpose of this project is to create a tool for monitoring the status of in-flight and complete transactions. The main aims of the tool are to help debug transaction related issues and also to provide insight into the running of the system. Essentially, the tool monitors a running system (or a log from a running system) and produces a detailed list of all transactions. A user may then select a particular transaction to see in more detail, including the participants involved in the transaction, the outcome of the transaction and the participants that influenced the outcome. The tool could be visual, showing a diagrammatic view of a transaction or may be command line based, producing a textual output (or maybe both).
h3. What Problems does this solve?
* Transaction Timeout detection. We frequently get support requests (via the forums or elsewhere) from users who are experiencing intermittent rollbacks of transactions due to timeout. It is not a simple matter for the user to figure out that this is what's happening.
* Transaction Profiling. It may be relatively easy to detect transactions that are taking longer than desired to complete. However, diagnosing which party in the transaction is to blame is non-trivial.
* Loops and Diamonds detection. In a distributed transaction it is possible to introduce a loop or a diamond without realizing. JTS may tolerate this, but distributed JTA and bridged WS-AT to JTA does not. Identifying this scenario is non-trivial and requires detailed internal knowledge of the TM.
* Reasoning about the System Architecture. Producing an architectural diagram of a distributed transaction with many participants and servers is not an easy task. Furthermore it can be error prone and could change based on the application inputs. The difficulty of this task is further compounded if multiple transaction types are involved (WS-AT, WS-BA, REST-AT, etc). This diagram is essential for reasoning about the current architecture and for considering improvements or changes. Also this diagram alone may not be enough without detailed profiling overlayed.
* Analyzing Disk Syncs. One approach to improving the performance of a transaction is to reduce the number of disk syncs. The problem is that it is often difficult to calculate exactly how many you are performing and what delay each adds.
h3. Who is the target audience?
* End users, to diagnose their own issues
* GSS, to diagnose customer issues
* Architects, to analyze their system architecture
* Those new to the field of transactions, learning what's going on.
h3. How does it solve these problems?
The tool analyses the server log and gathers data on every transaction ran. The tool could also support live updates for a server that is still running. The tool assumes that it can gather all its information from log statements. If it can't, we take the view that some logging is missing and then add it. We may also want the tool to support different log levels. For example, it may need a log level of TRACE to acquire full knowledge of the system. This may introduce too much of a performance overhead, so we may want to allow the tool to tollerate the reduced information provided by the less verbose log levels. In this case only basic information would be provided.
h6. Querying feature
The data can be queried to find out specific information. For example, show me all the transactions that rolled back. Given a rolledback transaction, the data should be available to diagnose exactly why it rolledback. Other things you may want to query by:
* *Outcome*. Committed, Rolledback, heuristic, etc
* *Type*. JTA, JTS, WS-AT, REST-AT, etc
* *Duration*. Find all fast or slow transactions
* *Inflight*. Show transactions currently inflight
* *Stuck*. Inflight transactions that have been running for longer than Xms.
* *Recovery*. All transactions that needed to be recovered.
h6. Diagnostics
The tool could also allow 'profiles' to be plugged in. A 'profile' is responsible for identifying common problems that we see users having. Hopefully these would allow issues to be identified earlier before they are passed further up the support chain. For example, we may create a 'profile' that searches over the data looking for loops or diamonds. Another could be responsible for identifying timeouts and maybe hinting at their cause. These 'profiles' would be created based on community demand.
h6. Visualization
In order to help architects reason about the system architecture, the tool could produce a visualization of a transaction of interest. A few diagrams should give you an idea of what the tooling could produce:
https://community.jboss.org/servlet/JiveServlet/showImage/102-48255-8-20252/TXVisualCommitted%282%29.png https://community.jboss.org/servlet/JiveServlet/downloadImage/102-48255-8-20252/310-205/TXVisualCommitted%282%29.png
Here we can see that a transaction was begun on Server1. It enlisted a DB and JMS queue locally and invoked a remote service on Server2. A second resource (DB2) was enlisted on Server2.
https://community.jboss.org/servlet/JiveServlet/showImage/102-48255-8-20253/TXVisualRollback.png https://community.jboss.org/servlet/JiveServlet/downloadImage/102-48255-8-20253/310-211/TXVisualRollback.png
In this example we can see that the transaction rolled back because DB2 voted rollback.
Other possible features:
* *Participant types*. Display if the participant is one or two phase aware for JTA/JTS. Display the completion type (Participant/Coordinator) for WS-BA.
* * Resource types*. Whether it's a database or message queue and maybe include the name and version.
* *Multiple Transaction Types*. Somehow depict the type of transaction (JTS, WS-AT, REST-AT, etc).
* *Display Bridges*. Display when a transaction is bridged from one type to another
* *In flight Transactions*. More useful for long running transactions; display the current status of each participant and update the display in real-time as the status changes.
* *Recovery Status*. Show which resources crashed and the status of those resources already committed or recovered.
h3. Reducing the amount of data
It's likely that enabling all the required TRACE logging will generate too large a log, and slow the system down. Therefore to optimize this we could do a number of things:
1. Use Byteman to do the TRACE logging anyway (despite being turned off), for just the items we require
2. Use Byteman to somehow buffer up detailed logging in memory, for all transactions. This is discarded for committed transactions and dumped to the log for failed transactions.
3. Update Narayana to dump a lot of data when the transaction fails. Same as the Byteman approach, but will be useful for more users.
We should also reduce the amount of data held for 'identical' transactions. For example, we may have 1,000's of transactions that follow the same paths, interacting with the same resources. We should simply store this data once along with aggregate data for all the associated transactions.
h3. Research
Things to look into:
* http://pic.dhe.ibm.com/infocenter/wasinfo/v8r5/index.jsp?topic=%2Fcom.ibm.websphere.samples.doc%2Fae%2Fsample_jta.html http://pic.dhe.ibm.com/infocenter/wasinfo/v8r5/index.jsp?topic=%2Fcom.ibm.websphere.samples.doc%2Fae%2Fsample_jta.html* IBM seem to have an API for being notified of all completed transactions.
* https://blog.heroku.com/archives/2013/3/19/log2viz https://blog.heroku.com/archives/2013/3/19/log2viz
* http://logstash.net/ http://logstash.net/
h3. Example - log grepping
Edit the log levels in standalone.xml setting the "com.arjuna" level to TRACE and the CONSOLE logger to TRACE. You'll then see a lot of logging produced by the transaction manager. Here's some interesting messages for different scenarios:
h5. Commit
09:20:38,554 TRACE [com.arjuna.ats.arjuna] (pool-1-thread-1) BasicAction::Begin() for action-id 0:ffffac118223:61d0e901:515016c7:13
[0m09:20:38,558 TRACE [com.arjuna.ats.jta] (pool-1-thread-1) TransactionImple.enlistResource ( org.jboss.narayana.txvis.simple.DummyXAResource at 47b24153 )
[0m09:20:38,563 TRACE [com.arjuna.ats.jta] (pool-1-thread-1) TransactionImple.enlistResource ( org.jboss.narayana.txvis.simple.DummyXAResource at 1bb557c8 )
09:20:38,565 TRACE [com.arjuna.ats.arjuna] (pool-1-thread-1) BasicAction::prepare () for action-id 0:ffffac118223:61d0e901:515016c7:13
09:20:38,566 TRACE [com.arjuna.ats.jta] (pool-1-thread-1) XAResourceRecord.topLevelPrepare for XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 47b24153, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:61d0e901:515016c7:13, node_name=1, branch_uid=0:ffffac118223:61d0e901:515016c7:14, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 6e09388c >
09:20:38,567 TRACE [com.arjuna.ats.jta] (pool-1-thread-1) XAResourceRecord.topLevelPrepare for XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 1bb557c8, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:61d0e901:515016c7:13, node_name=1, branch_uid=0:ffffac118223:61d0e901:515016c7:18, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 1f5fa7c4 >
09:20:38,573 TRACE [com.arjuna.ats.arjuna] (pool-1-thread-1) BasicAction::doCommit (XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 47b24153, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:61d0e901:515016c7:13, node_name=1, branch_uid=0:ffffac118223:61d0e901:515016c7:14, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 6e09388c >)
09:20:38,574 TRACE [com.arjuna.ats.jta] (pool-1-thread-1) XAResourceRecord.topLevelCommit for XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 47b24153, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:61d0e901:515016c7:13, node_name=1, branch_uid=0:ffffac118223:61d0e901:515016c7:14, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 6e09388c >
09:20:38,574 TRACE [com.arjuna.ats.arjuna] (pool-1-thread-1) BasicAction::doCommit (XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 1bb557c8, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:61d0e901:515016c7:13, node_name=1, branch_uid=0:ffffac118223:61d0e901:515016c7:18, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 1f5fa7c4 >)
09:20:38,574 TRACE [com.arjuna.ats.jta] (pool-1-thread-1) XAResourceRecord.topLevelCommit for XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 1bb557c8, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:61d0e901:515016c7:13, node_name=1, branch_uid=0:ffffac118223:61d0e901:515016c7:18, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 1f5fa7c4 >
09:20:38,575 TRACE [com.arjuna.ats.arjuna] (pool-1-thread-1) FileSystemStore.remove_committed(0:ffffac118223:61d0e901:515016c7:13, /StateManager/BasicAction/TwoPhaseCoordinator/AtomicAction)
h5. Client Driven Rollback
[0m10:03:41,737 TRACE [com.arjuna.ats.arjuna] (pool-2-thread-1) BasicAction::Begin() for action-id 0:ffffac118223:-507ee916:515020eb:13
[0m10:03:41,741 TRACE [com.arjuna.ats.jta] (pool-2-thread-1) TransactionImple.enlistResource ( org.jboss.narayana.txvis.simple.DummyXAResource at 5cbf424a )
[0m10:03:41,747 TRACE [com.arjuna.ats.jta] (pool-2-thread-1) TransactionImple.enlistResource ( org.jboss.narayana.txvis.simple.DummyXAResource at 53adb4e5 )
[0m10:03:41,749 TRACE [com.arjuna.ats.arjuna] (pool-2-thread-1) BasicAction::Abort() for action-id 0:ffffac118223:-507ee916:515020eb:13
[0m10:03:41,750 TRACE [com.arjuna.ats.jta] (pool-2-thread-1) XAResourceRecord.topLevelAbort for XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 5cbf424a, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:-507ee916:515020eb:13, node_name=1, branch_uid=0:ffffac118223:-507ee916:515020eb:14, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 6c28e401 >
[0m10:03:41,751 TRACE [com.arjuna.ats.jta] (pool-2-thread-1) XAResourceRecord.topLevelAbort for XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 53adb4e5, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:-507ee916:515020eb:13, node_name=1, branch_uid=0:ffffac118223:-507ee916:515020eb:18, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 70913520 >
h5. Participant Driven Rollback
[0m10:15:52,446 TRACE [com.arjuna.ats.arjuna] (pool-2-thread-1) BasicAction::Begin() for action-id 0:ffffac118223:-6fc6eada:515023cb:13
[0m10:15:52,450 TRACE [com.arjuna.ats.jta] (pool-2-thread-1) TransactionImple.enlistResource ( org.jboss.narayana.txvis.simple.DummyXAResource at 72e0b50 )
[0m10:15:52,455 TRACE [com.arjuna.ats.jta] (pool-2-thread-1) TransactionImple.enlistResource ( org.jboss.narayana.txvis.simple.DummyXAResource at 7aa354c9 )
[0m10:15:52,458 TRACE [com.arjuna.ats.jta] (pool-2-thread-1) XAResourceRecord.topLevelPrepare for XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 72e0b50, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:-6fc6eada:515023cb:13, node_name=1, branch_uid=0:ffffac118223:-6fc6eada:515023cb:14, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 50150c84 >
[0m10:15:52,459 TRACE [com.arjuna.ats.jta] (pool-2-thread-1) XAResourceRecord.topLevelPrepare for XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 7aa354c9, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:-6fc6eada:515023cb:13, node_name=1, branch_uid=0:ffffac118223:-6fc6eada:515023cb:18, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 15e8d72f >
[0m [33m10:15:52,460 WARN [com.arjuna.ats.jta] (pool-2-thread-1) ARJUNA016041: prepare on < formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:-6fc6eada:515023cb:13, node_name=1, branch_uid=0:ffffac118223:-6fc6eada:515023cb:18, subordinatenodename=null, eis_name=unknown eis name > (org.jboss.narayana.txvis.simple.DummyXAResource at 7aa354c9) failed with exception XAException.XAER_RMERR: javax.transaction.xa.XAException
[0m [33m10:15:52,467 WARN [com.arjuna.ats.arjuna] (pool-2-thread-1) ARJUNA012073: BasicAction.End() - prepare phase of action-id 0:ffffac118223:-6fc6eada:515023cb:13 failed.
[0m [33m10:15:52,467 WARN [com.arjuna.ats.arjuna] (pool-2-thread-1) ARJUNA012075: Action Aborting
[0m10:15:52,467 TRACE [com.arjuna.ats.arjuna] (pool-2-thread-1) BasicAction::phase2Abort() for action-id 0:ffffac118223:-6fc6eada:515023cb:13
[0m10:15:52,468 TRACE [com.arjuna.ats.arjuna] (pool-2-thread-1) BasicAction::doAbort (XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 72e0b50, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:-6fc6eada:515023cb:13, node_name=1, branch_uid=0:ffffac118223:-6fc6eada:515023cb:14, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 50150c84 >)
[0m10:15:52,468 TRACE [com.arjuna.ats.jta] (pool-2-thread-1) XAResourceRecord.topLevelAbort for XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 72e0b50, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:-6fc6eada:515023cb:13, node_name=1, branch_uid=0:ffffac118223:-6fc6eada:515023cb:14, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 50150c84 >
[0m10:15:52,468 TRACE [com.arjuna.ats.arjuna] (pool-2-thread-1) BasicAction::doAbort (XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 7aa354c9, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:-6fc6eada:515023cb:13, node_name=1, branch_uid=0:ffffac118223:-6fc6eada:515023cb:18, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 15e8d72f >)
[0m10:15:52,469 TRACE [com.arjuna.ats.jta] (pool-2-thread-1) XAResourceRecord.topLevelAbort for XAResourceRecord < resource:org.jboss.narayana.txvis.simple.DummyXAResource at 7aa354c9, txid:< formatId=131077, gtrid_length=29, bqual_length=36, tx_uid=0:ffffac118223:-6fc6eada:515023cb:13, node_name=1, branch_uid=0:ffffac118223:-6fc6eada:515023cb:18, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord at 15e8d72f >
h3.
h3. Data Model
The basic data model currently used in early iterations of the tool is described in the following ER diagram
https://community.jboss.org/servlet/JiveServlet/showImage/102-48255-8-20766/txvis-entities.png https://community.jboss.org/servlet/JiveServlet/downloadImage/102-48255-8-20766/450-186/txvis-entities.png
--------------------------------------------------------------
Comment by going to Community
[https://community.jboss.org/docs/DOC-48255]
Create a new document in JBoss Transactions Development at Community
[https://community.jboss.org/choose-container!input.jspa?contentType=102&containerType=14&container=2041]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.jboss.org/pipermail/jboss-dev-forums/attachments/20130509/377f1a8a/attachment-0001.html
More information about the jboss-dev-forums
mailing list