[jbossts-issues] [JBoss JIRA] (JBTM-949) Automate the verification of trace output from the XTS crash recovery tests
Paul Robinson (JIRA)
jira-events at lists.jboss.org
Thu May 10 08:57:19 EDT 2012
[ https://issues.jboss.org/browse/JBTM-949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12691819#comment-12691819 ]
Paul Robinson commented on JBTM-949:
We have now included the XTS recovery tests in CI. This job automates running the tests and generating the trace logs for inspection. Since then we've had many failures from these tests. Two bugs in the code were found, along with a few bugs in the tests. All of these failures were due to timing issues that do not seem to occur on a fast machine, which is why they were not spotted until now: the recovery tests have always been run on powerful machines.
Whilst fixing these issues, I've had a chance to understand much better how these tests work, and we've come to the conclusion that the automated part of the test already covers everything that really matters. The Byteman script is used to crash the system when it reaches a particular state. If this state is not reached, the test fails. The system is then recovered and the tests check that the right outcome is sent to all participants and that the TX log is tidied up. Again, the test fails if this is not the case. The additional benefit of checking the trace logs is that we can ensure the correct path was taken throughout the test. However, this process is very time-consuming and, in my experience, has never shown up any bugs other than cosmetic issues with the Byteman scripts (usually a mismatch between the actual log messages and the expected log messages). These logs are very useful when tracking down the cause of a failure, so we should still keep them, but not verify them by eye during each release.
> Automate the verification of trace output from the XTS crash recovery tests
> Key: JBTM-949
> URL: https://issues.jboss.org/browse/JBTM-949
> Project: JBoss Transaction Manager
> Issue Type: Enhancement
> Security Level: Public(Everyone can see)
> Components: XTS
> Reporter: Paul Robinson
> Assignee: Paul Robinson
> Fix For: 5.0.0.M2
> Currently it is very difficult to verify the trace output from the XTS crash recovery tests. With the current code it is infeasible to run multiple servers, as the trace output would span many files, making it difficult to establish the correct order in which events occurred.
> I think the following changes will make the test verification automatic and the tests scalable to many participants:
> # Carry out assertions in Byteman as the test progresses. Assertions at runtime should be more flexible, as more information is available.
> # Each participant is concerned only with the correctness of their own participation. This is key to scalability to many participants.
> # Anything that can't be solved by the above is dumped to one trace file per server and is hopefully simple enough for scriptable post-verification.
> I think my idea needs prototyping first to check that it is feasible in practice.
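
As an illustration of the "scriptable post-verification" idea in the quoted description, here is a minimal sketch, assuming each server writes one trace file and that the expected events can be recognised by substring markers. The function names, server names, and marker strings are all hypothetical; they are not taken from the XTS tests or from Byteman. Each server's trace is checked only against that server's own expected markers, in order, so no ordering across files is needed.

```python
from typing import Dict, Iterable, List, Optional


def first_missing_event(trace_lines: Iterable[str], expected: List[str]) -> Optional[str]:
    """Return the first expected marker not found in order, or None if all appear.

    Markers are hypothetical substrings; other trace lines may sit between them.
    """
    remaining = iter(trace_lines)
    for marker in expected:
        # `any` consumes lines up to and including the first match, so the
        # next marker is only searched for in the lines that follow it.
        if not any(marker in line for line in remaining):
            return marker
    return None


def verify_servers(
    traces: Dict[str, List[str]], expectations: Dict[str, List[str]]
) -> Dict[str, Optional[str]]:
    """Check each server's trace against that server's own expected markers only."""
    return {
        server: first_missing_event(traces.get(server, []), expected)
        for server, expected in expectations.items()
    }
```

A result of `None` for a server means every expected marker was found in order; otherwise the value is the first marker that was missing or out of order. In practice the lines would be read from the per-server trace files rather than passed in as lists.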