We have a recurring problem: a set of tests that fail one out of
every 10 runs, yet we leave them around hoping someone will fix them
one day. The problem is no one does, and it makes catching regressions hard.
Right now people who submit pull requests have to scan through test
results and ask around to figure out whether they broke something.
So I propose a new policy. Any test which fails intermittently will be
ignored and a JIRA opened and assigned to the author for up to a month.
If that test is not passing within one month, it will be removed from
the codebase.
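
For illustration only (assuming JUnit 4; the class name and JIRA key
below are made up), an ignored flaky test might look like:

    import org.junit.Ignore;
    import org.junit.Test;

    public class ClusterFailoverTestCase {

        // Hypothetical example: ignored while the JIRA (assigned to
        // the test author) is open; removed if still failing after a
        // month.
        @Ignore("Intermittent failure - tracked by AS7-XXXX")
        @Test
        public void testSessionFailover() {
            // ...
        }
    }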
The biggest problem with this policy is that we might completely lose
coverage in some areas. A number of the clustering tests, for example,
fail intermittently, and if we removed them we would have no other
coverage there at all.
So for special cases like clustering, I am thinking of relocating such
tests to a separate test run called "broken-clustering", or something
like that. That run would only be monitored by those working on
clustering, and would not be included in the main "all tests" run.
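
One way to wire that up (just a sketch, assuming JUnit 4 categories;
the interface and test names here are hypothetical) is a marker
interface that surefire can include in one run and exclude from the
other via its groups/excludedGroups settings:

    import org.junit.Test;
    import org.junit.experimental.categories.Category;

    // BrokenClusteringTests.java - hypothetical marker interface for
    // the "broken-clustering" run
    public interface BrokenClusteringTests {
    }

    // ClusterTopologyTestCase.java - a flaky test tagged so the main
    // "all tests" run can exclude it, while the dedicated
    // "broken-clustering" run includes only this category
    public class ClusterTopologyTestCase {

        @Category(BrokenClusteringTests.class)
        @Test
        public void testTopologyUpdate() {
            // ...
        }
    }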
Any other ideas?
--
Jason T. Greene
JBoss AS Lead / EAP Platform Architect
JBoss, a division of Red Hat