On 07/17/2015 12:26 PM, Jason Greene wrote:
Hello Everyone!
Over the last few years we have attempted to get a grip on the intermittent test failures
we see in CI almost every run, and have attempted various policies which were poorly
enforced. This was mainly due to an understanding of tight deadlines, but also a concern
that strict enforcement would damage our test coverage. I would find myself holding out in
hope that over time we would see these issues decrease.
Unfortunately that hope was misplaced, and the simple truth is that coverage is already
harmed, because no one pays attention to the intermittent failures, since they are treated as
noise rather than true defects. I think we have no other option but to @Ignore any and every
intermittent test, no matter the number or area they appear in.
So unless someone can propose another workable strategy, intermittent tests should be
disabled immediately by anyone who is impacted by them (reviewers, contributors,
whoever). To do this, a simple short independent PR should be sent in, and a note should
be left in a comment to the component lead of the area tested (using GitHub’s super cool
@user syntax). The component lead can then act on the message and choose whether to open
and assign a JIRA to have it fixed, to ignore it, to delete it, to rewrite it, etc. The
intention is that the process of disabling an intermittent test should take as little
effort as possible.
In order for a known intermittent test to be reenabled, it needs to be shown that it is
no longer intermittent by completing many full testsuite runs successfully. Anyone who
needs it can request a custom CI job run for this purpose.
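
As a rough local illustration of the "many successful runs" criterion above, the sketch below repeats a check a fixed number of times and counts failures. It is only an approximation under stated assumptions: it is not the custom CI job mentioned above, it does not run the full testsuite, and the names RepeatCheck, countFailures and isStable are hypothetical, not part of WildFly or JUnit.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: repeats a check and reports how often it failed.
class RepeatCheck {

    /** Runs the check `runs` times and returns the number of failed runs. */
    public static int countFailures(BooleanSupplier check, int runs) {
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            boolean passed;
            try {
                passed = check.getAsBoolean();
            } catch (RuntimeException | AssertionError e) {
                passed = false; // a thrown exception counts as a failed run
            }
            if (!passed) {
                failures++;
            }
        }
        return failures;
    }

    /** Treats the check as stable only if every single run passed. */
    public static boolean isStable(BooleanSupplier check, int runs) {
        return countFailures(check, runs) == 0;
    }

    public static void main(String[] args) {
        // Stand-in for an intermittent test: fails on every 7th invocation.
        int[] calls = {0};
        BooleanSupplier flaky = () -> ++calls[0] % 7 != 0;

        System.out.println("always-passing check stable: " + isStable(() -> true, 100));
        System.out.println("flaky check failures in 100 runs: " + countFailures(flaky, 100));
    }
}
```

A single failure is enough to keep the test disabled here. Passing every run does not prove the test is fixed; it only makes an intermittent failure less likely, which is why the number of runs should be large.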
Finally, if you are a saint and decide to fix an intermittently failing test, but have
trouble reproducing it, feel free to raise the topic in the WildFly HipChat and we can
come up with a solution like custom debug test jobs that can get you the info you need.
Thoughts?
It may not always be obvious, but if multiple components might be
impacted by ignoring the test, we should try to contact all of the
impacted parties. I only got burnt by this once (it took me about
3 months before I noticed :-), so I'm not really sure if it's worth
adding comments in test source headers about who to contact in case of
intermittent failure (and subsequent @Ignore of the test), but that
might help.
--
Jason T. Greene
WildFly Lead / JBoss EAP Platform Architect
JBoss, a division of Red Hat