Test reliability (the above is part of this). We need more unit tests and integration tests, and we need to make them more reliable. During the merge we saw tests fail on developer machines while they passed on Travis; this turned out to be a timing issue. In the past (in RHQ) we also saw failures caused by test ordering. We should perhaps run our (integration) tests in random order on purpose, since real users will not run the code in the order our tests assume either (yes, that may make setup and tear-down more complex).
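
The random-order idea could look roughly like the sketch below. It is an illustration only, written in Python with the standard `unittest` module rather than in our actual test framework. The `InventoryTest` class and the helpers `shuffled_suite` and `run_shuffled` are hypothetical names made up for this example. The two points it tries to show are that each test builds its own state in `setUp`, and that the shuffle seed is printed so a failing order can be replayed.

```python
import random
import unittest


class InventoryTest(unittest.TestCase):
    """Hypothetical tests; each one builds its own state in setUp."""

    def setUp(self):
        # Fresh fixture per test, so no test depends on a predecessor.
        self.inventory = {}

    def test_add(self):
        self.inventory["cpu"] = 1
        self.assertEqual(self.inventory, {"cpu": 1})

    def test_remove(self):
        self.inventory["cpu"] = 1
        del self.inventory["cpu"]
        self.assertEqual(self.inventory, {})

    def test_empty(self):
        self.assertEqual(len(self.inventory), 0)


def shuffled_suite(test_class, seed):
    """Build a suite whose test methods run in a seed-determined random order."""
    names = list(unittest.defaultTestLoader.getTestCaseNames(test_class))
    random.Random(seed).shuffle(names)
    return unittest.TestSuite(test_class(name) for name in names)


def run_shuffled(test_class, seed=None):
    """Run the tests in random order and return (seed, result)."""
    if seed is None:
        # Pick a fresh seed per run; printing it makes a failing order reproducible.
        seed = random.randrange(2**32)
    print(f"test order seed: {seed}")
    runner = unittest.TextTestRunner(verbosity=0)
    result = runner.run(shuffled_suite(test_class, seed))
    return seed, result


if __name__ == "__main__":
    run_shuffled(InventoryTest)
```

When a run fails, passing the printed seed back in (for example `run_shuffled(InventoryTest, seed=1234)`) repeats the same order, which should make an ordering dependency much easier to track down than a failure that shows up only now and then.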