[wildfly-dev] How to contribute pull requests to WildFly

Thu Jul 16 17:08:08 EDT 2015

Since migrating to git, and a review-oriented contribution structure, 
we've seen a massive improvement in the quality and quantity of 
community contribution in both WildFly itself and its affiliated 
projects.  However, while we have lots of documents about how to get the 
WildFly code, hack on it, use git to make a branch, submit a pull 
request, etc., one thing rarely (if ever) talked about is any kind of 
standards as to how to actually build and structure your pull requests, 
so I'm going to establish that right now.

I have a few reasons for doing this.  First, the reviewers are stretched 
thin - very thin.  This has several bad effects, including (but not 
limited to):

* Pull requests sitting in the queue for extended periods of time
* Giant pull requests getting less review than small pull requests
* Pull requests receiving highly variable quality-of-review

Secondly, we see problematic pull requests in a wide variety of shapes 
and sizes, from the single-unreviewable-mega-commit PR to the 
thousand-tiny-commit PR to the mixed-form-and-function-commit PR, as 
well as some stealthier cases like the build-is-broken-between-commits PR.

Thirdly, the project has reached a size and scope where we really need 
to have more eyes on every change - as many as possible in fact.

To that effect, and borrowing some concepts heavily from the Linux 
Kernel project's documentation [1], I offer this.  You're welcome.

(For background on how to get started with contribution, the hacking 
guide [2] is still the best place to start; after any initial 
discussion, I'll probably throw this up alongside that.)

WildFly Contribution and Pull Request Standards
-----------------------------------------------
While a complete git tutorial is far, far out of scope of this guide, 
there are a few important rules and guidelines to adhere to when 
creating a pull request for WildFly or one of its constituent or related 
sub-projects.

1) Describe the pull request adequately.

The description *should* include a JIRA number directly from the project 
in question, whose corresponding JIRA issue will in turn have been 
linked to the pull request you are just now creating.  The description 
*should* also include a decent, human-readable summary of what is 
changing.  Proper spelling and grammar is a plus!

2) Make sure it builds and tests pass *first*.

It is highly annoying to reviewers when they find they've spent a great 
deal of time reviewing some code only to discover that it doesn't even 
compile.  In particular, it's common for a patch to trip CheckStyle if 
it hadn't been previously compile-tested at the least.

While it is tempting to rely on the automated CI/GitHub integration to 
do our build and test for us (and I'm guilty of having done this too), 
it generally just causes trouble, so please don't do it!

3) Separate your changes - but not *too* much.

This comes directly from [1], and I agree with it 100%:

"Separate each _logical change_ into a separate patch.

"For example, if your changes include both bug fixes and performance
enhancements for a single driver, separate those changes into two
or more patches.  If your changes include an API update, and a new
driver which uses that new API, separate those into two patches.

"On the other hand, if you make a single change to numerous files,
group those changes into a single patch.  Thus a single logical change
is contained within a single patch.

"The point to remember is that each patch should make an easily understood
change that can be verified by reviewers.  Each patch should be justifiable
on its own merits.

"If one patch depends on another patch in order for a change to be
complete, that is OK.  Simply note "this patch depends on patch X"
in your patch description.

"When dividing your change into a series of patches, take special care to
ensure that [WildFly] builds and runs properly after each patch in the
series.  Developers using "git bisect" to track down a problem can end up
splitting your patch series at any point; they will not thank you if you
introduce bugs in the middle.

"If you cannot condense your patch set into a smaller set of patches,
then only post say 15 or so at a time and wait for review and integration."

I also want to emphasize how important it is to separate *functional* 
and *non-functional* changes.  The latter category includes reformatting 
(which generally should *not* be done without a strong justification).

4) Avoid massive and/or "stream of consciousness" branches

We all know that development can sometimes be an iterative process, and 
we learn as we go.  Nonetheless, we do not need or want a complete 
record of all the highs and lows in the history of every change (for 
example, an "add foobar" commit followed later by a "remove foobar" 
commit in the same PR) - particularly for large changes or in large 
projects (like WildFly proper).  It is good practice for such change 
authors to go back and rearrange and/or restructure the commits of a 
pull request such that they incrementally introduce the change in a 
logical manner, as one single conceptual change per PR.

If a PR consists of dozens or hundreds of nontrivial commits, you will 
want to strongly consider dividing it up into multiple PRs, as PRs of 
this size simply cannot be effectively reviewed.  They will either be 
merged without adequate review, or outright ignored or closed.  Which 
one is worse, I leave to your imagination.

5) Pay attention and respond to review comments

While in general it is my experience that WildFly contributors are good 
about this, I'm going to quote this passage from [1] regardless:

"Your patch will almost certainly get comments from reviewers on ways in
which the patch can be improved.  You must respond to those comments;
ignoring reviewers is a good way to get ignored in return.  [...]

"Be sure to tell the reviewers what changes you are making and to thank them
for their time.  Code review is a tiring and time-consuming process, and
reviewers sometimes get grumpy.  Even in that case, though, respond
politely and address the problems they have pointed out."

In addition, when something needs to be changed, the proper manner to do 
so is generally to modify the original commit, not to add more commits 
to the chain to fix issues as they're reported.  See (4).

6) Don't get discouraged

It may come to pass that you have to iterate on your pull request many 
times before it is considered acceptable.  Don't be discouraged by this 
- instead, consider that to be a sign that the reviewers care highly 
about the quality of the code base.  At the same time though, consider 
that it is frustrating for reviewers to have to say the same things over 
and over again, so please do take care to provide as high-quality 
submissions as possible, and see (5)!

7) You can review code too!

You don't have to be an official reviewer in order to review a pull 
request.  If you see a pull request dealing with an area you are 
familiar with, feel free to examine it and comment as needed.  In 
addition, *all* pull requests need to be reviewed for basic 
(non-machine-verifiable) correctness, including noticing bad code, NPE 
risks, and anti-patterns as well as "boring stuff" like spelling and 
grammar and documentation.

8) On major refactorings

When doing major and/or long-term refactors, while rare, it is possible 
that the above constraints become impractical, especially with regard to 
grouping changes.  In this case, you can use a work branch on a (GitHub) 
fork of WildFly, applying the above rules in micro-scale to just that 
branch.  In this case you could possibly ask a reviewer to also review 
some or all of the pull requests to that branch.  Merge commits would 
then be used to periodically synchronize with upstream.

In this way, when the long-term branch is ready to "come home" to the 
main branch, the reviewers may have a good idea that the (potentially 
quite numerous) changes in the work branch have been reviewed already.

[1] https://www.kernel.org/doc/Documentation/SubmittingPatches
[2] https://developer.jboss.org/wiki/HackingOnWildFly

-- 
- DML