Things to consider. Forwarded from thecore -- P
-------- Forwarded Message --------
Subject: Writing good pull requests (Fwd: [wildfly-dev] How to
contribute pull requests to WildFly)
Date: Thu, 16 Jul 2015 16:30:26 -0500
From: David M. Lloyd <david.lloyd(a)redhat.com>
To: The Core <thecore(a)redhat.com>
I recently sent this to wildfly-dev. I think however that these
guidelines are probably applicable to many, if not all, of our git-based
OSS projects, so if any of you are interested in this topic, here you go.
-------- Forwarded Message --------
Subject: [wildfly-dev] How to contribute pull requests to WildFly
Date: Thu, 16 Jul 2015 16:08:08 -0500
From: David M. Lloyd <david.lloyd(a)redhat.com>
To: WildFly Dev <wildfly-dev(a)lists.jboss.org>
Since migrating to git, and a review-oriented contribution structure,
we've seen a massive improvement in the quality and quantity of
community contribution in both WildFly itself and its affiliated
projects. However, while we have lots of documents about how to get the
WildFly code, hack on it, use git to make a branch, submit a pull
request, etc., one thing rarely (if ever) talked about is any kind of
standards as to how to actually build and structure your pull requests,
so I'm going to establish that right now.
I have a few reasons for doing this. First, the reviewers are stretched
thin - very thin. This has several bad effects, including (but not
limited to):
* Pull requests sitting in the queue for extended periods of time
* Giant pull requests getting less review than small pull requests
* Pull requests receiving highly variable quality-of-review
Secondly, we see problematic pull requests in a wide variety of shapes
and sizes, from the single-unreviewable-mega-commit PR to the
thousand-tiny-commit PR to the mixed-form-and-function-commit PR, as
well as some stealthier cases like the build-is-broken-between-commits PR.
Thirdly, the project has reached a size and scope where we really need
to have more eyes on every change - as many as possible in fact.
To that effect, and borrowing some concepts heavily from the Linux
Kernel project's documentation [1], I offer this. You're welcome.
(For background on how to get started with contribution, the hacking
guide [2] is still the best place to start; after any initial
discussion, I'll probably throw this up alongside that.)
WildFly Contribution and Pull Request Standards
-----------------------------------------------
While a complete git tutorial is far, far out of scope of this guide,
there are a few important rules and guidelines to adhere to when
creating a pull request for WildFly or one of its constituent or related
sub-projects.
1) Describe the pull request adequately.
The description *should* include a JIRA number directly from the project
in question, whose corresponding JIRA issue will in turn have been
linked to the pull request you are just now creating. The description
*should* also include a decent, human-readable summary of what is
changing. Proper spelling and grammar is a plus!
2) Make sure it builds and tests pass *first*.
It is highly annoying to reviewers when they find they've spent a great
deal of time reviewing some code only to discover that it doesn't even
compile. In particular, it's common for a patch to trip CheckStyle if
it hadn't been previously compile-tested at the least.
While it is tempting to rely on the automated CI/GitHub integration to
do our build and test for us (and I'm guilty of having done this too),
it generally just causes trouble, so please don't do it!
3) Separate your changes - but not *too* much.
This comes directly from [1], and I agree with it 100%:
"Separate each _logical change_ into a separate patch.
"For example, if your changes include both bug fixes and performance
enhancements for a single driver, separate those changes into two
or more patches. If your changes include an API update, and a new
driver which uses that new API, separate those into two patches.
"On the other hand, if you make a single change to numerous files,
group those changes into a single patch. Thus a single logical change
is contained within a single patch.
"The point to remember is that each patch should make an easily understood
change that can be verified by reviewers. Each patch should be justifiable
on its own merits.
"If one patch depends on another patch in order for a change to be
complete, that is OK. Simply note "this patch depends on patch X"
in your patch description.
"When dividing your change into a series of patches, take special care to
ensure that [WildFly] builds and runs properly after each patch in the
series. Developers using "git bisect" to track down a problem can end up
splitting your patch series at any point; they will not thank you if you
introduce bugs in the middle.
"If you cannot condense your patch set into a smaller set of patches,
then only post say 15 or so at a time and wait for review and integration."
I also want to emphasize how important it is to separate *functional*
and *non-functional* changes. The latter category includes reformatting
(which generally should *not* be done without a strong justification).
4) Avoid massive and/or "stream of consciousness" branches
We all know that development can sometimes be an iterative process, and
we learn as we go. Nonetheless, we do not need or want a complete
record of all the highs and lows in the history of every change (for
example, an "add foobar" commit followed later by a "remove foobar"
commit in the same PR) - particularly for large changes or in large
projects (like WildFly proper). It is good practice for such change
authors to go back and rearrange and/or restructure the commits of a
pull request such that they incrementally introduce the change in a
logical manner, as one single conceptual change per PR.
If a PR consists of dozens or hundreds of nontrivial commits, you will
want to strongly consider dividing it up into multiple PRs, as PRs of
this size simply cannot be effectively reviewed. They will either be
merged without adequate review, or outright ignored or closed. Which
one is worse, I leave to your imagination.
5) Pay attention and respond to review comments
While in general it is my experience that WildFly contributors are good
about this, I'm going to quote this passage from [1] regardless:
"Your patch will almost certainly get comments from reviewers on ways in
which the patch can be improved. You must respond to those comments;
ignoring reviewers is a good way to get ignored in return. [...]
"Be sure to tell the reviewers what changes you are making and to thank them
for their time. Code review is a tiring and time-consuming process, and
reviewers sometimes get grumpy. Even in that case, though, respond
politely and address the problems they have pointed out."
In addition, when something needs to be changed, the proper manner to do
so is generally to modify the original commit, not to add more commits
to the chain to fix issues as they're reported. See (4).
6) Don't get discouraged
It may come to pass that you have to iterate on your pull request many
times before it is considered acceptable. Don't be discouraged by this
- instead, consider that to be a sign that the reviewers care highly
about the quality of the code base. At the same time though, consider
that it is frustrating for reviewers to have to say the same things over
and over again, so please do take care to provide as high-quality
submissions as possible, and see (5)!
7) You can review code too!
You don't have to be an official reviewer in order to review a pull
request. If you see a pull request dealing with an area you are
familiar with, feel free to examine it and comment as needed. In
addition, *all* pull requests need to be reviewed for basic
(non-machine-verifiable) correctness, including noticing bad code, NPE
risks, and anti-patterns as well as "boring stuff" like spelling and
grammar and documentation.
8) On major refactorings
When doing major and/or long-term refactors, while rare, it is possible
that the above constraints become impractical, especially with regard to
grouping changes. In this case, you can use a work branch on a (GitHub)
fork of WildFly, applying the above rules in micro-scale to just that
branch. In this case you could possibly ask a reviewer to also review
some or all of the pull requests to that branch. Merge commits would
then be used to periodically synchronize with upstream.
In this way, when the long-term branch is ready to "come home" to the
main branch, the reviewers may have a good idea that the (potentially
quite numerous) changes in the work branch have been reviewed already.
[1]
https://www.kernel.org/doc/Documentation/SubmittingPatches
[2]
https://developer.jboss.org/wiki/HackingOnWildFly
--
- DML
_______________________________________________
wildfly-dev mailing list
wildfly-dev(a)lists.jboss.org
https://lists.jboss.org/mailman/listinfo/wildfly-dev
--
- DML