On Fri, Feb 23, 2018 at 12:02 PM, Alexey Loubyansky <alexey.loubyansky@redhat.com> wrote:
On Fri, Feb 23, 2018 at 12:11 AM, Brian Stansberry <brian.stansberry@redhat.com> wrote:
Having maven GAVs be an internal detail of the tool sounds fine, but we are going to need to produce and distribute the feature packs, and for that I figured we're talking maven. With a specialized plugin involved, sure, but for now and probably for quite a while, it's fundamentally maven.

By distributing you mean deploying them to the repo?

Sorry for the delay on this. I mean building them and making them available for use, in whatever ways we have to do that. Precisely how we intend to do that was something of a question mark for me, even before this discussion. But in a naive kind of way, if we were just talking about building maven artifacts and making them available via a maven repo, well, that's something we've done a ton of and it's well understood. But we (or at least I) need more clarity on how this will work, and this discussion has just made me more aware of that.

Within the WildFly build itself, AIUI then this "provisioning repo" is both an output, and an input. It's an input because the existing build and dist maven modules need to continue to exist, and those will need this provisioning repo in order for the pm tool to produce the build/dist artifacts.

I agree that this "provisioning repo" does not need to be internally structured as a maven repo. It just needs to be producible and consumable by a maven-based build that uses a plugin that uses the provisioning tool.

Let's clarify who will care about the actual GAVs. Will feature-packs need to be located by anything other than the provisioning tool? People taking a snapshot of the repo for offline use?

I don't think so, no.
 
Once feature-packs are in the repo, they become consumable by the tool (which is capable of discovering them by means of a resolver). The tool can also create feature-packs and install/deploy them into the repo. So it serves both the end users and the teams producing the feature-packs. The location in the repo will still be 100% predictable. It's just that the coordinates in the provisioning configs will not be the actual Maven GAVs. I'm wondering who would care about that. The end user will deal with the notions of family, branch, stream, etc. and will not need to set up the coordinates resolver. It will be provided by the stream they subscribe to.

BTW, conceptually the artifact resolver component will be there either way, just to be able to implement the notion of the universe and a stream of updates.

Alexey
 

One thing I didn't say before because I was focused on my question, is that the expression segments you outlined sound conceptually correct to me. Because they sound right is why I jumped to practical questions. I don't want to sidetrack this too much though with the details of how this relates to maven, at least not at the cost of people giving you feedback on the basic concept.


On Thu, Feb 22, 2018 at 4:41 PM, Alexey Loubyansky <alexey.loubyansky@redhat.com> wrote:
On Thu, Feb 22, 2018 at 10:24 PM, Brian Stansberry <brian.stansberry@redhat.com> wrote:
I'm describing my thinking process of understanding this in hopes that it's helpful to others. Or that I'm all wrong and you can correct me. ;)

AIUI you still want to use maven and GAVs for actually pulling items from the repo, but the additional stream info allows you to work out how to identify other related items. So I'm a bit unclear on the relationship of this coordinate to a GAV.

GAV has been used initially because of the Maven repo. As long as we use Maven, whatever coordinate expression we choose will have to translate to a GAV in the end. I imagine there will be an artifact (target repo coordinate) resolver that will take care of that.

I initially thought it's

universe:family:build-id

org.jboss:wildfly:12.0.5.Beta4

That would mean though that BUILD_ID is not just unique for the branch, it is unique for the family.  That sounds wrong, as you state it's unique to the branch.

So now I think it's

family:branch:build-id

wildfly:12:12.0.5.Beta4

To me that looks like a variation of a GAV which is both a coordinate and an ID. That could be ok. Actually, the examples above do contain a lot of info that seems sufficient to have a clue about what this is and where it belongs. My approach was based on what pieces of info I wanted to extract from those expressions, and that would include (taking into account the tooling and the user interface): universe, family, branch, release stream classifier, release id. This is what I will be extracting and dealing with whatever format we choose. So I might as well expose these directly and let project/product owners decide how those map into their preferred versioning, compatibility and update rules. I could provide a default GAV coordinate resolver based on how we are used to defining our GAVs and also let the user (project owner) provide a custom one.
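
To illustrate the default-plus-custom resolver idea, here is a minimal sketch in Python. It is not the tool's actual design. The names Gav, default_resolver, branch_as_artifact and resolve are hypothetical, and the default mapping (groupId = universe.family, artifactId = family-feature-pack) is purely an assumption for illustration. The custom resolver mirrors the family:branch:build-id variant discussed above.

```python
from typing import Callable, NamedTuple


class Gav(NamedTuple):
    group_id: str
    artifact_id: str
    version: str


def default_resolver(universe, family, branch, classifier, build_id) -> Gav:
    # ASSUMPTION: an invented default mapping, not the project's real GAV scheme.
    return Gav(f"{universe}.{family}", f"{family}-feature-pack", build_id)


def branch_as_artifact(universe, family, branch, classifier, build_id) -> Gav:
    # Custom mapping following the family:branch:build-id variant from the thread.
    return Gav(family, branch, build_id)


def resolve(coords: str, resolver: Callable[..., Gav] = default_resolver) -> Gav:
    """Translate universe:family:branch:classifier:build_id into a Maven GAV."""
    parts = coords.split(":")
    if len(parts) != 5:
        raise ValueError(f"expected 5 colon-separated parts, got {coords!r}")
    return resolver(*parts)
```

With this shape, the provisioning config only ever contains the stream-aware coordinates, and a project owner who prefers a different GAV layout passes their own function as the resolver argument.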
 
One concern with that is the 'A' in the GAV is no longer something rarely changing. In the WildFly case it would change every 3 months. This has some implications for the process of producing the feature packs.  I'm not saying that's a show-stopper problem; more that it's something that we'll have to be aware of as we think through the process of creating these.

One of the advantages of not using actual Maven GAVs directly is to make them an implementation detail. If one day we decide to redefine our GAV approach or support non-Maven repo for some reason, the end user of the tool will not have to know about that.


Thanks,
Alexey

Most readers can safely skip the rest of this as I'm probably getting ahead of myself....

An example of the kind of thing I'm talking about is in the current root pom for WildFly we have:

<project>
   ....
    <dependencyManagement>
        <dependencies>
            ....
            <dependency>
                <groupId>${project.groupId}</groupId>
                <artifactId>wildfly-feature-pack</artifactId>
                <type>pom</type>
                <version>${project.version}</version>
            </dependency>

Thereafter any other child poms that declare a dependency on that feature pack just have

<project>
   ....
    <dependencies>
        ....
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>wildfly-feature-pack</artifactId>
            <type>pom</type>
        </dependency>

There's no need to specify the version all over the place, as the dependencyManagement mechanism takes care of that in a central location.  But that kind of approach doesn't work as readily when it comes to artifactId.

One possibility is in the root pom there's

<project>
    ....
    <properties>
        <feature.pack.branch>12</feature.pack.branch>
    ....
    <dependencyManagement>
        <dependencies>
            ....
            <dependency>
                <groupId>${project.groupId}</groupId>
                <artifactId>${feature.pack.branch}</artifactId>
                <type>pom</type>
                <version>${project.version}</version>
            </dependency>

And then in other child poms:

<project>
   ....
    <dependencies>
        ....
        <dependency>
            <groupId>${project.groupId}</groupId>
            <artifactId>${feature.pack.branch}</artifactId>
            <type>pom</type>
        </dependency>

On Wed, Feb 21, 2018 at 4:40 PM, Alexey Loubyansky <alexey.loubyansky@redhat.com> wrote:
As many of you know, we are planning to move to the new feature-packs and the provisioning mechanism for our wildfly(-based) distributions. New feature-packs will be artifacts in a repository (currently Maven). In this email I'd like to raise a question about how to express the location (coordinates) of a feature-pack, its identity (id) and the stream information which is the source of version updates and patches.

Until this moment I've used the GAV (group, artifact, version) as both the feature-pack ID and its coordinates in the repository. This is pretty much enough for a static installation config (which is a list of feature-pack GAVs and config options). The GAV-based config also makes the installation build reproducible, which is a hard requirement for the provisioning mechanism.

On the other hand, we also want to be able to check the repository for updates to the installed feature-packs and apply them to an existing installation, which means that the installation also has to be described in terms of the consumed update streams. This will be a description of the installation in terms of sources of the latest available versions. A build from this kind of config is not guaranteed to be reproducible. This is where the GAVs don't fit as well.

What I would like to achieve is to combine the static and dynamic parts of the config into one. Here is what I'm considering. When I install a feature-pack (using a tool or adding it manually into the installation config) what ends up in the config is the following expression: universe:family:branch:classifier:build_id, e.g. org.jboss:wildfly:12:beta:12.0.5.Beta4. This expression is going to be the feature-pack coordinates.

The meaning behind the parts.

UNIVERSE

Universe is supposed to be a registry of feature-pack streams for various projects and products. In the example above the org.jboss universe would include wildfly-core, wildfly and related projects that are consumed by wildfly that also choose to provide feature-packs.

FAMILY

The family part would designate the project or product.

BRANCH

The branch would normally be a major version. The assumption is that anything that comes from the branch is API and config backward compatible.

CLASSIFIER

Branch + classifier is what identifies a stream. The idea is that there could be multiple streams originating from the same branch, e.g. a stream of final releases, a stream of betas, alphas, etc. A user could choose which stream to subscribe to by providing the classifier.

BUILD ID

In most cases that would be the release version. universe:family:branch:build_id is going to be the feature-pack identity. The classifier is not taken into account because the same feature-pack build/release might appear in more than one stream. And so the build_id must be unique for the branch.
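
The parts described above can be sketched as a small value type. This is a minimal illustration in Python, not the tool's implementation. The class and method names are hypothetical, and qualifying the stream with universe and family (rather than just branch + classifier) is an assumption made here so that a stream key is unique across projects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeaturePackCoords:
    universe: str
    family: str
    branch: str
    classifier: str
    build_id: str

    @classmethod
    def parse(cls, expr: str) -> "FeaturePackCoords":
        parts = expr.split(":")
        if len(parts) != 5 or not all(parts):
            raise ValueError(
                f"expected universe:family:branch:classifier:build_id, got {expr!r}"
            )
        return cls(*parts)

    def identity(self) -> str:
        # The classifier is left out: the same build may appear in several streams.
        return ":".join((self.universe, self.family, self.branch, self.build_id))

    def stream(self) -> str:
        # Branch + classifier identifies the stream; universe and family are
        # prepended here only as an assumption, to keep the key globally unique.
        return ":".join((self.universe, self.family, self.branch, self.classifier))
```

For example, org.jboss:wildfly:12:beta:12.0.5.Beta4 parses into the five parts, its identity is org.jboss:wildfly:12:12.0.5.Beta4, and the same build seen through a different classifier yields the same identity.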


Given the full feature-pack coordinates, the target feature-pack can be unmistakably identified and the installation can be reproduced. At the same time, the coordinates include the stream information, so a tool can check the stream for updates, apply them and update the installation config with the new feature-pack build_id.
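
A rough sketch of that update check, in Python and under loud assumptions: the stream is modeled as an in-memory dict mapping a stream key to a list of build ids in release order (oldest first), the function name check_for_update is hypothetical, and any build ids other than 12.0.5.Beta4 in the usage below are invented for illustration. How a real stream is stored and queried is not specified here.

```python
def check_for_update(coords: str, streams: dict) -> str:
    """Return the coordinates rewritten to the latest build_id in the stream.

    `streams` maps "universe:family:branch:classifier" to a list of build ids,
    oldest first (ASSUMPTION: the stream itself defines release order, so no
    version-string comparison is attempted).
    """
    universe, family, branch, classifier, build_id = coords.split(":")
    stream_key = ":".join((universe, family, branch, classifier))
    builds = streams.get(stream_key, [])
    if build_id not in builds:
        # Unknown stream or build: keep the config pinned as it is.
        return coords
    # Only the build_id part of the config changes; the stream part stays.
    return ":".join((stream_key, builds[-1]))
```

Usage, with hypothetical stream contents:

```python
streams = {"org.jboss:wildfly:12:beta": ["12.0.4.Beta3", "12.0.5.Beta4", "12.0.6.Beta5"]}
updated = check_for_update("org.jboss:wildfly:12:beta:12.0.5.Beta4", streams)
```

The config stays reproducible because it always records one exact build_id, while the stream portion tells the tool where to look for the next one.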

If you see any problem with this approach or have a better idea, please share. Thanks!

Alexey

_______________________________________________
wildfly-dev mailing list
wildfly-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/wildfly-dev



--
Brian Stansberry
Manager, Senior Principal Software Engineer
Red Hat



