[hibernate-dev] ci.hibernate.org : announcing distributed cache for maven artifacts

Yoann Rodiere yoann at hibernate.org
Wed Jan 17 03:34:05 EST 2018


Thanks for trying :) We might try to improve the Maven plugins in question,
but as you said we already spent quite some time on infrastructure...

That being said, if we are ever to follow up on this caching idea, Gunnar's
idea of a local Nexus got me thinking... Was your change only about
performance, or is it also about the bill? From what I understood we pay
for outbound network traffic, so those recurring downloads might be a
problem...
To reduce the bill, what if we simply added an AWS node running a Nexus
repository or an HTTP cache acting as a proxy to the various Maven
repositories we use?
If we pay for outbound network traffic but not for network traffic between
our own nodes, that would be a solution. Not sure if we would gain much in
performance, since network speed would probably be similar, but that might
reduce the bill (depending on the cost of this additional node, of course).

On Tue, 16 Jan 2018 at 23:24 Sanne Grinovero <sanne at hibernate.org> wrote:

> Version F27v17 of the slaves is running now, with NFS drive removed.
>
> sorry for the experiment :)
>
> Thanks
> Sanne
>
> On 16 January 2018 at 21:51, Sanne Grinovero <sanne at hibernate.org> wrote:
> > On 16 January 2018 at 21:33, Steve Ebersole <steve at hibernate.org> wrote:
> >> well Gradle is used in CI environments all over the place, so it must
> work.
> >> But I think we need some different configurations in the Gradle command
> >> used.  For example, it is highly suggested that the Gradle daemon be
> >> disabled in CI but I'm not sure all of our jobs actually do that.  I'll
> look
> >> into that...
> >
> > I wouldn't mind having the Gradle daemon always on, if it helps we
> > could even pre-load it with some tuned configuration.
> > The only drawback I see is that it must stay easy to upgrade the Gradle
> > version, in case one needs to, without having to go through server
> > configuration scripts.
> >
> > We need strict isolation about writes in the cache though; for now
> > I'll disable it, not least for the concerns that Yoann and Gunnar
> > pointed out, then we can experiment with cool ideas more carefully.
> >
> > Funny, one would expect to know by now about the perils of a
> > distributed cache :)
> >
> >
> >>
> >> On Tue, Jan 16, 2018 at 3:30 PM Sanne Grinovero <sanne at hibernate.org>
> wrote:
> >>>
> >>> Yes I did it for Gradle too, sorry. The `/efs-maven-artifacts` is the
> >>> guilty mount point.
> >>>
> >>> I don't know any quick solutions for the various concerns you all
> >>> raised, so I'll roll this back tonight.
> >>>
> >>> It's good to know that it's not too hard to have a shared FS between
> >>> these machines; needs better planning though.
> >>>
> >>> Thanks,
> >>> Sanne
> >>>
> >>> On 16 January 2018 at 19:41, Steve Ebersole <steve at hibernate.org>
> wrote:
> >>> > Did you happen to do the same for Gradle caches?
> >>> >
> >>> > Some jobs are failing:
> >>> >
> >>> >
> >>> > * What went wrong:
> >>> > Could not resolve all dependencies for configuration
> >>> > ':buildSrc:runtimeClasspath'.
> >>> >> Timeout waiting to lock artifact cache
> >>> >> (/efs-maven-artifacts/.gradle/caches/modules-2). It is currently in
> use
> >>> >> by
> >>> >> another Gradle instance.
> >>> >   Owner PID: 1423
> >>> >   Our PID: 10249
> >>> >   Owner Operation: resolve configuration ':classpath'
> >>> >   Our operation:
> >>> >   Lock file:
> >>> > /efs-maven-artifacts/.gradle/caches/modules-2/modules-2.lock
> >>> >
> >>> >
> >>> >
> >>> > On Mon, Jan 15, 2018 at 5:06 AM Yoann Rodiere <yoann at hibernate.org>
> >>> > wrote:
> >>> >>
> >>> >> > We should reconfigure those to not "install" - that's actually a
> bad
> >>> >> > habit, legacy from Maven 2 times - people nowadays recommend using
> >>> >> > "mvn clean verify", especially on CI environments.
> >>> >>
> >>> >> I could not agree more, that would be cleaner, but that's not
> possible.
> >>> >> And
> >>> >> believe me, I tried hard. Last time I checked, some of the plugins
> we
> >>> >> use
> >>> >> with dynamic dependency resolution would ignore the artifacts being
> >>> >> built,
> >>> >> and would always fetch the artifacts from the Maven repos (for
> >>> >> SNAPSHOTs,
> >>> >> they would end up using nightlies).
> >>> >> I'm not talking about when we use standard maven markup to declare
> >>> >> dependencies, but when the plugin itself has to fetch dependencies
> >>> >> "dynamically", which happens when we setup a WildFly server with our
> >>> >> own
> >>> >> modules in particular. See maven-dependency-plugin's "artifactItems"
> >>> >> configuration.
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Mon, 15 Jan 2018 at 11:29 Sanne Grinovero <sanne at hibernate.org>
> >>> >> wrote:
> >>> >>
> >>> >> > On 15 January 2018 at 08:42, Yoann Rodiere <yoann at hibernate.org>
> >>> >> > wrote:
> >>> >> > > Thanks Sanne !
> >>> >> > >
> >>> >> > > I have one question...
> >>> >> > >
> >>> >> > >> Please never rely on this as "storage": it's just meant as
> cache
> >>> >> > >> and
> >>> >> > >> we reserve the right to wipe it all out at any time.
> >>> >> > >
> >>> >> > > I gather you say that so that we don't try to "release"
> artifacts
> >>> >> > > into
> >>> >> > this
> >>> >> > > cache? But temporary storage for the duration of one build will
> >>> >> > > still
> >>> >> > > be
> >>> >> > > safe?
> >>> >> > >
> >>> >> > > Because our builds obviously rely on the local repository for
> >>> >> > > short-term
> >>> >> > > storage (for the duration of the build). For example the
> >>> >> > > dependencies
> >>> >> > > are
> >>> >> > > only checked and downloaded if necessary at the beginning of the
> >>> >> > > build,
> >>> >> > and
> >>> >> > > then are expected to exist in the local repository until the
> build
> >>> >> > > stops.
> >>> >> > > Another example: our WildFly modules are first built and
> installed
> >>> >> > > in
> >>> >> > > the
> >>> >> > > "modules" subproject, and later "fetched" from the local
> repository
> >>> >> > > in
> >>> >> > the
> >>> >> > > "integrationtest/wildfly" subproject.
> >>> >> > >
> >>> >> > > If we were to clear the cache during a build, things would
> probably
> >>> >> > > go
> >>> >> > > wrong. Worse, if two parallel builds were to install the same
> >>> >> > > artifacts
> >>> >> > > (e.g. hibernate-search-engine version 5.9.0-SNAPSHOT), we would
> run
> >>> >> > > the
> >>> >> > risk
> >>> >> > > of testing the wrong "version" of this artifact in one of the
> >>> >> > > builds...
> >>> >> >
> >>> >> > SNAPSHOTs being installed are indeed a problem, e.g. the PR testing
> >>> >> > jobs
> >>> >> > could conflict with the regular master jobs.
> >>> >> > We should reconfigure those to not "install" - that's actually a
> bad
> >>> >> > habit, legacy from Maven 2 times - people nowadays recommend using
> >>> >> > "mvn clean verify", especially on CI environments.
> >>> >> >
> >>> >> > I agree about the perils of clearing the cache during in-progress
> >>> >> > builds
> >>> >> > too.
> >>> >> >
> >>> >> > I just meant to warn that we don't have any backup plan in place,
> and
> >>> >> > I do plan to just wipe the whole thing occasionally:
> >>> >> >  - when we have any direct need, e.g. corrupted downloads
> >>> >> >  - when it gets too large
> >>> >> >  - if it gets too expensive
> >>> >> >  - regularly, just to "practice" that everything works with an
> empty
> >>> >> > cache
> >>> >> >
> >>> >> > Also our "disaster recovery" plan to rebuild all infrastructure
> will
> >>> >> > always assume it's ok to reboot with having this file system
> empty.
> >>> >> >
> >>> >> > Thanks,
> >>> >> > Sanne
> >>> >> >
> >>> >> > >
> >>> >> > >
> >>> >> > > On Sun, 14 Jan 2018 at 01:18 Sanne Grinovero <
> sanne at hibernate.org>
> >>> >> > wrote:
> >>> >> > >>
> >>> >> > >> Hi all,
> >>> >> > >>
> >>> >> > >> while the new build machines are fast, some of you pointed out
> >>> >> > >> we're
> >>> >> > >> now spending a relatively high amount of time downloading Maven
> >>> >> > >> dependencies, this problem being compounded by the fact we
> "nuke"
> >>> >> > >> idle
> >>> >> > >> slaves shortly after they become idle.
> >>> >> > >>
> >>> >> > >> I just spent the day testing a distributed file system, and
> it's
> >>> >> > >> now
> >>> >> > >> running in "production".
> >>> >> > >> It's used exclusively to store the Gradle and Maven caches.
> This
> >>> >> > >> is
> >>> >> > >> stateful and independent from the lifecycle of individual slave
> >>> >> > >> nodes.
> >>> >> > >>
> >>> >> > >> Unfortunately this solution is not viable for Docker images, so
> >>> >> > >> while
> >>> >> > >> I experimented with the idea I backed off from moving the
> docker
> >>> >> > >> storage graph to a similar device. Please don't waste time
> trying
> >>> >> > >> that
> >>> >> > >> w/o carefully reading the Docker documentation or talking with
> me
> >>> >> > >> :)
> >>> >> > >> Also, beyond correctness of storage semantics, it's likely far
> >>> >> > >> less
> >>> >> > >> efficient for Docker.
> >>> >> > >>
> >>> >> > >> To learn more about our new cache:
> >>> >> > >>  -
> >>> >> > >>
> >>> >> >
> >>> >> >
> >>> >> >
> https://github.com/hibernate/ci.hibernate.org/commit/dc6e0a4bd09fb3ae6347081243b4fb796a219f90
> >>> >> > >>  - https://docs.aws.amazon.com/efs/latest/ug/how-it-works.html
> >>> >> > >>
> >>> >> > >> I'd add that - because of other IO tuning in place - writes
> might
> >>> >> > >> appear out of order to other nodes, and conflicts are not
> handled.
> >>> >> > >> Shouldn't be a problem since snapshots now have timestamps, but
> >>> >> > >> this
> >>> >> > >> might be something to keep in mind.
> >>> >> > >>
> >>> >> > >> N.B.
> >>> >> > >> Please never rely on this as "storage": it's just meant as
> cache
> >>> >> > >> and
> >>> >> > >> we reserve the right to wipe it all out at any time.
> >>> >> > >>
> >>> >> > >> Thanks,
> >>> >> > >> Sanne
> >>> >> > >> _______________________________________________
> >>> >> > >> hibernate-dev mailing list
> >>> >> > >> hibernate-dev at lists.jboss.org
> >>> >> > >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
> >>> >> > >
> >>> >> > >
> >>> >> > >
> >>> >> > > --
> >>> >> > > Yoann Rodiere
> >>> >> > > yoann at hibernate.org / yrodiere at redhat.com
> >>> >> > > Software Engineer
> >>> >> > > Hibernate NoORM team
> >>> >> >
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Yoann Rodiere
> >>> >> yoann at hibernate.org / yrodiere at redhat.com
> >>> >> Software Engineer
> >>> >> Hibernate NoORM team
> >>> >> _______________________________________________
> >>> >> hibernate-dev mailing list
> >>> >> hibernate-dev at lists.jboss.org
> >>> >> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>


-- 
Yoann Rodiere
yoann at hibernate.org / yrodiere at redhat.com
Software Engineer
Hibernate NoORM team

