Let's keep in mind that the current way CI works is the only way we can
enable starting nodes on demand. Any solution involving the "heavy job"
plugin, as we used to, would force us to keep a static number of nodes and
leave them up even when unused for extended periods of time, leading to
either long waiting queues or higher costs. That's because the "heavy job"
plugin does not work well with the AWS plugin.
So it might have drawbacks, but it's not like other solutions are great
either.
I assume your problem was that all the instances were busy when you started
your release, so you had to wait for a build to finish before your
release could start. As a temporary workaround, if it happens again before we
address the problem, you can simply start more instances manually by
going to
going to
http://ci.hibernate.org/computer/ and using the "Provision via
AWS" dropdown. This should work even if the instance cap has been reached,
and if your release job is already pending it will be the first to be
executed using this new instance.
Regarding your suggestions, I also think fixing the GitHub plugin would be
the way to go, since it would benefit us beyond releases. We could first
try sending a pull request and asking for a release. If they don't react,
we'll have to ask ourselves whether we want this fix badly enough to
maintain a fork...
This does not exclude your other suggestion, though. I've had a look, and
it seems that it's not easy for several reasons:
1. The AWS plugin does not work well when you use the same VM image
("AMI") in multiple "slave templates" [1], so we would have to create
multiple AMIs with different IDs but the same content. Annoying, but
manageable, I suppose.
2. The AWS plugin always picks a single "slave template", the first one
matching the required labels, when it selects a slave template
for a given job [2]. Thus, if we create an additional "instance
configuration" for release jobs, we will have to put it in first position
and the plugin will always execute release jobs on that configuration. We
will never be able to opportunistically use existing, idle instances of the
"default" slave template for release jobs.
In short, we'll probably have to spin up a node every time we do a
release.
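To make problem 2 concrete, here is a minimal sketch of first-match
template selection as described in [2]. This is illustrative Java with
made-up names (Template, pick), not the plugin's actual classes:

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical model of the plugin's template selection: it returns the
// FIRST template whose labels match, ignoring whether a later template
// already has idle instances that could take the job.
public class FirstMatchSelection {
    record Template(String name, Set<String> labels) {}

    static Optional<Template> pick(List<Template> templates, String requiredLabel) {
        return templates.stream()
                .filter(t -> t.labels().contains(requiredLabel))
                .findFirst();
    }

    public static void main(String[] args) {
        // Release template in first position, as we would be forced to do.
        var release = new Template("release", Set.of("linux", "release"));
        var defaults = new Template("default", Set.of("linux", "release"));

        // Release jobs always land on the "release" template, even if idle
        // "default" instances exist that also carry the required label.
        System.out.println(pick(List.of(release, defaults), "release").get().name());
        // prints: release
    }
}
```

With the release configuration pinned to the first position, release jobs
can never fall through to an idle instance of the default template.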
I'd say both problems qualify as bugs. I could have a look and submit a
pull request, but I guess the 3 minutes it takes to spin up a node won't
satisfy you, even if it's not for every release, so I'm not sure it's worth
the effort?
We can apply some workarounds in the meantime. In particular, we could
throttle the jobs we know should very rarely be triggered. The setting is
located in "Job Notifications > Throttle builds". Setting this to 1 build
per day should mitigate the problem, and it's not as bad as disabling the
job entirely.
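For the record, a "1 build per day" throttle amounts to the following
decision. This is a rough sketch of the idea, not the plugin's actual
code; class and method names are made up:

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: a new build is allowed only if the previous one
// started at least 24 hours ago.
public class DailyThrottle {
    static boolean allowBuild(Instant lastBuild, Instant now) {
        return Duration.between(lastBuild, now).compareTo(Duration.ofDays(1)) >= 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2018-05-01T12:00:00Z");
        // A build 2 hours ago blocks the trigger...
        System.out.println(allowBuild(Instant.parse("2018-05-01T10:00:00Z"), now)); // false
        // ...while a build 26 hours ago lets it through.
        System.out.println(allowBuild(Instant.parse("2018-04-30T10:00:00Z"), now)); // true
    }
}
```

So a spurious trigger can still waste at most one build per day per job,
rather than tying up instances repeatedly.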
[1] The plugin assumes that AMI IDs are unique identifiers of "Slave
templates" in various places, such as
https://github.com/jenkinsci/ec2-plugin/blob/affb7f407cd024accdf4e9093b07...
[2]
https://github.com/jenkinsci/ec2-plugin/blob/affb7f407cd024accdf4e9093b07...
On Mon, 30 Apr 2018 at 20:54 Davide D'Alto <daltodavide(a)gmail.com> wrote:
Using Docker might be a nice idea if the machines are powerful
enough.
I will just mention it here, but for the release only we could also skip
Jenkins and run the commands we need from the terminal of
ci.hibernate.org. We already have the scripts ready, so it shouldn't be
too hard.
If the Jenkins plugin doesn't work the way we need, I don't feel like
maintaining our own branch; I will consider it only if it's about
sending a PR somewhere.
But all this won't solve the problem with SourceForge, which seems to be
the main reason we see failures lately.
On Mon, Apr 30, 2018 at 7:42 PM, Guillaume Smet
<guillaume.smet(a)gmail.com> wrote:
> On Mon, Apr 30, 2018 at 8:34 PM, Sanne Grinovero <sanne(a)hibernate.org>
> wrote:
>
>> Starting a new slave only takes 3 minutes, but I believe it has to be
>> a "manual start" from its admin dashboard as Jenkins's scaling plugin
>> is limited.
>>
>> Fixing the Jenkins triggers would be my preference.
>>
>
> Yeah, from the last time we discussed this on HipChat, I have all the
> useful pointers. The code changes to the official plugin would be minimal. The
> only thing I'm worried about is how we would maintain this plugin.
>
> Alternatively:
>> - we could look at pipelines
>>
>
> How would they solve the issue?
>
>
>> - run all jobs within Docker -> improved isolation would allow us to
>> run N builds per machine
>>
>
> Would that help? I.e. are the machines we start powerful enough to run
> several jobs in parallel?
>
> I suspect it wouldn't solve the issue, just change the number of servers
> we would need (which might be good anyway, but is not related to the issue
> at hand).
>
> --
> Guillaume
> _______________________________________________
> hibernate-dev mailing list
> hibernate-dev(a)lists.jboss.org
>
https://lists.jboss.org/mailman/listinfo/hibernate-dev
--
Yoann Rodiere
yoann(a)hibernate.org / yrodiere(a)redhat.com
Software Engineer
Hibernate NoORM team