[JBoss JIRA] (JGRP-2030) GMS: view_ack_collection_timeout delay when last 2 members leave concurrently
by Dan Berindei (JIRA)
Dan Berindei created JGRP-2030:
----------------------------------
Summary: GMS: view_ack_collection_timeout delay when last 2 members leave concurrently
Key: JGRP-2030
URL: https://issues.jboss.org/browse/JGRP-2030
Project: JGroups
Issue Type: Bug
Affects Versions: 3.6.8
Reporter: Dan Berindei
Assignee: Bela Ban
When the coordinator ({{NodeE}}) leaves, it tries to install a new view on behalf of the new coordinator ({{NodeG}}, the last member).
{noformat}
21:33:26,844 TRACE (ViewHandler,InitialClusterSizeTest-NodeE-42422:) [GMS] InitialClusterSizeTest-NodeE-42422: mcasting view [InitialClusterSizeTest-NodeG-30521|3] (1) [InitialClusterSizeTest-NodeG-30521] (1 mbrs)
21:33:26,844 TRACE (ViewHandler,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] InitialClusterSizeTest-NodeE-42422: sending msg to null, src=InitialClusterSizeTest-NodeE-42422, headers are GMS: GmsHeader[VIEW], NAKACK2: [MSG, seqno=1], TP: [cluster_name=ISPN]
{noformat}
The message is actually sent later by the bundler, but {{NodeG}} is also sending its {{LEAVE_REQ}} message at the same time. Both nodes try to create a connection to each other, and only {{NodeG}} succeeds:
{noformat}
21:33:26,844 TRACE (ForkThread-2,InitialClusterSizeTest:) [TCP_NIO2] InitialClusterSizeTest-NodeG-30521: sending msg to InitialClusterSizeTest-NodeE-42422, src=InitialClusterSizeTest-NodeG-30521, headers are GMS: GmsHeader[LEAVE_REQ]: mbr=InitialClusterSizeTest-NodeG-30521, UNICAST3: DATA, seqno=1, conn_id=1, first, TP: [cluster_name=ISPN]
21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] InitialClusterSizeTest-NodeG-30521: sending 1 msgs (83 bytes (0.27% of max_bundle_size) to 1 dests(s): [ISPN:InitialClusterSizeTest-NodeE-42422]
21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] InitialClusterSizeTest-NodeE-42422: sending 1 msgs (91 bytes (0.29% of max_bundle_size) to 1 dests(s): [ISPN]
21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] dest=127.0.0.1:7900 (86 bytes)
21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] dest=127.0.0.1:7920 (94 bytes)
21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] 127.0.0.1:7900: connecting to 127.0.0.1:7920
21:33:26,865 TRACE (Timer-2,InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] 127.0.0.1:7920: connecting to 127.0.0.1:7900
21:33:26,866 TRACE (NioConnection.Reader [null],InitialClusterSizeTest-NodeG-30521:) [TCP_NIO2] 127.0.0.1:7920: rejected connection from 127.0.0.1:7900 (connection existed and my address won as it's higher)
21:33:26,867 TRACE (OOB-1,InitialClusterSizeTest-NodeE-42422:) [TCP_NIO2] InitialClusterSizeTest-NodeE-42422: received [dst: InitialClusterSizeTest-NodeE-42422, src: InitialClusterSizeTest-NodeG-30521 (3 headers), size=0 bytes, flags=OOB], headers are GMS: GmsHeader[LEAVE_REQ]: mbr=InitialClusterSizeTest-NodeG-30521, UNICAST3: DATA, seqno=1, conn_id=1, first, TP: [cluster_name=ISPN]
{noformat}
I'm guessing {{NodeE}} would need a {{STABLE}} round in order to retransmit the {{VIEW}} message, but I'm not sure whether the stable round would work, since it has already (partially?) installed the new view with {{NodeG}} as the only member. However, I think it should be possible for {{NodeE}} to remove {{NodeG}} from its {{AckCollector}} once it receives the {{LEAVE_REQ}}, and stop blocking.
This is a minor annoyance in a few of the Infinispan tests - most of them shut down their nodes serially, so they don't see this delay.
The question is whether the concurrent connection setup can affect other messages as well - e.g. during startup, when there aren't many messages being sent around to trigger retransmission. Could the node that failed to open its connection retry immediately on the connection opened by the other node?
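A minimal sketch of the proposed workaround (a toy class, not the real {{org.jgroups.util.AckCollector}} API; method names like {{memberLeft()}} are illustrative): a {{LEAVE_REQ}} from a member we are still waiting on is treated like an ack, so the coordinator's wait returns instead of blocking for the full {{view_ack_collection_timeout}}:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: a toy ack collector, not the real JGroups AckCollector.
// The idea: a LEAVE_REQ from a member still being waited on counts like an
// ack, so the coordinator stops blocking on that member.
class ToyAckCollector {
    private final Set<String> missing = new HashSet<>();

    synchronized void expectAcksFrom(Set<String> members) {
        missing.addAll(members);
    }

    synchronized void ack(String member) {
        missing.remove(member);
        notifyAll();
    }

    // Proposed addition: called when a LEAVE_REQ arrives from 'member'
    synchronized void memberLeft(String member) {
        missing.remove(member);
        notifyAll();
    }

    /** Returns true if all acks arrived (or the members left) before the timeout. */
    synchronized boolean waitForAllAcks(long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!missing.isEmpty()) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0)
                return false;
            wait(remaining);
        }
        return true;
    }
}
```

With this, {{NodeE}} would stop waiting for {{NodeG}}'s view ack as soon as the {{LEAVE_REQ}} is processed, even though the {{VIEW}} message itself was lost in the connection race.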
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
[JBoss JIRA] (WFCORE-1438) Elytron integration with logging subsystem
by Darran Lofthouse (JIRA)
Darran Lofthouse created WFCORE-1438:
----------------------------------------
Summary: Elytron integration with logging subsystem
Key: WFCORE-1438
URL: https://issues.jboss.org/browse/WFCORE-1438
Project: WildFly Core
Issue Type: Feature Request
Components: Logging
Reporter: Darran Lofthouse
Assignee: Darran Lofthouse
Fix For: 3.0.0.Alpha1
This is primarily an investigation task to start with: logging can use syslog, syslog access can be protected with TLS, and Elytron is providing unified SSL configuration - do we need to bring these together?
[JBoss JIRA] (DROOLS-1095) improve error message when rule name is duplicated in spreadsheet
by Michael Anstis (JIRA)
[ https://issues.jboss.org/browse/DROOLS-1095?page=com.atlassian.jira.plugi... ]
Michael Anstis moved GUVNOR-2468 to DROOLS-1095:
------------------------------------------------
Project: Drools (was: Guvnor)
Key: DROOLS-1095 (was: GUVNOR-2468)
Workflow: GIT Pull Request workflow (was: classic default workflow)
Affects Version/s: 6.4.0.CR1
(was: drools_6.3.0.Final)
Fix Version/s: (was: drools_6.4.0.Final)
> improve error message when rule name is duplicated in spreadsheet
> -----------------------------------------------------------------
>
> Key: DROOLS-1095
> URL: https://issues.jboss.org/browse/DROOLS-1095
> Project: Drools
> Issue Type: Enhancement
> Affects Versions: 6.4.0.CR1
> Environment: BxMS 6.2.x
> Reporter: Hiroko Miura
> Attachments: DuplicateRuleNameError.png
>
>
> The customer has many decision tables (XLS) with many rules in their project.
> When uploading new or modified decision tables, a rule name sometimes conflicts with an existing one by mistake, and then the incremental build fails with the following error:
> 15:05:29,461 ERROR [org.drools.compiler.kie.builder.impl.AbstractKieModule] (EJB default - 5) Unable to build KieBaseModel:defaultKieBase
> [5,0]: Duplicate rule name: HelloWorld_11
> [15,0]: Duplicate rule name: HelloWorld_12
> From the message shown in Business Central, the user can tell which XLS file contains the duplicate rule name, but it is hard to know which other file it conflicts with.
> The customer requested that this error message be improved by also reporting the name of the file that the conflicting rule name comes from.
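A sketch of one way to implement the request (hypothetical names, not actual Drools compiler code): remember which resource each rule name came from, so a duplicate can be reported together with the file that already declared it:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not actual Drools code: track which resource each rule
// name came from, so the "Duplicate rule name" error can name both files.
class RuleNameRegistry {
    private final Map<String, String> ruleNameToFile = new HashMap<>();

    /**
     * Returns null if the rule name is new, or an error message naming both
     * the current file and the file that already declared the rule.
     */
    String register(String ruleName, String fileName) {
        String existing = ruleNameToFile.putIfAbsent(ruleName, fileName);
        if (existing == null)
            return null;
        return "Duplicate rule name: " + ruleName + " in " + fileName
                + " (already declared in " + existing + ")";
    }
}
```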
[JBoss JIRA] (WFLY-6389) Infinispan subsystem XSD does not match InfinispanSubsystemXMLReader
by Harald Wellmann (JIRA)
Harald Wellmann created WFLY-6389:
-------------------------------------
Summary: Infinispan subsystem XSD does not match InfinispanSubsystemXMLReader
Key: WFLY-6389
URL: https://issues.jboss.org/browse/WFLY-6389
Project: WildFly
Issue Type: Bug
Components: Clustering
Affects Versions: 10.0.0.Final
Reporter: Harald Wellmann
Assignee: Paul Ferraro
{{jboss-as-infinispan_4_0.xsd}} uses the same type, {{thread-pool}}, for all thread pool definitions.
However, {{InfinispanSubsystemXMLReader}} uses a separate method, {{parseScheduledThreadPool()}}, for the {{EXPIRATION_THREAD_POOL}}, which accepts fewer attributes than {{parseThreadPool()}}.
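As an illustration of the mismatch (the element and attribute names below are assumptions based on typical thread-pool definitions, not verified against the actual schema): because the XSD uses one shared type, it would validate a configuration that the reader then rejects, e.g.

```xml
<!-- Illustrative only: all four attributes validate under the shared
     thread-pool type in the XSD, but parseScheduledThreadPool() would
     reject the non-scheduled attributes on the expiration pool. -->
<expiration-thread-pool min-threads="1" max-threads="1"
                        queue-length="100" keepalive-time="60000"/>
```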
[JBoss JIRA] (WFLY-6376) Class loader leak due to Infinispan ExpirationManagerImpl
by Harald Wellmann (JIRA)
[ https://issues.jboss.org/browse/WFLY-6376?page=com.atlassian.jira.plugin.... ]
Harald Wellmann commented on WFLY-6376:
---------------------------------------
I tried this patch in a local build, and it didn't solve my problem, unfortunately.
I think there's just a small glitch: Don't you mean
{code}
executor.setMaximumPoolSize(maxThreads);
{code}
rather than
{code}
executor.setCorePoolSize(maxThreads);
{code}
> Class loader leak due to Infinispan ExpirationManagerImpl
> ---------------------------------------------------------
>
> Key: WFLY-6376
> URL: https://issues.jboss.org/browse/WFLY-6376
> Project: WildFly
> Issue Type: Bug
> Components: Clustering, JPA / Hibernate
> Affects Versions: 10.0.0.Final
> Reporter: Harald Wellmann
> Assignee: Paul Ferraro
> Priority: Critical
>
> h3. Scenario
> Given a WAR containing a persistence unit with second level cache and query cache enabled, I'm consistently hitting a Metaspace OutOfMemoryError after redeploying the unchanged application a couple of times.
> Analyzing the situation with Eclipse Memory Analyzer, I found one cause to be WFLY-6348, but even after applying that fix locally I still have a class loader leak, which I can trace to a thread used by Infinispan that references an obsolete web app class loader as its context class loader.
> Just to rule out that this might be related to WFLY-6283 or WFLY-6285, I repeated my experiments with a local build of WildFly master (2f11a59aee0dbdd52b65c5c684eafa83c3f418da), with Hibernate locally upgraded to 5.0.9.
> I'm still getting a classloader leak with that build.
> h3. Analysis
> {{org.infinispan.expiration.impl.ExpirationManagerImpl}} uses a {{LazyInitializingScheduledExecutorService}}. Due to lazy initialization, the {{ExecutorService}} and the underlying thread pool are not created until my web app is deployed. Thus, when the {{ExecutorService}} is created, the context class loader is set to the web app class loader, and this appears to propagate to the threads of the executor thread pool.
> When the application is undeployed, {{ExpirationManagerImpl.stop()}} gets invoked to cancel any running expiration task. However, the {{ExecutorService}} is not shut down; the threads remain alive and still keep a reference to the now obsolete context class loader.
> h3. Remarks
> I'm not sure if this analysis is correct, but I hope there's a clue here for WildFly and Infinispan experts to identify the real cause.
> By the way, it would be helpful if all threads created by Infinispan had meaningful names, rather than default names like {{pool-5-thread-1}}.
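The capture mechanism described in the analysis can be demonstrated in isolation (a generic JDK sketch, not the Infinispan or WildFly code): a pool thread created while a deployment's class loader is installed as the context class loader inherits that loader and keeps it reachable until the executor is shut down:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Generic JDK demonstration, not Infinispan code: pool threads inherit the
// context class loader of the thread that created them, and keep it alive
// for as long as the pool thread itself is alive.
class TcclCaptureDemo {
    static boolean poolThreadCapturesLoader() throws Exception {
        ClassLoader original = Thread.currentThread().getContextClassLoader();
        ClassLoader webAppLoader = new URLClassLoader(new URL[0]); // stand-in for the web app CL
        try {
            Thread.currentThread().setContextClassLoader(webAppLoader);
            // "Lazy initialization": the pool thread is created here, on the
            // deploying thread, while the web app loader is the context CL.
            ExecutorService executor = Executors.newFixedThreadPool(1);
            Callable<ClassLoader> readTccl =
                () -> Thread.currentThread().getContextClassLoader();
            ClassLoader seen = executor.submit(readTccl).get();
            executor.shutdown(); // without this, the thread keeps the loader reachable
            return seen == webAppLoader;
        } finally {
            Thread.currentThread().setContextClassLoader(original);
        }
    }
}
```

This matches the leak pattern above: cancelling tasks in {{stop()}} does not help, because it is the pool thread itself (never shut down) that pins the old class loader.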