[JBoss JIRA] (JBJCA-1341) Account for additional DB2 FATAL connection errors
by Stefano Maestri (JIRA)
[ https://issues.jboss.org/browse/JBJCA-1341?page=com.atlassian.jira.plugin... ]
Stefano Maestri reassigned JBJCA-1341:
--------------------------------------
Assignee: Ingo Weiss
> Account for additional DB2 FATAL connection errors
> --------------------------------------------------
>
> Key: JBJCA-1341
> URL: https://issues.jboss.org/browse/JBJCA-1341
> Project: IronJacamar
> Issue Type: Enhancement
> Components: Validator
> Reporter: Ingo Weiss
> Assignee: Ingo Weiss
> Original Estimate: 2 days
> Time Spent: 2 days
> Remaining Estimate: 0 minutes
>
> Various versions of pre-11.x DB2 drivers use the -99999 error code for a SQLException. Not all -99999 errors are fatal. For those variations that are known to be fatal, a check should be added to treat them as such.
> One example is the -99999 error that indicates "Connection is closed".
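
The check requested above could be sketched roughly as follows. This is only an illustration, not the IronJacamar validator code: the class name `Db2FatalErrorCheck`, the method `isFatal`, and the idea of matching on a message fragment are assumptions. Only the -99999 code and the "Connection is closed" text come from the issue description.

```java
import java.sql.SQLException;
import java.util.Locale;

/** Hypothetical helper for illustration; not part of IronJacamar. */
public final class Db2FatalErrorCheck {

    /** Generic error code that the issue attributes to pre-11.x DB2 drivers. */
    public static final int GENERIC_DB2_ERROR = -99999;

    // Assumption: a -99999 error is classified as fatal by its message text.
    // Only "Connection is closed" is taken from the issue; more fragments
    // would be added as further fatal variations become known.
    private static final String[] FATAL_FRAGMENTS = { "connection is closed" };

    private Db2FatalErrorCheck() {
    }

    /** Returns true only for -99999 errors whose message is known to be fatal. */
    public static boolean isFatal(SQLException e) {
        if (e.getErrorCode() != GENERIC_DB2_ERROR || e.getMessage() == null) {
            return false;
        }
        String message = e.getMessage().toLowerCase(Locale.ROOT);
        for (String fragment : FATAL_FRAGMENTS) {
            if (message.contains(fragment)) {
                return true;
            }
        }
        return false;
    }
}
```

Matching on the message keeps non-fatal -99999 errors from evicting healthy connections, at the cost of depending on driver wording that may differ between versions.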
--
This message was sent by Atlassian JIRA
(v7.2.3#72005)
9 years, 1 month
[JBoss JIRA] (WFLY-7810) Artemis hangs during failback in remote JCA scenario
by Jeff Mesnil (JIRA)
[ https://issues.jboss.org/browse/WFLY-7810?page=com.atlassian.jira.plugin.... ]
Jeff Mesnil reopened WFLY-7810:
-------------------------------
Reopening this issue as the fix to WFLY-8246 reintroduced the call to `remotingService.isPaused()` that was causing the 10s wait for each connection
> Artemis hangs during failback in remote JCA scenario
> ----------------------------------------------------
>
> Key: WFLY-7810
> URL: https://issues.jboss.org/browse/WFLY-7810
> Project: WildFly
> Issue Type: Bug
> Components: JMS
> Reporter: Jeff Mesnil
> Assignee: Jeff Mesnil
> Priority: Critical
> Fix For: 11.0.0.Alpha1
>
>
> Remote JCA scenario:
> * There are 3 nodes
> * Node 1 and node 2 are Live-Backup pair (replicated HA)
> * Node 3 has MDB which remotely connects to node 1 and is able to do failover on node 2
> * During the test, node 1 is killed and started again
> The problem occurs when node 1 is started again. The servers are configured to do failback. When node 1 tries to become live again, something goes wrong with the connection between node 1 and node 2. Node 1 repeatedly logs WARN message \[1\], and node 2 repeatedly logs WARN message \[2\].
> I can see the same issue with 7.0.x as well. We hadn't noticed this error because the test didn't check the state of the servers after the failback.
> When I modify the test to not deploy MDB on node 3, the test passes without any unusual error. It seems the issue is related to this scenario.
> \[1\]
> {code}
> 09:59:09,197 WARN [org.apache.activemq.artemis.core.server] (Thread-0 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$2@26357508-1826618556)) AMQ222137: Unable to announce backup, retrying: ActiveMQConnectionTimedOutException[errorType=CONNECTION_TIMEDOUT message=AMQ119012: Timed out waiting to receive initial broadcast from cluster]
> at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:747) [artemis-core-client-1.5.0.redhat-1.jar:1.5.0.redhat-1]
> at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:625) [artemis-core-client-1.5.0.redhat-1.jar:1.5.0.redhat-1]
> at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:607) [artemis-core-client-1.5.0.redhat-1.jar:1.5.0.redhat-1]
> at org.apache.activemq.artemis.core.server.cluster.BackupManager$BackupConnector$1.run(BackupManager.java:246) [artemis-server-1.5.0.redhat-1.jar:1.5.0.redhat-1]
> at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:101) [artemis-commons-1.5.0.redhat-1.jar:1.5.0.redhat-1]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_111]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_111]
> at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_111]
> {code}
> \[2\]
> {code}
> 10:00:19,245 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:00:29,245 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:00:39,245 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:00:49,246 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:00:59,247 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:01:09,247 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:01:19,248 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:01:29,248 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:01:39,249 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:01:49,249 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:01:59,250 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> 10:02:09,250 WARN [org.apache.activemq.artemis.core.client] (Thread-135) AMQ212042: Timed out waiting for packet to be flushed
> {code}
9 years, 1 month
[JBoss JIRA] (JGRP-2162) Failed to send broadcast when opening the connection
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-2162?page=com.atlassian.jira.plugin.... ]
Bela Ban updated JGRP-2162:
---------------------------
Fix Version/s: 4.0.2
> Failed to send broadcast when opening the connection
> ----------------------------------------------------
>
> Key: JGRP-2162
> URL: https://issues.jboss.org/browse/JGRP-2162
> Project: JGroups
> Issue Type: Bug
> Reporter: Radim Vansa
> Assignee: Bela Ban
> Fix For: 4.0.2
>
> Attachments: TcpNio2McastTest.java
>
>
> IRC discussion:
> {quote}
> <rvansa> bela_: Hi Bela, I have a weird failure in one test that seem to be rooted in JGroups. TCP_NIO2 is in charge, and there's a broadcast message to all nodes, but it seems it's not received on the other side.
> <bela_> rvansa: reproducible?
> <rvansa> bela_: it happens when the connection to a node is just being opened: I have added some trace logs and just a moment before writing to the NioConnection.send_buf it was in state "connection pending"
> <rvansa> bela_: sort of, after tens of runs of that test (on my machine) - and I've seen it first time in CI, so it could be
> <bela_> rvansa: NioConnection buffers writes up to a certain extent, then discards anything over the buffer limit
> <bela_> rvansa: max_send_buffers (default: 10). But retransmission should fix this, unless you don’t wait long enough
> <rvansa> bela_: I don't think it should go over the limit
> <rvansa> bela_: the test is not doing anything else, just sending CommitCommand (that should be couple hundred bytes at most) and then waiting
> <rvansa> bela_: according to the traces I've added, Buffers.write returned false when writing the local address, and then true when writing the actual message
> {quote}
> I have been trying to write a reproducer, and found that the problem is related to the fact that the failing test uses a custom (fake) discovery protocol that doesn't open the connection during startup. In my ~reproducer I had to modify tcp-nio.xml to use TCPPING with only the first node in the hosts list (localhost[7800]):
> {code:xml}
> <TCPPING async_discovery="true" initial_hosts="${jgroups.tcpping.initial_hosts:localhost[7800]}" port_range="0"/>
> {code}
> As a result, the physical connection is not opened by discovery. However, the reproducer suffers from an (always reproducible) flaw: it does not send the message to the third node at all (and the test therefore fails).
> Note that increasing the timeout in request options does not help.
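
The buffering behaviour described in the IRC excerpt (writes are queued up to a limit while the connection is pending, and anything over the limit is discarded and left to retransmission) can be modelled with a small sketch. This is not JGroups code: the class `BoundedSendBuffer` and its methods are hypothetical, and only the idea of a limit such as `max_send_buffers` (default: 10) comes from the discussion.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of a connection that queues writes while it is not yet connected. */
public final class BoundedSendBuffer {

    private final Deque<ByteBuffer> pending = new ArrayDeque<>();
    private final int maxBuffers; // plays the role of max_send_buffers in the discussion
    private int dropped;

    public BoundedSendBuffer(int maxBuffers) {
        this.maxBuffers = maxBuffers;
    }

    /** Queues a message; returns false if it was discarded because the limit was reached. */
    public boolean offer(byte[] message) {
        if (pending.size() >= maxBuffers) {
            dropped++; // the sender relies on retransmission to recover this message
            return false;
        }
        pending.addLast(ByteBuffer.wrap(message));
        return true;
    }

    /** Number of messages waiting to be written once the connection is established. */
    public int pendingCount() {
        return pending.size();
    }

    /** Number of messages discarded so far. */
    public int droppedCount() {
        return dropped;
    }
}
```

In this model a single small message sent while the connection is pending stays far below the limit, which matches the reporter's point that the lost broadcast is unlikely to be explained by the buffer overflowing.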
9 years, 1 month
[JBoss JIRA] (DROOLS-1472) NPE in stateful session
by Mario Fusco (JIRA)
[ https://issues.jboss.org/browse/DROOLS-1472?page=com.atlassian.jira.plugi... ]
Mario Fusco commented on DROOLS-1472:
-------------------------------------
Those {code}stagedLeftTuples{code} are supposed to be used only at the border of 2 segments of the phreak network, and then only if the node is the tip of the segment. In all other cases that object shouldn't be dereferenced, and when this happens we want to fail fast with an NPE instead of "hiding the dust under the carpet" with a null check and later propagating an error that would be much harder to track down. Now the question is: why is Drools trying to use that object in your case even when it shouldn't? This is clearly a bug, but as I said, it's nearly impossible for me to investigate without a way to reproduce it.
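
The trade-off between a null check and failing fast can be illustrated with a generic sketch. It is not Drools code: the class `FailFastSketch` and both methods are hypothetical and stand in for any place where a reference is expected to be non-null by construction.

```java
import java.util.List;
import java.util.Objects;

/** Hypothetical illustration of null-tolerant versus fail-fast handling. */
public final class FailFastSketch {

    private FailFastSketch() {
    }

    /** Null-tolerant variant: hides the invalid state and lets execution continue. */
    public static int sizeOrZero(List<String> staged) {
        return staged == null ? 0 : staged.size();
    }

    /** Fail-fast variant: surfaces the invalid state at the point where it is first observed. */
    public static int sizeOrFail(List<String> staged) {
        return Objects.requireNonNull(staged, "staged tuples must be set here").size();
    }
}
```

With the first variant the caller keeps running on wrong data and any failure appears later, far from its cause. With the second, the stack trace points at the first place the broken invariant was touched.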
> NPE in stateful session
> ------------------------
>
> Key: DROOLS-1472
> URL: https://issues.jboss.org/browse/DROOLS-1472
> Project: Drools
> Issue Type: Bug
> Components: core engine
> Affects Versions: 6.5.0.Final
> Environment: RedHat 6.2, Java 8.102
> Reporter: Michael Neifeld
> Assignee: Mario Fusco
> Priority: Critical
>
> Found an NPE in a log that probably leads to the session being destroyed.
> CEP works in a multithreaded environment and there are almost always 16 drools-workers threads.
> 2017-03-05 16:30:58 com.mot.ssol.cep.workflow.CEPSession [ERROR] Session execution error occurred
> java.lang.NullPointerException: null
> at org.drools.core.phreak.RuleNetworkEvaluator.deleteChildLeftTuple(RuleNetworkEvaluator.java:729) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.RuleNetworkEvaluator.unlinkAndDeleteChildLeftTuple(RuleNetworkEvaluator.java:721) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.PhreakNotNode.doRightUpdates(PhreakNotNode.java:343) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.PhreakNotNode.doNode(PhreakNotNode.java:74) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.RuleNetworkEvaluator.switchOnDoBetaNode(RuleNetworkEvaluator.java:524) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.RuleNetworkEvaluator.evalBetaNode(RuleNetworkEvaluator.java:505) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.RuleNetworkEvaluator.evalNode(RuleNetworkEvaluator.java:341) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.RuleNetworkEvaluator.innerEval(RuleNetworkEvaluator.java:301) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.RuleNetworkEvaluator.outerEval(RuleNetworkEvaluator.java:136) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.AddRemoveRule.forceFlushLeftTuple(AddRemoveRule.java:686) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.AddRemoveRule.flushLeftTupleIfNecessary(AddRemoveRule.java:629) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.LeftInputAdapterNode.doInsertSegmentMemory(LeftInputAdapterNode.java:225) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.LeftInputAdapterNode.doInsertObject(LeftInputAdapterNode.java:210) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.LeftInputAdapterNode.assertObject(LeftInputAdapterNode.java:169) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.SingleObjectSinkAdapter.propagateAssertObject(SingleObjectSinkAdapter.java:63) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.AlphaNode.assertObject(AlphaNode.java:134) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.CompositeObjectSinkAdapter.doPropagateAssertObject(CompositeObjectSinkAdapter.java:494) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.CompositeObjectSinkAdapter.propagateAssertObject(CompositeObjectSinkAdapter.java:384) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.AlphaNode.assertObject(AlphaNode.java:134) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.CompositeObjectSinkAdapter.doPropagateAssertObject(CompositeObjectSinkAdapter.java:494) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.CompositeObjectSinkAdapter.propagateAssertObject(CompositeObjectSinkAdapter.java:384) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.AlphaNode.assertObject(AlphaNode.java:134) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.CompositeObjectSinkAdapter.doPropagateAssertObject(CompositeObjectSinkAdapter.java:494) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.CompositeObjectSinkAdapter.propagateAssertObject(CompositeObjectSinkAdapter.java:384) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.reteoo.ObjectTypeNode.propagateAssert(ObjectTypeNode.java:304) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.PropagationEntry$Insert.execute(PropagationEntry.java:134) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.SynchronizedPropagationList.flush(SynchronizedPropagationList.java:86) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.phreak.SynchronizedPropagationList.flush(SynchronizedPropagationList.java:81) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.impl.StatefulKnowledgeSessionImpl.flushPropagations(StatefulKnowledgeSessionImpl.java:2105) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.common.DefaultAgenda.fireLoop(DefaultAgenda.java:1296) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.common.DefaultAgenda.fireUntilHalt(DefaultAgenda.java:1232) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.impl.StatefulKnowledgeSessionImpl.fireUntilHalt(StatefulKnowledgeSessionImpl.java:1398) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at org.drools.core.impl.StatefulKnowledgeSessionImpl.fireUntilHalt(StatefulKnowledgeSessionImpl.java:1377) ~[drools-core-6.5.0.Final.jar:6.5.0.Final]
> at com.mot.ssol.cep.workflow.CEPSession.run(CEPSession.java:121) ~[mimonitor-cepm-3.0.jar:3.0]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_102]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_102]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_102]
9 years, 1 month
[JBoss JIRA] (WFLY-8338) L1 should be disabled by default
by Paul Ferraro (JIRA)
[ https://issues.jboss.org/browse/WFLY-8338?page=com.atlassian.jira.plugin.... ]
Paul Ferraro moved JBEAP-9489 to WFLY-8338:
-------------------------------------------
Project: WildFly (was: JBoss Enterprise Application Platform)
Key: WFLY-8338 (was: JBEAP-9489)
Workflow: GIT Pull Request workflow (was: CDW with loose statuses v1)
Component/s: Clustering
(was: Clustering)
Affects Version/s: 10.1.0.Final
(was: 7.1.0.DR13)
> L1 should be disabled by default
> --------------------------------
>
> Key: WFLY-8338
> URL: https://issues.jboss.org/browse/WFLY-8338
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 10.1.0.Final
> Reporter: Paul Ferraro
> Assignee: Paul Ferraro
>
> Our default configuration for web and ejb uses locking reads. This makes L1 completely useless since remote calls to the primary owner would be necessary any time L1 would ever be referenced. Additionally, it incurs the cost of a separate invalidation command per write.
> Having it explicitly disabled in the default config makes it too tempting for users to naively enable it, without knowing the cost.
9 years, 1 month
[JBoss JIRA] (JBJCA-1341) Account for additional DB2 FATAL connection errors
by Ingo Weiss (JIRA)
[ https://issues.jboss.org/browse/JBJCA-1341?page=com.atlassian.jira.plugin... ]
Ingo Weiss reassigned JBJCA-1341:
---------------------------------
Assignee: Ingo Weiss
> Account for additional DB2 FATAL connection errors
> --------------------------------------------------
>
> Key: JBJCA-1341
> URL: https://issues.jboss.org/browse/JBJCA-1341
> Project: IronJacamar
> Issue Type: Enhancement
> Components: Validator
> Reporter: Ingo Weiss
> Assignee: Ingo Weiss
> Original Estimate: 2 days
> Remaining Estimate: 2 days
>
> Various versions of pre-11.x DB2 drivers use the -99999 error code for a SQLException. Not all -99999 errors are fatal. For those variations that are known to be fatal, a check should be added to treat them as such.
> One example is the -99999 error that indicates "Connection is closed".
9 years, 1 month
[JBoss JIRA] (JBJCA-1341) Account for additional DB2 FATAL connection errors
by Ingo Weiss (JIRA)
[ https://issues.jboss.org/browse/JBJCA-1341?page=com.atlassian.jira.plugin... ]
Ingo Weiss reassigned JBJCA-1341:
---------------------------------
Assignee: (was: Ingo Weiss)
> Account for additional DB2 FATAL connection errors
> --------------------------------------------------
>
> Key: JBJCA-1341
> URL: https://issues.jboss.org/browse/JBJCA-1341
> Project: IronJacamar
> Issue Type: Enhancement
> Components: Validator
> Reporter: Ingo Weiss
> Original Estimate: 2 days
> Time Spent: 2 days
> Remaining Estimate: 0 minutes
>
> Various versions of pre-11.x DB2 drivers use the -99999 error code for a SQLException. Not all -99999 errors are fatal. For those variations that are known to be fatal, a check should be added to treat them as such.
> One example is the -99999 error that indicates "Connection is closed".
9 years, 1 month
[JBoss JIRA] (JBJCA-1341) Account for additional DB2 FATAL connection errors
by Ingo Weiss (JIRA)
Ingo Weiss created JBJCA-1341:
---------------------------------
Summary: Account for additional DB2 FATAL connection errors
Key: JBJCA-1341
URL: https://issues.jboss.org/browse/JBJCA-1341
Project: IronJacamar
Issue Type: Enhancement
Components: Validator
Reporter: Ingo Weiss
Various versions of pre-11.x DB2 drivers use the -99999 error code for a SQLException. Not all -99999 errors are fatal. For those variations that are known to be fatal, a check should be added to treat them as such.
One example is the -99999 error that indicates "Connection is closed".
9 years, 1 month
[JBoss JIRA] (WFLY-8337) Distributed web sessions and SFSBs should use SYNC cache mode by default
by Paul Ferraro (JIRA)
[ https://issues.jboss.org/browse/WFLY-8337?page=com.atlassian.jira.plugin.... ]
Paul Ferraro moved JBEAP-9487 to WFLY-8337:
-------------------------------------------
Project: WildFly (was: JBoss Enterprise Application Platform)
Key: WFLY-8337 (was: JBEAP-9487)
Workflow: GIT Pull Request workflow (was: CDW with loose statuses v1)
Component/s: Clustering
(was: Clustering)
Affects Version/s: 10.1.0.Final
(was: 7.1.0.DR13)
> Distributed web sessions and SFSBs should use SYNC cache mode by default
> ------------------------------------------------------------------------
>
> Key: WFLY-8337
> URL: https://issues.jboss.org/browse/WFLY-8337
> Project: WildFly
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 10.1.0.Final
> Reporter: Paul Ferraro
> Assignee: Paul Ferraro
>
> For SFSBs, ASYNC mode requires pessimistic locking/repeatable read.
> In order to use looser locking isolation (since with SFSB, contention is not an issue), SYNC mode is more appropriate as a default.
9 years, 1 month