[JBoss JIRA] (JGRP-1957) S3_PING: Nodes never removed from .list file
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1957?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-1957:
--------------------------------
Ah, I see what you mean now... Re-adding discovery info from {{FILE_PING}} to the logical address cache only happens for TCP, not for UDP.
I suggest reducing {{logical_addr_cache_expiration}} to a value below {{MERGE3.max_interval}}.
There's also a method (exposed via JMX or probe) {{TCP.evictLogicalAddressCache(boolean force)}}: if {{force}} is set to true, all expired elements will be removed immediately.
> S3_PING: Nodes never removed from .list file
> --------------------------------------------
>
> Key: JGRP-1957
> URL: https://issues.jboss.org/browse/JGRP-1957
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.4
> Environment: JGroups client running on Mac OS X - Yosemite
> JDK 1.7.71
> Reporter: Nick Sawadsky
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 3.6.5
>
>
> I'm not 100% sure, but it seems like there might be a defect here.
> I'm using TCP, S3_PING, and MERGE3.
> I've set logical_addr_cache_max_size to 2 for testing purposes, although I don't think the value of this setting affects my test results.
> I start a single node, node A. Then I start a second node, node B.
> I then repeatedly shut down and restart node B.
> Each time node B starts, a new row is added to the .list file stored in S3.
> But even if I continue this process for 15 minutes, old rows are never removed from the .list file, so it continues to grow in size.
> I've read the docs and mailing list threads, so I'm aware that the list is not immediately updated as soon as a member leaves. But I was expecting that when a view change occurs, nodes no longer in the view would be marked for removal (line 2193 of TP.java) and then after the logical_addr_cache_expiration has been reached and the reaper kicks in, once a new node joins, the expired cache entries would be purged from the file.
> I dug into the code a bit, and what seems to be happening is that the MERGE3 protocol periodically generates a FIND_MBRS event. S3_PING retrieves the membership from the .list file, which includes expired nodes. All of these members are then re-added to the logical address cache (line 157 of S3_PING.java, line 533 of Discovery.java, line 2263 of TP.java).
> So expired nodes are continually re-added to the logical address cache, preventing them from ever being reaped.
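The interaction described above can be illustrated with a deliberately simplified model. This is only a sketch, not JGroups code: the class name, the integer time steps, and the rule that the store is rewritten from the cache on every step are all assumptions made for illustration. It shows why a stale entry survives when rediscovery refreshes it more often than the expiration allows, and why an expiration below the rediscovery interval lets it be reaped.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of a discovery store plus an expiring address cache (not JGroups code). */
class StaleEntryModel {

    /**
     * Simulates `duration` time steps and reports whether the stale member
     * "B-old" is still listed in the store at the end.
     */
    static boolean staleEntrySurvives(long expiration, long discoveryInterval, long duration) {
        // The shared store (stands in for the .list file); "A" is live, "B-old" is gone.
        Set<String> store = new HashSet<>();
        store.add("A");
        store.add("B-old");

        // The cache maps each member to the time it was last added or refreshed.
        Map<String, Long> cache = new HashMap<>();
        cache.put("A", 0L);
        cache.put("B-old", 0L);

        for (long t = 1; t <= duration; t++) {
            final long now = t;
            // Periodic discovery re-adds everything found in the store, stale or not.
            if (now % discoveryInterval == 0) {
                for (String name : store) {
                    cache.put(name, now);
                }
            }
            // The reaper drops non-live entries that have not been refreshed recently.
            cache.entrySet().removeIf(e ->
                    !e.getKey().equals("A") && now - e.getValue() >= expiration);
            // Assumption: the store is rewritten from the cache contents.
            store.retainAll(cache.keySet());
        }
        return store.contains("B-old");
    }

    public static void main(String[] args) {
        // Expiration longer than the discovery interval: the entry is refreshed forever.
        System.out.println(staleEntrySurvives(10, 5, 100)); // prints true
        // Expiration shorter than the discovery interval: the entry is reaped.
        System.out.println(staleEntrySurvives(3, 5, 100));  // prints false
    }
}
```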
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
10 years, 2 months
[JBoss JIRA] (JGRP-1954) SWIFT_PING discovery protocol fatal error on OpenStack Kilo
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1954?page=com.atlassian.jira.plugin.... ]
Bela Ban updated JGRP-1954:
---------------------------
Fix Version/s: 3.6.6
(was: 3.6.5)
> SWIFT_PING discovery protocol fatal error on OpenStack Kilo
> -----------------------------------------------------------
>
> Key: JGRP-1954
> URL: https://issues.jboss.org/browse/JGRP-1954
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.4
> Environment: JGroups client running on Mac OS X - Yosemite
> JDK 1.7.71
> OpenStack Kilo
> Reporter: Nick Sawadsky
> Assignee: Bela Ban
> Fix For: 3.6.6
>
>
> I'm attempting to use the SWIFT_PING discovery protocol on the most recent version of OpenStack, "Kilo". An error occurs during initialization of the protocol stack; the stack trace is provided below.
> The problem appears to be that support for XML-formatted responses has been removed in the OpenStack Identity API (http://developer.openstack.org/api-ref-identity-v2.html). Even though SWIFT_PING sends an Accept header of application/xml, the response still comes back as JSON (around line 286 of SWIFT_PING.java).
> I've been able to repro the issue using Postman in Chrome. I tried providing the *request* in XML, with a Content-Type header of application/xml, but Swift returns an error: "Expecting to find application/json in Content-Type header".
> It seems like the resolution would be for SWIFT_PING to be modified so it can parse the JSON response that it is receiving. If that sounds like a reasonable approach, I can try to create a patch that fixes the issue.
> Stack Trace:
> 2015-08-21 14:30:16,123 FATAL [com.pingidentity.common.util.ErrorHandler] Problem creating factory for multiplexed cluster communications
> org.xml.sax.SAXParseException: Content is not allowed in prolog.
> at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257) ~[?:1.8.0_25]
> at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:348) ~[?:1.8.0_25]
> at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121) ~[?:1.8.0_25]
> at org.jgroups.protocols.SWIFT_PING$Keystone_V_2_0_Auth.authenticate(SWIFT_PING.java:307) ~[jgroups.jar:3.6.4.Final]
> at org.jgroups.protocols.SWIFT_PING$SwiftClient.authenticate(SWIFT_PING.java:443) ~[jgroups.jar:3.6.4.Final]
> at org.jgroups.protocols.SWIFT_PING.init(SWIFT_PING.java:68) ~[jgroups.jar:3.6.4.Final]
> at org.jgroups.stack.ProtocolStack.initProtocolStack(ProtocolStack.java:860) ~[jgroups.jar:3.6.4.Final]
> at org.jgroups.stack.ProtocolStack.setup(ProtocolStack.java:481) ~[jgroups.jar:3.6.4.Final]
> at org.jgroups.JChannel.init(JChannel.java:854) ~[jgroups.jar:3.6.4.Final]
> at org.jgroups.JChannel.<init>(JChannel.java:159) ~[jgroups.jar:3.6.4.Final]
> at org.jgroups.JChannel.<init>(JChannel.java:120) ~[jgroups.jar:3.6.4.Final]
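The failure mode in the stack trace above can be reproduced in isolation with the JDK's XML parser. The sketch below is not SWIFT_PING code: the class name, the helper methods, and both payloads are made up for illustration. It only shows that feeding a JSON body to `DocumentBuilder.parse` raises a `SAXParseException`, and that a cheap check on the first character could detect a JSON body before the XML parser is invoked.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Shows what happens when a JSON response body is handed to an XML parser. */
class JsonAsXmlDemo {

    /** Returns true if the body parses as XML, false if the parser rejects it. */
    static boolean parsesAsXml(String body) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXParseException e) {
            // For a JSON body this is the exception seen in the stack trace.
            return false;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Cheap sniff: a JSON object or array starts with '{' or '[' after whitespace. */
    static boolean looksLikeJson(String body) {
        String trimmed = body.trim();
        return trimmed.startsWith("{") || trimmed.startsWith("[");
    }

    public static void main(String[] args) {
        // Both payloads are invented; they are not real Keystone responses.
        String json = "{\"access\": {\"token\": {\"id\": \"abc\"}}}";
        String xml = "<access><token id=\"abc\"/></access>";

        System.out.println(looksLikeJson(json) + " " + parsesAsXml(json));
        System.out.println(looksLikeJson(xml) + " " + parsesAsXml(xml));
    }
}
```

The JDK's default error handler may also print the parser's fatal error to standard error before the exception is thrown.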
[JBoss JIRA] (JGRP-1954) SWIFT_PING discovery protocol fatal error on OpenStack Kilo
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1954?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-1954:
--------------------------------
My preference is to leave {{SWIFT_PING}} in JGroups proper *if* we can come up with a solution that works and doesn't introduce new dependencies. It looks like Thomas suggested such a solution. The question is: Nick, will you be able to adapt your suggested fix to take Thomas' solution into account?
In any case, I'm moving this into 3.6.6. If we have a fix before the end of this week, we can still move it back to 3.6.5.
[JBoss JIRA] (WFLY-5235) CDI interceptors are not called when invoking observer method
by Martin Kouba (JIRA)
[ https://issues.jboss.org/browse/WFLY-5235?page=com.atlassian.jira.plugin.... ]
Martin Kouba commented on WFLY-5235:
------------------------------------
I did a quick test (using {{TransactionSynchronizationRegistry}}) and it seems the interceptor is called and the transaction is active. Therefore, we really need some more information. I wonder whether this could be a timing issue: for a web application the event is fired during {{ServletContextListener.contextInitialized()}} delivery, and some services might not be available or fully initialized at this time (e.g. JPA).
> CDI interceptors are not called when invoking observer method
> -------------------------------------------------------------
>
> Key: WFLY-5235
> URL: https://issues.jboss.org/browse/WFLY-5235
> Project: WildFly
> Issue Type: Bug
> Components: CDI / Weld
> Affects Versions: 9.0.1.Final
> Reporter: Dirk Weil
> Assignee: Martin Kouba
>
> The following code runs with an active transaction on WFLY 8.2.0, but fails with a TransactionRequiredException on WFLY 9.0.1:
> @ApplicationScoped
> public class InitCocktailDemoDataService
> {
>     @PersistenceContext
>     private EntityManager entityManager;
>
>     @Transactional
>     private void createDemoData(@Observes @Initialized(ApplicationScoped.class) Object event)
>     {
>         this.entityManager.merge(someEntity);
>     }
> }
> It seems that interceptors aren't called at all - at least for observers of scope lifecycle events.
[JBoss JIRA] (DROOLS-900) java.lang.IncompatibleClassChangeError error while using HashSets
by Mario Fusco (JIRA)
[ https://issues.jboss.org/browse/DROOLS-900?page=com.atlassian.jira.plugin... ]
Mario Fusco resolved DROOLS-900.
--------------------------------
Fix Version/s: 6.3.0.Final
Resolution: Done
Fixed by https://github.com/droolsjbpm/drools/commit/a2ab9d4b9
One minor performance suggestion: if I'm understanding correctly, you're creating exactly the same constant sets of values again and again at each constraint evaluation. Consider refactoring these sets into public static final fields of a Java class and referring to those fields in your constraints.
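As a rough sketch of that suggestion, the constant sets could live in a small holder class so that each one is built once at class initialization. The class name, field name, and values below are hypothetical and are not taken from the attached DRL.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical holder for constant sets that constraints can refer to. */
final class RuleConstants {

    /** Built once when the class is loaded; unmodifiable so rules cannot alter it. */
    public static final Set<String> ALLOWED_CODES =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("e1", "e2", "e3")));

    private RuleConstants() {
        // Constants only; no instances.
    }

    public static void main(String[] args) {
        // Each lookup reuses the same set instead of allocating a new one.
        System.out.println(ALLOWED_CODES.contains("e2"));
        System.out.println(ALLOWED_CODES.contains("zz"));
    }
}
```

A constraint would then call `contains` on `RuleConstants.ALLOWED_CODES` rather than building a new set on every evaluation.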
> java.lang.IncompatibleClassChangeError error while using HashSets
> -----------------------------------------------------------------
>
> Key: DROOLS-900
> URL: https://issues.jboss.org/browse/DROOLS-900
> Project: Drools
> Issue Type: Bug
> Components: core engine
> Affects Versions: 5.6.0.Final
> Reporter: Pravasis Pattnaik
> Assignee: Mario Fusco
> Priority: Minor
> Fix For: 6.3.0.Final
>
> Attachments: compiled-drl.txt, req.txt
>
>
> In the compiled DRL file we were using (Sets.newHashSet(e1,e2,..)).contains(input), and on large input sets we get the exception java.lang.IncompatibleClassChangeError: Class XYZ does not implement the requested interface java.util.Collection. Here XYZ is our input ruleRequest to be evaluated. If I change it to Sets.newHashSet(e1,e2,..).contains(input), it fixes everything.
[JBoss JIRA] (JGRP-1958) RequestCorrelator "channel is not connected" error during shutdown
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1958?page=com.atlassian.jira.plugin.... ]
Bela Ban resolved JGRP-1958.
----------------------------
Resolution: Done
> RequestCorrelator "channel is not connected" error during shutdown
> ------------------------------------------------------------------
>
> Key: JGRP-1958
> URL: https://issues.jboss.org/browse/JGRP-1958
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.2.12
> Reporter: Dennis Reed
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 3.6.5
>
>
> Error logged during shutdown of a channel due to RequestCorrelator failing to send a reply:
> ERROR [org.jgroups.protocols.UNICAST2] (OOB-17,shared=tcp) couldn't deliver OOB message [dst: server1/web, src: server2/web (4 headers), size=62 bytes, flags=OOB|DONT_BUNDLE|RSVP]: java.lang.IllegalStateException: channel is not connected
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:617) [jgroups-3.2.12.Final-redhat-1.jar:3.2.12.Final-redhat-1]
> at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:544) [jgroups-3.2.12.Final-redhat-1.jar:3.2.12.Final-redhat-1]
> at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:391) [jgroups-3.2.12.Final-redhat-1.jar:3.2.12.Final-redhat-1]
> at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:249) [jgroups-3.2.12.Final-redhat-1.jar:3.2.12.Final-redhat-1]
> at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:600) [jgroups-3.2.12.Final-redhat-1.jar:3.2.12.Final-redhat-1]
> [incoming JGroups message]
> It appears to just be a timing issue between shutdown of the channel and RequestCorrelator processing the message, which triggers a response message.
> It would be good to either avoid triggering the exception in the first place, or suppress the error log during shutdown.
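The two options mentioned above (avoid triggering the exception, or suppress the error log) can be sketched with a toy channel. Nothing here is JGroups API: `ToyChannel`, `sendReply`, and the log text are invented for illustration, and the actual fix may look quite different.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of replying on a channel that may be shutting down. */
class ReplyOnShutdownDemo {

    /** Minimal stand-in for a channel: sending while disconnected throws. */
    static final class ToyChannel {
        private volatile boolean connected = true;
        final List<String> sent = new ArrayList<>();

        void disconnect() { connected = false; }

        boolean isConnected() { return connected; }

        void send(String msg) {
            if (!connected) {
                throw new IllegalStateException("channel is not connected");
            }
            sent.add(msg);
        }
    }

    /** Sends a reply; returns false, without an error-level log, if the channel has closed. */
    static boolean sendReply(ToyChannel channel, String reply) {
        // Option 1: skip the send when the channel is already known to be closed.
        if (!channel.isConnected()) {
            return false;
        }
        try {
            channel.send(reply);
            return true;
        } catch (IllegalStateException e) {
            // Option 2: the channel closed between the check and the send;
            // report it quietly instead of logging an error with a stack trace.
            System.out.println("reply dropped during shutdown: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        ToyChannel channel = new ToyChannel();
        System.out.println(sendReply(channel, "rsp-1"));
        channel.disconnect();
        System.out.println(sendReply(channel, "rsp-2"));
    }
}
```

The check alone cannot close the race, since the channel can disconnect between `isConnected()` and `send()`, which is why the sketch keeps the catch block as well.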
[JBoss JIRA] (JGRP-1958) RequestCorrelator "channel is not connected" error during shutdown
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1958?page=com.atlassian.jira.plugin.... ]
Bela Ban updated JGRP-1958:
---------------------------
Fixed in {{UNICAST2}} on master by removing the stack trace in the WARN log message. If a backport is needed, please cherry-pick the changes yourself.
[JBoss JIRA] (JGRP-1957) S3_PING: Nodes never removed from .list file
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1957?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-1957:
--------------------------------
This is similar to https://issues.jboss.org/browse/JGRP-1956: nodes cannot be removed from the {{x.list}} file because they might not have crashed, but were separated away by a network partition.
If you enable the 2 attributes I listed in JGRP-1956 and reduce the size and expiration time of the logical address cache, the old members will get removed. Just tested this locally.
Can you close the issue if this works for you? Otherwise, contact me on IRC, as I want to release a 3.6.5 before the end of the week. Otherwise, this issue gets pushed into 3.6.6.
[JBoss JIRA] (JGRP-1956) S3_PING / FILE_PING: remove failed members
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1956?page=com.atlassian.jira.plugin.... ]
Bela Ban resolved JGRP-1956.
----------------------------
Resolution: Won't Fix
I'm closing this as the instructions in my last comment should fix this issue. If this is not the case, please re-open.
> S3_PING / FILE_PING: remove failed members
> ------------------------------------------
>
> Key: JGRP-1956
> URL: https://issues.jboss.org/browse/JGRP-1956
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.4
> Reporter: Karsten Ohme
> Assignee: Bela Ban
> Fix For: 3.6.5
>
>
> When we terminate a member (EC2's "terminate" function) or kill -9 it, then the file (or bucket data in S3) won't get removed. This leads to stale data. On EC2, I expect that virtualized instances are often simply terminated, so this problem is compounded there.
> SOLUTION:
> - Periodically write own data to the file system (FILE_PING) or S3 (S3_PING)
> - On a view change: remove all data that's not in the current view
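The two steps proposed above can be sketched against a plain directory. This is an illustration only, not the FILE_PING implementation: the one-file-per-member layout, the `.node` suffix, and the method names are assumptions. Deleting on every view change would also remove data for members that were merely partitioned away, which is the concern raised in the comments on this issue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Toy file-based discovery store with cleanup on view change. */
class ViewChangeCleanupDemo {
    private static final String SUFFIX = ".node"; // hypothetical naming scheme

    /** Step 1: (re)write this member's own data; meant to be called periodically. */
    static void writeOwnData(Path dir, String member, String address) {
        try {
            Files.write(dir.resolve(member + SUFFIX), address.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Step 2: delete every member file not in the current view; returns the number deleted. */
    static int removeStale(Path dir, Set<String> view) {
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "*" + SUFFIX)) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                String member = name.substring(0, name.length() - SUFFIX.length());
                if (!view.contains(member)) {
                    Files.delete(file);
                    deleted++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("discovery");
        writeOwnData(dir, "A", "10.0.0.1:7800");
        writeOwnData(dir, "B", "10.0.0.2:7800");
        writeOwnData(dir, "C", "10.0.0.3:7800"); // C is then killed without cleanup

        Set<String> view = new HashSet<>(Arrays.asList("A", "B"));
        System.out.println(removeStale(dir, view)); // prints 1
    }
}
```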
[JBoss JIRA] (JGRP-1956) S3_PING / FILE_PING: remove failed members
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-1956?page=com.atlassian.jira.plugin.... ]
Bela Ban edited comment on JGRP-1956 at 9/2/15 2:47 AM:
--------------------------------------------------------
Can you try with the following attributes enabled?
* {{remove_old_coords_on_view_change}}
* {{remove_all_files_on_view_change}}
The reason old members are not immediately removed is that these members could have been split away, in a network partition, rather than crashed. If we want a merge to succeed in such a case, it is better to leave information about them in the store.
Note that {{TP.logical_addr_cache_max_size}} and {{TP.logical_addr_cache_expiration}} govern when stale entries will be removed. By default, you won't have more than 2000 stale elements in the cache.
Take a look at https://issues.jboss.org/browse/JGRP-1917 for details.
Doc: http://www.jgroups.org/manual/index.html#FILE_PING (removal of zombie files)
was (Author: belaban):
Can you try with the following attributes enabled?
* {{remove_old_coords_on_view_change}}
* {{remove_all_files_on_view_change}}
The reason old members are not immediately removed is that these members could have been split away, in a network partition, rather than crashed. If we want a merge to succeed in such a case, it is better to leave information about them in the store.
Note that {{TP.logical_addr_cache_max_size}} and {{TP.logical_addr_cache_expiration}} govern when stale entries will be removed. By default, you won't have more than 2000 stale elements in the cache.
Take a look at https://issues.jboss.org/browse/JGRP-1917 for details