June 2017 - jboss-jira - Jboss List Archives

[JBoss JIRA] (JGRP-2183) (7.0.z) DELIVERY_TIME: protocol to measure delivery times

by Bartosz Spyrko-Śmietanko (JIRA)

Bartosz Spyrko-Śmietanko created JGRP-2183:
----------------------------------------------

Summary: (7.0.z) DELIVERY_TIME: protocol to measure delivery times
Key: JGRP-2183
URL: https://issues.jboss.org/browse/JGRP-2183
Project: JGroups
Issue Type: Feature Request
Reporter: Bartosz Spyrko-Śmietanko
Assignee: Bela Ban
Priority: Minor
Fix For: 3.6.12, 4.0

This protocol should be placed at the top of the stack. It measures delivery times:
* Average time for a single message to get delivered. Delivery is complete when {{receive()}} returns
* Average time for message batches: the delivery time is computed as the time to deliver the batch divided by the batch size

--
This message was sent by Atlassian JIRA (v7.2.3#72005)
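The two averages described in JGRP-2183 can be sketched in plain Java. This is only an illustration under stated assumptions: `DeliveryTimeStats` and all of its method names are hypothetical and are not part of JGroups, and the batch figure here is the mean of the per-batch "time divided by batch size" values, which is one possible reading of the description.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper for illustration; not the JGroups DELIVERY_TIME protocol.
final class DeliveryTimeStats {
    private final LongAdder singleNanos = new LongAdder();
    private final LongAdder singleCount = new LongAdder();
    private final LongAdder batchPerMsgNanos = new LongAdder();
    private final LongAdder batchCount = new LongAdder();

    // Records the time one single message took until the receive callback returned.
    void recordSingle(long elapsedNanos) {
        singleNanos.add(elapsedNanos);
        singleCount.increment();
    }

    // Records one batch: per-message time = time to deliver the batch / batch size.
    void recordBatch(long elapsedNanos, int batchSize) {
        if (batchSize <= 0)
            return; // nothing was delivered, so there is nothing to average
        batchPerMsgNanos.add(elapsedNanos / batchSize);
        batchCount.increment();
    }

    // Times a delivery callback and records it as a single-message delivery.
    void timeSingle(Runnable receive) {
        long start = System.nanoTime();
        try {
            receive.run();
        } finally {
            recordSingle(System.nanoTime() - start);
        }
    }

    double avgSingleNanos() {
        return average(singleNanos, singleCount);
    }

    double avgBatchPerMessageNanos() {
        return average(batchPerMsgNanos, batchCount);
    }

    private static double average(LongAdder total, LongAdder count) {
        long n = count.sum();
        return n == 0 ? 0.0 : (double) total.sum() / n;
    }
}
```

`LongAdder` is used so that several delivery threads can record times without contending on a lock.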


[JBoss JIRA] (JGRP-2182) (7.0.z) FD_SOCK is keep trying to create a new socket to the killed server

by Bartosz Spyrko-Śmietanko (JIRA)

Bartosz Spyrko-Śmietanko created JGRP-2182:
----------------------------------------------

Summary: (7.0.z) FD_SOCK is keep trying to create a new socket to the killed server
Key: JGRP-2182
URL: https://issues.jboss.org/browse/JGRP-2182
Project: JGroups
Issue Type: Bug
Affects Versions: 3.6.3
Environment: JDG 6.6.0 (jgroups-3.6.3.Final-redhat-4.jar)
Reporter: Bartosz Spyrko-Śmietanko
Assignee: Bela Ban
Fix For: 3.6.11, 4.0

In most cases FD_SOCK can detect a killed server immediately. But for an unknown reason, FD_SOCK keeps trying to create a new socket to the killed server. As a consequence, installing a new cluster view is delayed until FD_ALL is triggered.

m04_n007_server.log shows the behaviour. There are 28 nodes (4 machines (m03, ..., m06) with 7 nodes (n001, ..., n007) on each) and all nodes on m03 are killed at the same time, at 15:07:34,543. FD_SOCK keeps trying to connect to a killed node, saying "socket address for m03_n001/clustered could not be fetched, retrying".

{noformat}
[n007] 15:07:39,543 TRACE [org.jgroups.protocols.FD_SOCK] (Timer-8,shared=udp) m04_n007/clustered: broadcasting SUSPECT message (suspected_mbrs=[m03_n005/clustered, m03_n007/clustered])
[n007] 15:07:39,544 TRACE [org.jgroups.protocols.FD_SOCK] (INT-20,shared=udp) m04_n007/clustered: received SUSPECT message from m04_n007/clustered: suspects=[m03_n005/clustered, m03_n007/clustered]
[n007] 15:07:39,546 TRACE [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger,m04_n007/clustered) m04_n007/clustered: socket address for m03_n001/clustered could not be fetched, retrying
[n007] 15:07:40,546 DEBUG [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger,m04_n007/clustered) m04_n007/clustered: ping_dest is m03_n001/clustered, pingable_mbrs=[m03_n001/clustered, m03_n002/clustered, m03_n003/clustered, m03_n004/clustered, m03_n006/clustered, m06_n001/clustered, m06_n002/clustered, m06_n003/clustered, m06_n004/clustered, m06_n005/clustered, m06_n006/clustered, m06_n007/clustered, m05_n001/clustered, m05_n002/clustered, m05_n003/clustered, m05_n004/clustered, m05_n005/clustered, m05_n006/clustered, m05_n007/clustered, m04_n001/clustered, m04_n002/clustered, m04_n003/clustered, m04_n004/clustered, m04_n005/clustered, m04_n006/clustered, m04_n007/clustered]
[n007] 15:07:41,546 TRACE [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger,m04_n007/clustered) m04_n007/clustered: socket address for m03_n001/clustered could not be fetched, retrying
[n007] 15:07:42,546 DEBUG [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger,m04_n007/clustered) m04_n007/clustered: ping_dest is m03_n001/clustered, pingable_mbrs=[m03_n001/clustered, m03_n002/clustered, m03_n003/clustered, m03_n004/clustered, m03_n006/clustered, m06_n001/clustered, m06_n002/clustered, m06_n003/clustered, m06_n004/clustered, m06_n005/clustered, m06_n006/clustered, m06_n007/clustered, m05_n001/clustered, m05_n002/clustered, m05_n003/clustered, m05_n004/clustered, m05_n005/clustered, m05_n006/clustered, m05_n007/clustered, m04_n001/clustered, m04_n002/clustered, m04_n003/clustered, m04_n004/clustered, m04_n005/clustered, m04_n006/clustered, m04_n007/clustered]
[n007] 15:07:43,547 TRACE [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger,m04_n007/clustered) m04_n007/clustered: socket address for m03_n001/clustered could not be fetched, retrying
...
[n007] 15:10:53,700 DEBUG [org.jgroups.protocols.FD_ALL] (Timer-26,shared=udp) haven't received a heartbeat from m03_n005/clustered for 200059 ms, adding it to suspect list
{noformat}

The TRACE log shows that the FD_SOCK address cache has only 23 members.

{noformat}
[n007] 14:40:50,471 TRACE [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger,m04_n007/clustered) m04_n007/clustered: got cache from m03_n005/clustered: cache is { m04_n006/clustered=172.20.66.34:9945, m05_n005/clustered=172.20.66.35:9938, m06_n004/clustered=172.20.66.36:9931, m03_n007/clustered=172.20.66.33:9952, m05_n001/clustered=172.20.66.35:9910, m06_n005/clustered=172.20.66.36:9938, m05_n006/clustered=172.20.66.35:9945, m03_n005/clustered=172.20.66.33:9938, m05_n004/clustered=172.20.66.35:9931, m04_n003/clustered=172.20.66.34:9924, m04_n007/clustered=172.20.66.34:9952, m05_n002/clustered=172.20.66.35:9917, m05_n003/clustered=172.20.66.35:9924, m04_n004/clustered=172.20.66.34:9931, m06_n001/clustered=172.20.66.36:9910, m06_n007/clustered=172.20.66.36:9952, m04_n005/clustered=172.20.66.34:9938, m04_n001/clustered=172.20.66.34:9910, m05_n007/clustered=172.20.66.35:9952, m06_n002/clustered=172.20.66.36:9917, m06_n006/clustered=172.20.66.36:9945, m04_n002/clustered=172.20.66.34:9917, m06_n003/clustered=172.20.66.36:9924}
{noformat}

pingable_mbrs, however, has all 28 members, taken from the current cluster view.

{noformat}
[n007] 14:40:50,472 DEBUG [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger,m04_n007/clustered) m04_n007/clustered: ping_dest is m03_n005/clustered, pingable_mbrs=[ m03_n005/clustered, m03_n007/clustered, m03_n001/clustered, m03_n002/clustered, m03_n003/clustered, m03_n004/clustered, m03_n006/clustered, m06_n001/clustered, m06_n002/clustered, m06_n003/clustered, m06_n004/clustered, m06_n005/clustered, m06_n006/clustered, m06_n007/clustered, m05_n001/clustered, m05_n002/clustered, m05_n003/clustered, m05_n004/clustered, m05_n005/clustered, m05_n006/clustered, m05_n007/clustered, m04_n001/clustered, m04_n002/clustered, m04_n003/clustered, m04_n004/clustered, m04_n005/clustered, m04_n006/clustered, m04_n007/clustered]
{noformat}
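The mismatch between the address cache and pingable_mbrs can be checked mechanically. The sketch below is a plain-Java illustration, not FD_SOCK code: `MissingCacheEntries` and `missing` are hypothetical names, and the sample data is a shortened subset of the log excerpts above. Comparing the two full excerpts by hand suggests that the five entries absent from the cache are m03_n001, m03_n002, m03_n003, m03_n004 and m03_n006, which would be consistent with the pinger failing to fetch the socket address for m03_n001.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic helper; it only compares two collections.
final class MissingCacheEntries {

    // Returns the view members that have no entry in the address cache, in view order.
    static List<String> missing(Collection<String> view, Map<String, String> cache) {
        List<String> result = new ArrayList<>();
        for (String member : view) {
            if (!cache.containsKey(member))
                result.add(member);
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened sample taken from the log excerpts (suffix "/clustered" dropped).
        List<String> view = Arrays.asList("m03_n005", "m03_n007", "m03_n001", "m04_n001");
        Map<String, String> cache = new HashMap<>();
        cache.put("m03_n005", "172.20.66.33:9938");
        cache.put("m03_n007", "172.20.66.33:9952");
        cache.put("m04_n001", "172.20.66.34:9910");
        System.out.println(missing(view, cache)); // prints [m03_n001]
    }
}
```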


[JBoss JIRA] (JGRP-2181) (7.0.z) MERGE3: merge never happens

by Bartosz Spyrko-Śmietanko (JIRA)

Bartosz Spyrko-Śmietanko created JGRP-2181:
----------------------------------------------

Summary: (7.0.z) MERGE3: merge never happens
Key: JGRP-2181
URL: https://issues.jboss.org/browse/JGRP-2181
Project: JGroups
Issue Type: Bug
Reporter: Bartosz Spyrko-Śmietanko
Assignee: Bela Ban
Fix For: 3.6.11, 4.0

(Reported by Neal Dillman)

In the case below, a merge doesn't seem to happen. Write a unit test to reproduce this.

{noformat}
Host A view: B, X, Y, Z, A (where B should be coordinator)
Host B view: C, Q, R, S, B (where C should be coordinator)
Host C view: A, M, N, O, C (where A should be coordinator)
{noformat}
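The reported views can be written down as data to make the scenario concrete. The sketch below is an illustration only: it is neither the requested unit test nor MERGE3's logic, and `CoordinatorCheck` is a hypothetical name. It relies on the JGroups convention that the first member of a view is its coordinator, and shows that none of the hosts A, B and C is the coordinator of its own view. Whether that is the actual reason the merge never starts is not stated in the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; it models views as plain lists of member names.
final class CoordinatorCheck {

    // Returns the hosts that are first in their own view, i.e. consider themselves coordinator.
    static List<String> selfCoordinators(Map<String, List<String>> viewsByHost) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : viewsByHost.entrySet()) {
            List<String> view = entry.getValue();
            if (!view.isEmpty() && view.get(0).equals(entry.getKey()))
                result.add(entry.getKey());
        }
        return result;
    }

    // The three views exactly as listed in the report.
    static Map<String, List<String>> reportedViews() {
        Map<String, List<String>> views = new LinkedHashMap<>();
        views.put("A", Arrays.asList("B", "X", "Y", "Z", "A"));
        views.put("B", Arrays.asList("C", "Q", "R", "S", "B"));
        views.put("C", Arrays.asList("A", "M", "N", "O", "C"));
        return views;
    }

    public static void main(String[] args) {
        System.out.println(selfCoordinators(reportedViews())); // prints []
    }
}
```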


[JBoss JIRA] (JGRP-2180) (7.0.z) UNICAST3: bypass or remove when running over TCP

by Bartosz Spyrko-Śmietanko (JIRA)

Bartosz Spyrko-Śmietanko created JGRP-2180:
----------------------------------------------

Summary: (7.0.z) UNICAST3: bypass or remove when running over TCP
Key: JGRP-2180
URL: https://issues.jboss.org/browse/JGRP-2180
Project: JGroups
Issue Type: Enhancement
Reporter: Bartosz Spyrko-Śmietanko
Assignee: Bela Ban
Priority: Minor
Fix For: 3.6.11, 4.0

When running over TCP as transport, UNICAST3 is still required: while TCP/IP retransmits messages reliably and also provides sender-FIFO ordering, the receiver's thread pool might be exhausted and thus the message might get rejected.

However, *if* the regular and OOB thread pools are disabled, we could actually bypass (or completely remove) UNICAST3. If messages get dropped by a protocol further up the stack, however, there will be no retransmission in this case.

SOLUTION:
* Document this behavior
* Emit an INFO message (or automatically bypass UNICAST3) when run over a TCP transport and both OOB and regular pools are disabled
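The condition proposed under SOLUTION can be sketched as a small check. This is a minimal illustration under assumptions: `Unicast3BypassCheck`, its methods and its three boolean parameters are hypothetical, and how a real stack would expose the transport type and the pool settings is not shown in the report.

```java
import java.util.logging.Logger;

// Hypothetical check; the boolean inputs stand in for real stack configuration.
final class Unicast3BypassCheck {
    private static final Logger LOG = Logger.getLogger(Unicast3BypassCheck.class.getName());

    // UNICAST3 may be bypassed only over TCP with both thread pools disabled,
    // because an exhausted pool could otherwise reject (drop) a message.
    static boolean canBypass(boolean transportIsTcp, boolean regularPoolEnabled, boolean oobPoolEnabled) {
        return transportIsTcp && !regularPoolEnabled && !oobPoolEnabled;
    }

    // Emits the INFO message suggested in the issue when the condition holds.
    static void logIfBypassPossible(boolean transportIsTcp, boolean regularPoolEnabled, boolean oobPoolEnabled) {
        if (canBypass(transportIsTcp, regularPoolEnabled, oobPoolEnabled))
            LOG.info("TCP transport with regular and OOB pools disabled: UNICAST3 could be bypassed; "
                     + "messages dropped further up the stack will not be retransmitted");
    }
}
```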


[JBoss JIRA] (JGRP-2179) (7.0.z) SYM/ASYM_ENCRYPT: don't use WeakHashMap for old ciphers

by Bartosz Spyrko-Śmietanko (JIRA)

Bartosz Spyrko-Śmietanko created JGRP-2179:
----------------------------------------------

Summary: (7.0.z) SYM/ASYM_ENCRYPT: don't use WeakHashMap for old ciphers
Key: JGRP-2179
URL: https://issues.jboss.org/browse/JGRP-2179
Project: JGroups
Issue Type: Task
Reporter: Bartosz Spyrko-Śmietanko
Assignee: Bela Ban
Priority: Minor
Fix For: 3.6.11, 4.0

Currently we use WeakHashMap, but should not, for the reasons outlined below. We could replace it with a LazyRemovalCache.

Andrew's email refers to SecretKeys but this probably also applies to Ciphers.

Andrew Haley's email:

{quote}
TL/DR: Please don't use WeakReferences, SoftReferences, etc. to cache any data which might point to native memory. In particular, never do this with instances of java.security.Key. Instead, implement either some kind of ageing strategy or a fixed-size cache.

...

This is a warning to anybody who might cache crypto keys.

A customer has been having problems with the exhaustion of native memory before the Java heap is full. It was fun trying to track down the cause, but it's now happened several times to several customers, and it's a serious problem for real-world usage in app servers.

PKCS#11 is a standard way to communicate between applications and crypto libraries. There is a Java crypto provider which supports PKCS#11. Some of our customers must use this provider in order to get FIPS certification.

The problem is this:

A crypto key is a buffer in memory, allocated by the PKCS#11 native library. It's accessed via a handle which is stored as an integer field in a Java object. This Java object is a PhantomReference, so when the garbage collector detects that a crypto key is no longer reachable it is closed and the associated native memory is freed.

Modern garbage collectors don't much bother to process objects in the old generation because it's not usually worthwhile. Thus, crypto keys don't get recycled very quickly. They can pile up in the old generation. This isn't a problem for the Java heap because the objects containing the references to crypto keys are very small. Unfortunately, the native side of a crypto key is much bigger, maybe up to a thousand times bigger. So if we have 4000 stale crypto keys in the heap that's not a problem, a few kbytes. But the native memory may be a megabyte.

This problem is made even worse by Tomcat because it uses SoftReferences to cache crypto keys. SoftReferences are processed lazily, and maybe not at all until the Java heap runs out of memory. Unfortunately it doesn't, but the machine runs out of native memory instead.

We could solve this simply by making instances of PKCS#11 keys really big Java objects by padding with dummy fields. Then, the GC would collect them quickly. This does work but it seriously impacts performance.

Also, we could tweak the garbage collectors to clear out stale references more enthusiastically, but this impacts performance even more. There are some controls with the G1 collector which process SoftReferences more aggressively and these help, but again at the cost of performance.

Finally: the Shenandoah collector we're working on handles this problem much better than the older collectors, but it's some way off.
{quote}
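The "fixed-size cache" that the email recommends can be sketched with the JDK alone. This is an illustration, not the proposed fix: `BoundedCache` is a hypothetical name, and it is a plain `LinkedHashMap` stand-in rather than the LazyRemovalCache mentioned in the issue. Entries are evicted by size in least-recently-used order instead of waiting for the garbage collector to clear weak or soft references.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical fixed-size LRU cache built on the JDK's LinkedHashMap.
// Not thread-safe: wrap it with Collections.synchronizedMap for shared use.
final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, which gives LRU eviction
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

Because eviction happens on insertion, the number of cached keys or ciphers (and therefore the native memory they may pin) stays bounded regardless of GC behaviour.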


[JBoss JIRA] (JGRP-2178) (7.0.z) Add convenience method Rsp.readIn

by Bartosz Spyrko-Śmietanko (JIRA)

Bartosz Spyrko-Śmietanko created JGRP-2178:
----------------------------------------------

Summary: (7.0.z) Add convenience method Rsp.readIn
Key: JGRP-2178
URL: https://issues.jboss.org/browse/JGRP-2178
Project: JGroups
Issue Type: Enhancement
Affects Versions: 3.6.10, 4.0
Reporter: Bartosz Spyrko-Śmietanko
Assignee: Radim Vansa
Priority: Minor
Fix For: 3.6.11, 4.0

In Infinispan, during a staggered get we prepare several {{Rsp}}s in {{RspList}} and then for each {{Rsp}} we send one message. As the {{RspList}} can be accessed by multiple threads but we don't want to synchronize the access, we just get the {{Rsp}} and fill it from the (other) received {{Rsp}}. However, the fill requires several ifs:

{code}
if (rsp.hasException()) {
    futureRsp.setException(rsp.getException());
} else if (rsp.wasSuspected()) {
    futureRsp.setSuspected();
} else if (rsp.wasUnreachable()) {
    futureRsp.setUnreachable();
} else {
    futureRsp.setValue(rsp.getValue());
}
{code}

Let's add a convenience method that will just read in the flags and value.
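What such a convenience method might look like can be sketched on a simplified stand-in class. `SimpleRsp` below is hypothetical and is not the JGroups `Rsp` class; its fields and the body of `readIn` are assumptions made for illustration, and the method eventually added to JGroups may differ.

```java
// Hypothetical, simplified stand-in for a response holder; not org.jgroups.util.Rsp.
final class SimpleRsp<T> {
    private T value;
    private Throwable exception;
    private boolean suspected;
    private boolean unreachable;

    void setValue(T value)           { this.value = value; }
    void setException(Throwable t)   { this.exception = t; }
    void setSuspected()              { this.suspected = true; }
    void setUnreachable()            { this.unreachable = true; }

    T getValue()                     { return value; }
    Throwable getException()         { return exception; }
    boolean hasException()           { return exception != null; }
    boolean wasSuspected()           { return suspected; }
    boolean wasUnreachable()         { return unreachable; }

    // Copies the outcome (flags, exception and value) of 'other' into this response,
    // replacing the if/else chain a caller would otherwise have to write.
    void readIn(SimpleRsp<T> other) {
        this.value = other.value;
        this.exception = other.exception;
        this.suspected = other.suspected;
        this.unreachable = other.unreachable;
    }
}
```

With this in place, the caller's four-branch block collapses to a single `futureRsp.readIn(rsp)` call.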


[JBoss JIRA] (JGRP-2177) (7.0.z) TYPE_STRING does not handle unicode

by Bartosz Spyrko-Śmietanko (JIRA)

Bartosz Spyrko-Śmietanko created JGRP-2177:
----------------------------------------------

Summary: (7.0.z) TYPE_STRING does not handle unicode
Key: JGRP-2177
URL: https://issues.jboss.org/browse/JGRP-2177
Project: JGroups
Issue Type: Bug
Reporter: Bartosz Spyrko-Śmietanko
Assignee: Bela Ban
Priority: Minor
Fix For: 3.6.11, 4.0

In several places throughout the org.jgroups.util.Util class, it is assumed that Strings are one byte per character. For example, see objectToByteBuffer lines 561-567:

https://github.com/belaban/JGroups/blob/master/src/org/jgroups/util/Util....

{code:java}
case TYPE_STRING:
    String str=(String)obj;
    int len=str.length();
    ByteBuffer retval=ByteBuffer.allocate(Global.BYTE_SIZE + len).put(TYPE_STRING);
    for(int i=0; i < len; i++)
        retval.put((byte)str.charAt(i));
    return retval.array();
{code}

This code will incorrectly encode any String that contains non-ASCII characters, because the cast to byte keeps only the low 8 bits of each char. There are several options to fix this. You could use str.getBytes(StandardCharsets.UTF_8) to get a proper byte encoding, or you could use the existing TYPE_SERIALIZABLE code path.
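The effect of the narrowing cast can be reproduced outside JGroups. The sketch below is an illustration with hypothetical names (`StringEncodingDemo` and its methods); `fromNarrowCast` assumes a reader that widens each byte back to a char, which is an assumption about the decoding side and is not shown in the report.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it mirrors the cast from the snippet above, not JGroups itself.
final class StringEncodingDemo {

    // Same approach as the reported code: one byte per char, high bits discarded.
    static byte[] narrowCast(String s) {
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++)
            out[i] = (byte) s.charAt(i);
        return out;
    }

    // Assumed counterpart on the reading side: widen each byte back to a char.
    static String fromNarrowCast(byte[] bytes) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++)
            chars[i] = (char) (bytes[i] & 0xff);
        return new String(chars);
    }

    // One of the fixes suggested in the report: a real character encoding.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String name = "\u015Amietanko"; // "Śmietanko"; U+015A does not fit in one byte
        System.out.println(fromNarrowCast(narrowCast(name)));                  // prints Zmietanko
        System.out.println(new String(utf8(name), StandardCharsets.UTF_8).equals(name)); // prints true
    }
}
```

The first character is corrupted because U+015A truncated to 8 bits is 0x5A, the code for `Z`; the UTF-8 round trip preserves it at the cost of one extra byte.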


[JBoss JIRA] (JGRP-2176) (7.0.z) CENTRAL_LOCK: potential deadlock after cluster split

by Bartosz Spyrko-Śmietanko (JIRA)

Bartosz Spyrko-Śmietanko created JGRP-2176:
----------------------------------------------

Summary: (7.0.z) CENTRAL_LOCK: potential deadlock after cluster split
Key: JGRP-2176
URL: https://issues.jboss.org/browse/JGRP-2176
Project: JGroups
Issue Type: Bug
Affects Versions: 3.6.10
Reporter: Bartosz Spyrko-Śmietanko
Assignee: Bela Ban
Fix For: 3.6.11, 4.0

We encountered deadlocks in some rare situations where a cluster split and/or merged. The deadlocks happened when using the UDP transport protocol and were caused by lock requests received from members not present in the current view.

We fixed this with: https://github.com/belaban/JGroups/pull/311
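One conceivable guard against the cause named above is to check the sender of a lock request against the current view before processing it. The sketch below is an assumption made for illustration: `LockRequestFilter` and its methods are hypothetical, members are modelled as plain strings, and the report does not describe what the linked pull request actually changes.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical filter; not the CENTRAL_LOCK implementation and not the code from the pull request.
final class LockRequestFilter {
    // Replaced wholesale on every view change so readers never see a half-updated set.
    private volatile Set<String> currentMembers = Collections.emptySet();

    // To be called whenever a new view is installed.
    void viewChanged(Collection<String> members) {
        currentMembers = Collections.unmodifiableSet(new HashSet<>(members));
    }

    // Returns true only if the sender of a lock request is part of the current view.
    boolean accept(String sender) {
        return currentMembers.contains(sender);
    }
}
```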


[JBoss JIRA] (JGRP-2175) (7.0.z) IndexOutOfBoundsException when trace logging

by Bartosz Spyrko-Śmietanko (JIRA)

Bartosz Spyrko-Śmietanko created JGRP-2175:
----------------------------------------------

Summary: (7.0.z) IndexOutOfBoundsException when trace logging
Key: JGRP-2175
URL: https://issues.jboss.org/browse/JGRP-2175
Project: JGroups
Issue Type: Bug
Affects Versions: 3.6.9
Reporter: Bartosz Spyrko-Śmietanko
Assignee: Bela Ban
Priority: Minor
Fix For: 3.6.11

When running with trace logging, I got a couple of these stack traces:

{code}
Exception in thread "OOB-1,test-NodeE-13479" java.lang.IndexOutOfBoundsException: Index: 4, Size: 2
    at java.util.ArrayList.rangeCheck(ArrayList.java:653)
    at java.util.ArrayList.get(ArrayList.java:429)
    at org.jgroups.protocols.pbcast.NAKACK2.handleMessages(NAKACK2.java:868)
    at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:705)
    at org.jgroups.stack.Protocol.up(Protocol.java:425)
    at org.jgroups.protocols.TP.passBatchUp(TP.java:1600)
    at org.jgroups.protocols.TP$BatchHandler.run(TP.java:1820)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}

It seems that part of the list of received messages is removed in handleMessages:864 in

{code}
boolean added=loopback || buf.add(msgs, oob, oob? DUMMY_OOB_MSG : null);
{code}

but the {{size}} is not recomputed afterwards.
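The failure mode described here, a size captured before the list shrinks, can be reproduced in isolation. The sketch below is an illustration with hypothetical names: `consume` merely stands in for a call that removes elements from the list as a side effect, and nothing here is NAKACK2 code.

```java
import java.util.List;
import java.util.Set;

// Hypothetical demo of iterating with a stale size versus a recomputed one.
final class StaleSizeDemo {

    // Stand-in for a call that removes some elements from the list as a side effect.
    static void consume(List<String> msgs, Set<String> alreadySeen) {
        msgs.removeIf(alreadySeen::contains);
    }

    // Buggy pattern: the size is read before the list shrinks and reused afterwards.
    static boolean staleSizeThrows(List<String> msgs, Set<String> alreadySeen) {
        int size = msgs.size();
        consume(msgs, alreadySeen);
        try {
            for (int i = 0; i < size; i++)
                msgs.get(i);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    // Fixed pattern: the size is recomputed after the list may have shrunk.
    static int visitSafely(List<String> msgs, Set<String> alreadySeen) {
        consume(msgs, alreadySeen);
        int size = msgs.size();
        int visited = 0;
        for (int i = 0; i < size; i++) {
            msgs.get(i);
            visited++;
        }
        return visited;
    }
}
```

Called with a mutable list of five messages of which three are removed, the stale loop fails at index 2, while the recomputed loop visits the two remaining messages.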


[JBoss JIRA] (WFCORE-2766) Application server must be reloaded when is updated credential reference of credential store. There isn't any information that it needs reload.

by Yeray Borges (JIRA)

[ https://issues.jboss.org/browse/WFCORE-2766?page=com.atlassian.jira.plugi... ]

Yeray Borges commented on WFCORE-2766:
--------------------------------------

This was discussed with [~pskopek], arriving at the following conclusions:
* There might be different implementations of the CS API which could be changed dynamically from outside, so a reload would not be required from the WF point of view
* Marking resources which refer to other CS as reload-required would create a mess between resources (e.g. resources referred to by other CS, which are in turn referred to by other CS ...), even if they are fine because they contain the same passwords.
* Reloading on each alias update/remove of any CS, even if it is not referred to, is not an ideal situation

For these reasons, it is left to the user to decide whether a reload is needed after updating an alias.

This issue will be resolved once these two issues are merged: WFCORE-2426 and WFCORE-2867

Once those issues are merged, at least if the user updates the credential-reference of one CS, a reload will be required.

> Application server must be reloaded when is updated credential reference of credential store. There isn't any information that it needs reload.
> -----------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: WFCORE-2766
> URL: https://issues.jboss.org/browse/WFCORE-2766
> Project: WildFly Core
> Issue Type: Bug
> Components: Security
> Reporter: Hynek Švábek
> Assignee: Yeray Borges
>
> The application server must be reloaded when the credential reference of a credential store is updated, but nothing indicates that a reload is needed.
> In the model, "restart-required" => "no-services", and the credential-reference update operation ends with a success message without any information about a reload.
> {code:collapse}
> "credential-reference" => {
>     "type" => OBJECT,
>     "description" => "Credential reference to be used to create protection parameter.",
>     "expressions-allowed" => false,
>     "required" => true,
>     "nillable" => false,
>     "access-constraints" => {"sensitive" => {"credential" => {"type" => "core"}}},
>     "value-type" => {
>         "store" => {
>             "type" => STRING,
>             "description" => "The name of the credential store holding the alias to credential.",
>             "expressions-allowed" => false,
>             "required" => false,
>             "nillable" => true,
>             "capability-reference" => "org.wildfly.security.credential-store",
>             "min-length" => 1L,
>             "max-length" => 2147483647L
>         },
>         "alias" => {
>             "type" => STRING,
>             "description" => "The alias which denotes stored secret or credential in the store.",
>             "expressions-allowed" => true,
>             "required" => false,
>             "nillable" => true,
>             "min-length" => 1L,
>             "max-length" => 2147483647L
>         },
>         "type" => {
>             "type" => STRING,
>             "description" => "The type of credential this reference is denoting.",
>             "expressions-allowed" => true,
>             "required" => false,
>             "nillable" => true,
>             "min-length" => 1L,
>             "max-length" => 2147483647L
>         },
>         "clear-text" => {
>             "type" => STRING,
>             "description" => "Secret specified using clear text. Check credential store way of supplying credential/secrets to services.",
>             "expressions-allowed" => true,
>             "required" => false,
>             "nillable" => true,
>             "min-length" => 1L,
>             "max-length" => 2147483647L
>         }
>     },
>     "access-type" => "read-write",
>     "storage" => "configuration",
>     "restart-required" => "no-services"
> },
> {code}

