[JBoss JIRA] (WFWIP-167) EAP Operator handling ConfigMap internally
by Martin Choma (Jira)
[ https://issues.jboss.org/browse/WFWIP-167?page=com.atlassian.jira.plugin.... ]
Martin Choma commented on WFWIP-167:
------------------------------------
Isn't this externalizing of configuration at odds with the infrastructure immutability concept?
So far we have the overriding feature in s2i [1], which keeps the infrastructure immutability pattern, but the current Operator does not have s2i support.
Also note that since there will be only one Operator at a time, we can't remove StandaloneConfigMapSpec once s2i becomes part of the operator. So the question now is: is it worth adding it in its current state?
[1] https://access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_ap...
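For context, the flow described in the issue can be sketched like this (the apiVersion, field names, and resource names below are assumptions based on the operator's StandaloneConfigMapSpec docs, not verified against a cluster):

{code:yaml}
# Hypothetical ConfigMap holding the custom standalone.xml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-standalone-config
data:
  standalone.xml: |
    <server>
      <!-- custom configuration -->
    </server>
---
# Hypothetical WildFlyServer resource pointing the operator at that ConfigMap
apiVersion: wildfly.org/v1alpha1
kind: WildFlyServer
metadata:
  name: my-app
spec:
  applicationImage: my-app-image
  size: 1
  standaloneConfigMap:
    name: my-standalone-config
    key: standalone.xml
{code}

Updating the configuration then means editing the ConfigMap (or adding a new key such as standalone.xml.v2 and repointing key), which is exactly the indirection questioned above.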
> EAP Operator handling ConfigMap internally
> ------------------------------------------
>
> Key: WFWIP-167
> URL: https://issues.jboss.org/browse/WFWIP-167
> Project: WildFly WIP
> Issue Type: Bug
> Components: OpenShift
> Reporter: Martin Choma
> Assignee: Jeff Mesnil
> Priority: Major
>
> If I understand the description in [1] correctly, to specify a custom standalone.xml I have to create a ConfigMap containing standalone.xml first and afterwards link the operator to this ConfigMap.
> Would it be possible to handle the creation of the ConfigMap and the storing of standalone.xml for me? Ideally I would just specify a file URI where the custom standalone.xml is located. This location would have to be accessible from the operator pod. In this way we could look at it as hiding internals (implementation details) from users.
> Currently, when users want to change standalone.xml they do it in the ConfigMap, not in the operator. When changing standalone.xml through the ConfigMap, I assume the pod has to be restarted manually. The operator could do that for me.
> However, this could be triggered by storing the newer version of standalone.xml under another key, e.g. `standalone.xml.v2`, and changing `StandaloneConfigMapSpec.key` in the operator.
> What do you think? Have you considered this approach?
> [1] https://github.com/wildfly/wildfly-operator/blob/master/doc/apis.adoc#sta...
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
4 years, 8 months
[JBoss JIRA] (JGRP-2362) Providing logical member name in JDBC_PING
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2362?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-2362:
--------------------------------
I fail to understand what your problem is. Can you illustrate this with a step-by-step example?
> Providing logical member name in JDBC_PING
> ------------------------------------------
>
> Key: JGRP-2362
> URL: https://issues.jboss.org/browse/JGRP-2362
> Project: JGroups
> Issue Type: Feature Request
> Affects Versions: 4.0.17, 4.0.18, 4.0.19, 4.1.0, 4.0.20
> Reporter: S Pokutniy
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 4.1.2
>
>
> When using JDBC_PING with logical names instead of UUIDs, if one of the cluster members crashes or gets killed while it is not the coordinator, its entry remains in the database until the coordinator changes (independently of remove_old_coords_on_view_change / remove_all_data_on_view_change). If the cluster is then restarted, the old entry makes connect() much slower (+30 seconds), as the members seem to be trying to connect to it. The remove_all_data_on_view_change parameter seems to be the solution, but it does not work as long as the coordinator does not change, so in practice it behaves the same as remove_old_coords_on_view_change.
> The only solution seems to be to provide an appropriate delete statement in the initialize_sql parameter, which would delete the old entry, for example: delete from JGROUPSPING where ping_data like '%logical name%'. However, this is neither quick nor ideal, as ping_data's datatype is bytea or bit varying.
> It would be great to also have the logical name in JGROUPSPING, which insert() does not add by default. This is also easy to implement, as this information is available in PingData.
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
4 years, 8 months
[JBoss JIRA] (JGRP-2361) Error related to Jgroup and Database connection is getting reset
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2361?page=com.atlassian.jira.plugin.... ]
Bela Ban edited comment on JGRP-2361 at 7/31/19 7:09 AM:
---------------------------------------------------------
Could be that you're running different versions of JGroups (cookie sent by /172.26.235.231:43565 does not match own cookie; terminating connection). What's your JGroups config (jgroups-tcp.xml)?
was (Author: belaban):
Could be that you're running different versions of JGroups (cookie sent by /172.26.235.231:43565 does not match own cookie; terminating connection). What's your JGroups config?
> Error related to Jgroup and Database connection is getting reset
> ----------------------------------------------------------------
>
> Key: JGRP-2361
> URL: https://issues.jboss.org/browse/JGRP-2361
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.11
> Environment: Hybris running on tomcat - Centos 7
> Reporter: karthikeyan Aruljothi
> Assignee: Bela Ban
> Priority: Major
> Attachments: Jgroup error in preprod-000.txt, Jgroups blocking and terminating connection.txt, Jgroups error in console.txt, error Jgroups.txt
>
>
> Hi,
> We are facing an issue with our cluster configuration; because of it the JVM response time also increases, and only after clearing the cache / restarting all nodes does the application work as expected.
> When the issue arises, one of the cores sits at 100% CPU utilization and the server has to be restarted, otherwise it never processes any request. Below is our configuration in local.properties. We are also attaching the error logs; they show errors related to JGroups blocking and connections being terminated between nodes.
> Please share your inputs on what exactly is causing the slowness and then blocking the whole server.
> Attached are the cluster configuration for each node and the error logs.
> In addition, we get the errors below while deploying/restarting the servers:
> WARN [localhost-startStop-1] [GMS] hybrisnode-0: JOIN(hybrisnode-0) sent to hybrisnode-2 timed out (after 3000 ms), on try 3
> WARN [pool-3-thread-1] [GMS] hybrisnode-3: JOIN(hybrisnode-3) sent to hybrisnode-1 timed out (after 3000 ms), on try 4
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
4 years, 8 months
[JBoss JIRA] (JGRP-2361) Error related to Jgroup and Database connection is getting reset
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2361?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-2361:
--------------------------------
Could be that you're running different versions of JGroups (cookie sent by /172.26.235.231:43565 does not match own cookie; terminating connection). What's your JGroups config?
> Error related to Jgroup and Database connection is getting reset
> ----------------------------------------------------------------
>
> Key: JGRP-2361
> URL: https://issues.jboss.org/browse/JGRP-2361
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.11
> Environment: Hybris running on tomcat - Centos 7
> Reporter: karthikeyan Aruljothi
> Assignee: Bela Ban
> Priority: Major
> Attachments: Jgroup error in preprod-000.txt, Jgroups blocking and terminating connection.txt, Jgroups error in console.txt, error Jgroups.txt
>
>
> Hi,
> We are facing an issue with our cluster configuration; because of it the JVM response time also increases, and only after clearing the cache / restarting all nodes does the application work as expected.
> When the issue arises, one of the cores sits at 100% CPU utilization and the server has to be restarted, otherwise it never processes any request. Below is our configuration in local.properties. We are also attaching the error logs; they show errors related to JGroups blocking and connections being terminated between nodes.
> Please share your inputs on what exactly is causing the slowness and then blocking the whole server.
> Attached are the cluster configuration for each node and the error logs.
> In addition, we get the errors below while deploying/restarting the servers:
> WARN [localhost-startStop-1] [GMS] hybrisnode-0: JOIN(hybrisnode-0) sent to hybrisnode-2 timed out (after 3000 ms), on try 3
> WARN [pool-3-thread-1] [GMS] hybrisnode-3: JOIN(hybrisnode-3) sent to hybrisnode-1 timed out (after 3000 ms), on try 4
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
4 years, 8 months
[JBoss JIRA] (WFCORE-4525) Fix failing tests on IBM JDK
by Richard Opalka (Jira)
[ https://issues.jboss.org/browse/WFCORE-4525?page=com.atlassian.jira.plugi... ]
Richard Opalka commented on WFCORE-4525:
----------------------------------------
SilentModeTestCase started to pass on the IBM JDK [~jamezp].
Did you mean that both
* CLIEmbedHostControllerTestCase
* CLIEmbedServerTestCase
should be annotated with @Ignore when an IBM JDK is detected?
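A runtime check tends to age better than a hard-coded @Ignore. A minimal sketch of such a check (the class name and property heuristics below are assumptions, not the test suite's actual mechanism):

{code:java}
public class IbmJdkCheck {
    // Heuristic: IBM/OpenJ9 JDKs identify themselves via these system properties
    // ("IBM J9 VM" / "Eclipse OpenJ9 VM" in java.vm.name, "IBM Corporation" in java.vendor).
    static boolean isIbmJdk() {
        String vendor = System.getProperty("java.vendor", "");
        String vmName = System.getProperty("java.vm.name", "");
        return vendor.contains("IBM") || vmName.contains("IBM") || vmName.contains("J9");
    }

    public static void main(String[] args) {
        System.out.println("IBM JDK detected: " + isIbmJdk());
    }
}
{code}

The flag could then drive JUnit's Assume.assumeFalse(...) in a @Before method, so the tests report as skipped on an IBM JDK instead of being unconditionally ignored everywhere.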
> Fix failing tests on IBM JDK
> ----------------------------
>
> Key: WFCORE-4525
> URL: https://issues.jboss.org/browse/WFCORE-4525
> Project: WildFly Core
> Issue Type: Bug
> Components: Logging
> Reporter: Richard Opalka
> Assignee: James Perkins
> Priority: Major
> Attachments: forked-booter.png, ibm-jdk8.png, oracle-jdk.png
>
>
> The following tests are failing on latest IBM JDK 8:
> ---
> # testsuite/standalone
> SilentModeTestCase
> # testsuite/manualmode
> CLIEmbedHostControllerTestCase
> CLIEmbedServerTestCase
> ---
> Tested on:
> ---
> java version "1.8.0_211"
> Java(TM) SE Runtime Environment (build 8.0.5.36 - pxa6480sr5fp36-20190510_01(SR5 FP36))
> IBM J9 VM (build 2.9, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20190502_415899 (JIT enabled, AOT enabled)
> OpenJ9 - 46e57f9
> OMR - 06a046a
> IBM - 0b909bf)
> JCL - 20190409_01 based on Oracle jdk8u211-b25
> ---
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
4 years, 8 months
[JBoss JIRA] (WFCORE-4525) Fix failing tests on IBM JDK
by Richard Opalka (Jira)
[ https://issues.jboss.org/browse/WFCORE-4525?page=com.atlassian.jira.plugi... ]
Richard Opalka edited comment on WFCORE-4525 at 7/31/19 6:54 AM:
-----------------------------------------------------------------
SilentModeTestCase started to pass on the IBM JDK [~jamezp].
Did you mean that both
* CLIEmbedHostControllerTestCase
* CLIEmbedServerTestCase
should be annotated with @Ignore when an IBM JDK is detected?
was (Author: ropalka):
SilentModeTestCase started to pass in IBM JDK [~jamezp].
You meant that both
* CLIEmbedHostControllerTestCase
* CLIEmbedServerTestCase
should be annotated with @Ignore if IBM JDK is detected?
> Fix failing tests on IBM JDK
> ----------------------------
>
> Key: WFCORE-4525
> URL: https://issues.jboss.org/browse/WFCORE-4525
> Project: WildFly Core
> Issue Type: Bug
> Components: Logging
> Reporter: Richard Opalka
> Assignee: James Perkins
> Priority: Major
> Attachments: forked-booter.png, ibm-jdk8.png, oracle-jdk.png
>
>
> The following tests are failing on latest IBM JDK 8:
> ---
> # testsuite/standalone
> SilentModeTestCase
> # testsuite/manualmode
> CLIEmbedHostControllerTestCase
> CLIEmbedServerTestCase
> ---
> Tested on:
> ---
> java version "1.8.0_211"
> Java(TM) SE Runtime Environment (build 8.0.5.36 - pxa6480sr5fp36-20190510_01(SR5 FP36))
> IBM J9 VM (build 2.9, JRE 1.8.0 Linux amd64-64-Bit Compressed References 20190502_415899 (JIT enabled, AOT enabled)
> OpenJ9 - 46e57f9
> OMR - 06a046a
> IBM - 0b909bf)
> JCL - 20190409_01 based on Oracle jdk8u211-b25
> ---
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
4 years, 8 months
[JBoss JIRA] (JGRP-2364) simply lock and unlock JGroups lock repeatedly will create chaos
by Yong Deng (Jira)
[ https://issues.jboss.org/browse/JGRP-2364?page=com.atlassian.jira.plugin.... ]
Yong Deng commented on JGRP-2364:
---------------------------------
Following is the temporary fix on my side in org.jgroups.protocols.Locking.ClientLock#_unlock. I added a 10-second timeout so that we don't hang forever when we hit other JGroups bugs and RELEASE_LOCK_OK is never sent back.
{code:java}
protected synchronized void _unlock(boolean force) {
    if(!acquired && !denied && !force)
        return;
    this.timeout=0;
    this.is_trylock=false;
    if(!denied) {
        if(!force)
            addToPendingReleaseRequests(this);
        sendReleaseLockRequest(name, lock_id, owner); // lock is released on the RELEASE_LOCK_OK response
        if(force && removeClientLock(name, owner))
            notifyLockDeleted(name);
        if(!force) {
            // unlock() returns only once RELEASE_LOCK_OK is received or the timeout elapses
            long timeLeft=10000;
            while(acquired || denied) {
                long start=System.currentTimeMillis();
                try {
                    wait(timeLeft);
                }
                catch(InterruptedException ie) {
                    break;
                }
                long duration=System.currentTimeMillis() - start;
                if(duration > 0)
                    timeLeft-=duration;
                if(timeLeft <= 0) {
                    if(log.isWarnEnabled())
                        log.warn(format("[%s]: timed out waiting for the RELEASE_LOCK_OK response for lock=%s", local_addr, this));
                    // defensive fix: clear the state so callers don't loop forever
                    acquired=denied=false;
                    break;
                }
            }
        }
    }
    else
        _unlockOK();
}
{code}
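The bounded-wait idiom the patch relies on can be isolated into a self-contained sketch (all names below are made up; plain Java monitors, no JGroups types involved):

{code:java}
public class BoundedWait {
    private boolean released; // stands in for "RELEASE_LOCK_OK received"

    // Wait up to timeoutMs for release(); returns false if the ack never arrives,
    // which is the defensive path that keeps _unlock() from hanging forever.
    public synchronized boolean awaitRelease(long timeoutMs) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!released) {
            long left = deadline - System.currentTimeMillis();
            if (left <= 0)
                return false; // timed out: give up instead of waiting forever
            wait(left);       // releases the monitor; woken up by release()
        }
        return true;
    }

    public synchronized void release() {
        released = true;
        notifyAll();
    }

    public static void main(String[] args) throws InterruptedException {
        BoundedWait ok = new BoundedWait();
        new Thread(ok::release).start();
        System.out.println(ok.awaitRelease(1000));               // ack arrives: true
        System.out.println(new BoundedWait().awaitRelease(100)); // no ack: false
    }
}
{code}

The deadline-based loop also guards against spurious wakeups, which a single wait(timeout) call would not.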
> simply lock and unlock JGroups lock repeatedly will create chaos
> ----------------------------------------------------------------
>
> Key: JGRP-2364
> URL: https://issues.jboss.org/browse/JGRP-2364
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.1.1
> Environment: JDK: 1.8
> JGroups: 4.1.1
> Lock: CENTRAL_LOCK
> Reporter: Yong Deng
> Assignee: Bela Ban
> Priority: Major
> Fix For: 4.1.2
>
> Attachments: LockSimpleTest.java
>
>
> I have one simple use case to reproduce the issue: in the same thread, just lock/unlock the lock repeatedly. Turn the log level to TRACE and you will see the communication chaos between the client and the coordinator. *JGroups' unlock currently returns immediately after sending out RELEASE_LOCK. Why doesn't unlock wait and return only after receiving the RELEASE_LOCK_OK response?*
> * Current log:
> {code:java}
> 16:56:40,399 TRACE [CENTRAL_LOCK] A --> A: GRANT_LOCK[sample-lock, lock_id=1, owner=A::31, trylock, timeout=10000]
> 16:56:40,404 TRACE [CENTRAL_LOCK] A <-- A: GRANT_LOCK[sample-lock, lock_id=1, owner=A::31, trylock, timeout=10000, sender=A]
> 16:56:40,410 TRACE [CENTRAL_LOCK] A --> A: LOCK_GRANTED[sample-lock, lock_id=1, owner=A::31]
> 16:56:40,411 TRACE [CENTRAL_LOCK] A <-- A: LOCK_GRANTED[sample-lock, lock_id=1, owner=A::31, sender=A]
> 16:56:40,413 TRACE [CENTRAL_LOCK] A --> A: RELEASE_LOCK[sample-lock, lock_id=1, owner=A::31]
> 16:56:40,414 TRACE [CENTRAL_LOCK] A <-- A: RELEASE_LOCK[sample-lock, lock_id=1, owner=A::31, sender=A]
> 16:56:40,414 TRACE [CENTRAL_LOCK] A --> A: RELEASE_LOCK[sample-lock, lock_id=1, owner=A::31]
> 16:56:40,415 TRACE [CENTRAL_LOCK] A --> A: RELEASE_LOCK_OK[sample-lock, lock_id=1, owner=A::31]
> 16:56:40,415 TRACE [CENTRAL_LOCK] A --> A: RELEASE_LOCK[sample-lock, lock_id=1, owner=A::31]
> {code}
> * The expected log:
> {code:java}
> 2019-07-24 17:01:52,849 TRACE [org.jgroups.protocols.CENTRAL_LOCK] [A] --> [A] GRANT_LOCK [sample-lock, lock_id=1, owner=A::63, trylock (timeout=10000)
> 2019-07-24 17:01:52,849 TRACE [org.jgroups.protocols.CENTRAL_LOCK] [A] <-- [A] GRANT_LOCK [sample-lock, lock_id=1, owner=A::63, trylock (timeout=10000)
> 2019-07-24 17:01:52,852 TRACE [org.jgroups.protocols.CENTRAL_LOCK] [A] --> [A] LOCK_GRANTED [sample-lock, lock_id=1, owner=A::63 ]
> 2019-07-24 17:01:52,852 TRACE [org.jgroups.protocols.CENTRAL_LOCK] [A] <-- [A] LOCK_GRANTED [sample-lock, lock_id=1, owner=A::63 ]
> 2019-07-24 17:01:52,853 TRACE [org.jgroups.protocols.CENTRAL_LOCK] [A] --> [A] RELEASE_LOCK [sample-lock, lock_id=1, owner=A::63 ]
> 2019-07-24 17:01:52,853 TRACE [org.jgroups.protocols.CENTRAL_LOCK] [A] <-- [A] RELEASE_LOCK [sample-lock, lock_id=1, owner=A::63 ]
> 2019-07-24 17:01:52,853 TRACE [org.jgroups.protocols.CENTRAL_LOCK] [A] --> [A] RELEASE_LOCK_OK [sample-lock, lock_id=1, owner=A::63 ]
> 2019-07-24 17:01:52,854 TRACE [org.jgroups.protocols.CENTRAL_LOCK] [A] <-- [A] RELEASE_LOCK_OK [sample-lock, lock_id=1, owner=A::63 ]
> {code}
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
4 years, 8 months