[JBoss JIRA] (WFLY-12342) Integrate server probes in MP Health readiness check
by Jeff Mesnil (Jira)
Jeff Mesnil created WFLY-12342:
----------------------------------
Summary: Integrate server probes in MP Health readiness check
Key: WFLY-12342
URL: https://issues.jboss.org/browse/WFLY-12342
Project: WildFly
Issue Type: Feature Request
Components: MP Health
Reporter: Jeff Mesnil
Assignee: Jeff Mesnil
MicroProfile Health 2.0 introduced a readiness health check.
WFLY-12228 added support for MicroProfile Health 2.0 without enhancing the readiness health check.
This RFE enhances it with 3 probes (that are part of the EAP readiness probe script at [1]):
* server-status - returns UP when the server-state is READY
* boot-errors - returns UP if there are no boot-errors
* deployment-status - returns UP if status for all deployments is OK
[1] https://github.com/jboss-container-images/jboss-eap-modules/blob/251f422c...
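A minimal sketch of how the three probes above could be combined into one readiness result. It is a plain-Java model, not the MicroProfile Health or WildFly API: the class `ReadinessSketch`, its methods, and the input values are hypothetical, and the `READY` state and `OK` deployment status are taken from the wording of this RFE.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical model of the three probes; not the MicroProfile Health API.
final class ReadinessSketch {

    enum Status { UP, DOWN }

    // Builds the three probes from plain values supplied by the caller.
    static Map<String, BooleanSupplier> serverProbes(String serverState,
                                                     List<String> bootErrors,
                                                     Map<String, String> deploymentStatuses) {
        Map<String, BooleanSupplier> probes = new LinkedHashMap<>();
        // UP when the server state is READY (wording taken from the RFE).
        probes.put("server-status", () -> "READY".equals(serverState));
        // UP when no boot errors were recorded.
        probes.put("boot-errors", bootErrors::isEmpty);
        // UP when every deployment reports OK (also UP when nothing is deployed).
        probes.put("deployment-status",
                () -> deploymentStatuses.values().stream().allMatch("OK"::equals));
        return probes;
    }

    // Evaluates each probe and keeps the results in insertion order.
    static Map<String, Status> evaluate(Map<String, BooleanSupplier> probes) {
        Map<String, Status> results = new LinkedHashMap<>();
        probes.forEach((name, probe) ->
                results.put(name, probe.getAsBoolean() ? Status.UP : Status.DOWN));
        return results;
    }

    // The overall readiness is UP only if every probe is UP.
    static Status overall(Map<String, Status> results) {
        return results.values().stream().allMatch(s -> s == Status.UP)
                ? Status.UP : Status.DOWN;
    }

    public static void main(String[] args) {
        Map<String, Status> results = evaluate(
                serverProbes("READY", List.of(), Map.of("app.war", "OK")));
        System.out.println(results + " -> " + overall(results));
    }
}
```

In this model a single failing probe is enough to report the server as not ready, which matches the usual all-checks-must-pass reading of a readiness check.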
--
This message was sent by Atlassian Jira
(v7.12.1#712002)
4 years, 8 months
[JBoss JIRA] (WFLY-12337) Wildfly Transaction REQUIRES_NEW not working
by Jive JIRA Integration (Jira)
[ https://issues.jboss.org/browse/WFLY-12337?page=com.atlassian.jira.plugin... ]
Jive JIRA Integration updated WFLY-12337:
-----------------------------------------
Forum Reference: https://developer.jboss.org/message/990310#990310
> Wildfly Transaction REQUIRES_NEW not working
> --------------------------------------------
>
> Key: WFLY-12337
> URL: https://issues.jboss.org/browse/WFLY-12337
> Project: WildFly
> Issue Type: Bug
> Components: JCA, Transactions
> Affects Versions: 16.0.0.Final
> Reporter: Hao Chen
> Assignee: Thomas Jenkinson
> Priority: Major
>
> We recently upgraded from jboss-eap-6.4 to Wildfly 16. Most things worked fine until we found one issue: a method that is called using TransactionTemplate with TransactionDefinition.PROPAGATION_REQUIRES_NEW is not working.
> The case is this: we use this PROPAGATION_REQUIRES_NEW transaction to commit a log of a login failure, while the main transaction is rolled back due to a login exception.
> This used to work without problems in jboss-eap-6.4, though we were seeing warnings like:
> Trying to change transaction TransactionImple < ac, BasicAction: 0:ffff0a004b01:513406f2:5d37ce0b:2a0 status: ActionStatus.RUNNING > in enlist!
> But in Wildfly 16 PROPAGATION_REQUIRES_NEW doesn't work any more: the new transaction seems to be rolled back together with the main transaction. After spending quite some time debugging the Wildfly/JBoss code, I found that in the org.jboss.jca.core.connectionmanager.listener.TxConnectionListener.enlist method, there is a new line:
> if (isEnlisted() || getState().equals(ConnectionState.DESTROY) || getState().equals(ConnectionState.DESTROYED))
> return;
> In our case, the TxConnectionListener object is the same in the new transaction as in the old transaction, so it is already marked as "enlisted". The rest of enlist is therefore never executed for the new transaction, so Wildfly never issues an "XA Start" command for it.
> So the question is: is the added isEnlisted() check a newly introduced bug, or did we do something wrong when configuring Wildfly and/or Spring? How should the REQUIRES_NEW case work?
> Here is a trace excerpt I captured:
> TransactionTemplate.execute(TransactionCallback<T>) line: 130
> JtaTransactionManager(AbstractPlatformTransactionManager).getTransaction(TransactionDefinition) line: 353
> JtaTransactionManager(AbstractPlatformTransactionManager).handleExistingTransaction(TransactionDefinition, Object, boolean) line: 433
> JtaTransactionManager.doBegin(Object, TransactionDefinition) line: 831
> JtaTransactionManager.doJtaBegin(JtaTransactionObject, TransactionDefinition) line: 872
> LocalUserTransaction.begin() line: 48
> ContextTransactionManager.begin(CreationListener$CreatedBy) line: 62
> LocalTransactionContext.beginTransaction(int, boolean, CreationListener$CreatedBy) line: 188
> JBossJTALocalTransactionProvider(JBossLocalTransactionProvider).createNewTransaction(int)
> ...
> ContextTransactionManager.resume(AbstractTransaction) line: 158
> LocalTransaction.resume() line: 248
> TransactionManagerDelegate(BaseTransactionManagerDelegate).resume(Transaction) line: 122
> TransactionManagerImple.resume(Transaction) line: 111
> AtomicAction.resume(AtomicAction) line: 361
> LocalTransaction(AbstractTransaction).notifyAssociationListeners(boolean) line: 115
> TransactionManagerService$2.associationChanged(AbstractTransaction, boolean) line: 97
> UserTransactionRegistry.userTransactionStarted() line: 119
> UserTransactionListenerImpl.userTransactionStarted() line: 52
> CachedConnectionManagerImpl.userTransactionStarted() line: 249
> TxConnectionManagerImpl.transactionStarted(Collection<ConnectionRecord>) line: 460
> TxConnectionListener.enlist() line: 264
> !-- changed in ironjacamar-core-impl-1.2
> // If we are already enlisted there is no reason to check again, as this method
> // could be called multiple times during a transaction lifecycle.
> // We know that we can only be inside this method if we are allowed to
> if (isEnlisted() || getState().equals(ConnectionState.DESTROY) || getState().equals(ConnectionState.DESTROYED))
> return;
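The behaviour described in the report can be sketched with a small model. This is a simplified illustration only: `EnlistSketch`, `Listener`, and `run` are hypothetical names, not IronJacamar classes, and the model keeps just the early-return guard and a record of which transactions would receive an XA start.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the reported behaviour.
// It is not the IronJacamar TxConnectionListener.
final class EnlistSketch {

    static final class Listener {
        private boolean enlisted;
        // Transactions for which this model would issue an XA start.
        final List<String> xaStarts = new ArrayList<>();

        // Mirrors the reported guard: an already-enlisted listener returns early.
        void enlist(String txId) {
            if (enlisted) {
                return; // no XA start is recorded for txId
            }
            xaStarts.add(txId);
            enlisted = true;
        }
    }

    // Enlists an outer transaction and then a nested REQUIRES_NEW transaction.
    // With a shared listener the second enlist hits the guard; with a separate
    // listener per transaction both enlistments are recorded.
    static List<String> run(boolean shareListener) {
        Listener outerListener = new Listener();
        Listener innerListener = shareListener ? outerListener : new Listener();

        outerListener.enlist("outer");
        innerListener.enlist("inner");

        List<String> all = new ArrayList<>(outerListener.xaStarts);
        if (!shareListener) {
            all.addAll(innerListener.xaStarts);
        }
        return all;
    }

    public static void main(String[] args) {
        System.out.println("shared listener:   " + run(true));
        System.out.println("separate listener: " + run(false));
    }
}
```

In the shared-listener case only the outer transaction is recorded, which corresponds to the reporter's observation that no "XA Start" is issued for the new transaction. Whether the real listener should be shared at that point is exactly the open question in the report; the model takes no position on it.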
--
[JBoss JIRA] (WFCORE-4296) Illegal reflective access by org.wildfly.extension.elytron.SSLDefinitions
by Jaikiran Pai (Jira)
[ https://issues.jboss.org/browse/WFCORE-4296?page=com.atlassian.jira.plugi... ]
Jaikiran Pai commented on WFCORE-4296:
--------------------------------------
[~heruan], maybe your issue is related to https://issues.jboss.org/browse/WFCORE-4585
If it's the same, then the workaround description in that JIRA might help you get past this.
> Illegal reflective access by org.wildfly.extension.elytron.SSLDefinitions
> -------------------------------------------------------------------------
>
> Key: WFCORE-4296
> URL: https://issues.jboss.org/browse/WFCORE-4296
> Project: WildFly Core
> Issue Type: Bug
> Components: Security
> Environment: Windows 7 x64. Java 11: OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11+28-201810022317, mixed mode)
> Reporter: Marco Del Percio
> Priority: Major
> Labels: Java11, access, elytron, illegal, reflective, wildfly
>
> After configuring HTTPS using the following guide: [Enable One-way SSL/TLS for Applications|http://docs.wildfly.org/14/WildFly_Elytron_Security.html#con...], the configuration seems OK and the server boots fine. However, an illegal reflective access warning is reported for a jar within Elytron:
> {color:red}
> WARNING: An illegal reflective access operation has occurred
> WARNING: Illegal reflective access by org.wildfly.extension.elytron.SSLDefinitions (jar:file:/D:/wildfly-14.0.1.Final_FleetManager/modules/system/layers/base/org/wildfly/extension/elytron/main/wildfly-elytron-integration-6.0.2.Final.jar!/) to method com.sun.net.ssl.internal.ssl.Provider.isFIPS()
> WARNING: Please consider reporting this to the maintainers of org.wildfly.extension.elytron.SSLDefinitions
> WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
> WARNING: All illegal access operations will be denied in a future release
> {color}
--
[JBoss JIRA] (JGRP-2361) Error related to Jgroup and Database connection is getting reset
by karthikeyan Aruljothi (Jira)
[ https://issues.jboss.org/browse/JGRP-2361?page=com.atlassian.jira.plugin.... ]
karthikeyan Aruljothi commented on JGRP-2361:
---------------------------------------------
I found this jar in our project: jgroups-3.6.11.Final.jar, so the version is 3.6.11.
I attached jgroups-tcp.xml and the node configuration details for your reference.
[^Jgroup node configuration.txt]
[^jgroups-tcp.xml]
In addition, we are getting the connect timed out exception below from one of the servers. It occurs while the servers are starting.
08/01 01:58:30.300 | at org.jgroups.protocols.TransferQueueBundler.run(TransferQueueBundler.java:105) [jgroups-3.6.11.Final.jar:3.6.11.Final]
INFO | jvm 1 | main | 2019/08/01 01:58:30.300 | at java.lang.Thread.run(Thread.java:748) [?:1.8.0_211]
INFO | jvm 1 | main | 2019/08/01 01:58:30.600 | ERROR [TransferQueueBundler,hybris-broadcast,hybrisnode-0] [TCP] JGRP000036: hybrisnode-0: exception sending bundled msgs: java.net.SocketTimeoutException: connect timed out
INFO | jvm 1 | main | 2019/08/01 01:58:30.600 | java.net.SocketTimeoutException: connect timed out
INFO | jvm 1 | main | 2019/08/01 01:58:30.600 | at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_211]
INFO | jvm 1 | main | 2019/08/01 01:58:30.600 | at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_211]
INFO | jvm 1 | main | 2019/08/01 01:58:30.600 | at java.net.
> Error related to Jgroup and Database connection is getting reset
> ----------------------------------------------------------------
>
> Key: JGRP-2361
> URL: https://issues.jboss.org/browse/JGRP-2361
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.11
> Environment: Hybris running on tomcat - Centos 7
> Reporter: karthikeyan Aruljothi
> Assignee: Bela Ban
> Priority: Major
> Attachments: Jgroup error in preprod-000.txt, Jgroup node configuration.txt, Jgroups blocking and terminating connection.txt, Jgroups error in console.txt, error Jgroups.txt, jgroups-tcp.xml
>
>
> Hi,
> We are facing an issue with our cluster configuration that also makes the JVM slow to respond. After clearing the cache or restarting all nodes, the application works as expected.
> When the issue arises, one of the cores sits at 100% CPU utilization. At that point the server has to be restarted, otherwise it never processes any requests. Below is our configuration in local.properties. The error logs are provided as attachments; they show errors related to JGroups blocking and connections being terminated between nodes.
> Please let us know what exactly is causing the slowness and then blocking the whole server.
> Attached are the cluster configuration for each node and the error logs.
> In addition, we get the errors below while deploying or restarting the servers:
> WARN [localhost-startStop-1] [GMS] hybrisnode-0: JOIN(hybrisnode-0) sent to hybrisnode-2 timed out (after 3000 ms), on try 3
> WARN [pool-3-thread-1] [GMS] hybrisnode-3: JOIN(hybrisnode-3) sent to hybrisnode-1 timed out (after 3000 ms), on try 4
--
[JBoss JIRA] (JGRP-2361) Error related to Jgroup and Database connection is getting reset
by karthikeyan Aruljothi (Jira)
[ https://issues.jboss.org/browse/JGRP-2361?page=com.atlassian.jira.plugin.... ]
karthikeyan Aruljothi updated JGRP-2361:
----------------------------------------
Attachment: Jgroup node configuration.txt
> Error related to Jgroup and Database connection is getting reset
> ----------------------------------------------------------------
>
> Key: JGRP-2361
> URL: https://issues.jboss.org/browse/JGRP-2361
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.11
> Environment: Hybris running on tomcat - Centos 7
> Reporter: karthikeyan Aruljothi
> Assignee: Bela Ban
> Priority: Major
> Attachments: Jgroup error in preprod-000.txt, Jgroup node configuration.txt, Jgroups blocking and terminating connection.txt, Jgroups error in console.txt, error Jgroups.txt, jgroups-tcp.xml
--
[JBoss JIRA] (JGRP-2361) Error related to Jgroup and Database connection is getting reset
by karthikeyan Aruljothi (Jira)
[ https://issues.jboss.org/browse/JGRP-2361?page=com.atlassian.jira.plugin.... ]
karthikeyan Aruljothi updated JGRP-2361:
----------------------------------------
Attachment: jgroups-tcp.xml
> Error related to Jgroup and Database connection is getting reset
> ----------------------------------------------------------------
>
> Key: JGRP-2361
> URL: https://issues.jboss.org/browse/JGRP-2361
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 3.6.11
> Environment: Hybris running on tomcat - Centos 7
> Reporter: karthikeyan Aruljothi
> Assignee: Bela Ban
> Priority: Major
> Attachments: Jgroup error in preprod-000.txt, Jgroups blocking and terminating connection.txt, Jgroups error in console.txt, error Jgroups.txt, jgroups-tcp.xml
--
[JBoss JIRA] (JGRP-2234) Unlocked locks stay locked forever
by Bela Ban (Jira)
[ https://issues.jboss.org/browse/JGRP-2234?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-2234:
--------------------------------
Nope. But since I do have quite a few lock-related issues in 4.1.2, I'll get to them after a jgroups-raft issue I need to fix first...
> Unlocked locks stay locked forever
> ----------------------------------
>
> Key: JGRP-2234
> URL: https://issues.jboss.org/browse/JGRP-2234
> Project: JGroups
> Issue Type: Bug
> Reporter: Bram Klein Gunnewiek
> Assignee: Bela Ban
> Priority: Major
> Fix For: 4.0.11, 3.6.18
>
> Attachments: ClusterSplitLockTest.java, jg_clusterlock_output_testfail.txt
>
>
> As discussed on the mailing list, we have issues where locks from the central lock protocol stay locked forever when the coordinator of the cluster disconnects. We can reproduce this with the attached ClusterSplitLockTest.java. It's a race condition, and we need to run the test many times (sometimes > 20) before we encounter a failure.
> What we think is happening:
> In a three-node cluster (nodes A, B and C, where node A is the coordinator), unlock requests from B and/or C can be missed when node A leaves and B and/or C don't have the new view installed yet. When, for example, node B takes over coordination, it creates the lock table based on the backups. Let's say node C has locked the lock named 'lockX'. Node C performs an unlock of 'lockX' just after node A (gracefully) leaves and sends the unlock request to node A, since node C doesn't have the correct view installed yet. Node B has recreated the lock table, in which 'lockX' is locked by node C. Node C doesn't resend the unlock request, so 'lockX' stays locked forever.
> Attached are the TestNG test we wrote and the output of a test failure.
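The suspected sequence can be replayed deterministically with a small model. This sketch does not use JGroups: `LostUnlockSketch`, `Node`, and `simulate` are hypothetical names, and the `resendAfterViewChange` flag exists only to contrast the reported behaviour (no resend) with what the model does if node C repeats the request to the new coordinator.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the suspected race; it does not use JGroups.
final class LostUnlockSketch {

    static final class Node {
        final String name;
        final Map<String, String> lockTable = new HashMap<>(); // lock name -> owner
        boolean alive = true;

        Node(String name) {
            this.name = name;
        }

        // A coordinator applies an unlock; a node that has left drops the request.
        boolean handleUnlock(String lock, String owner) {
            if (!alive) {
                return false; // request is lost
            }
            lockTable.remove(lock, owner);
            return true;
        }
    }

    // Returns the lock table of the new coordinator B after the scenario.
    static Map<String, String> simulate(boolean resendAfterViewChange) {
        Node a = new Node("A"); // original coordinator
        Node b = new Node("B"); // takes over coordination

        // Node C holds lockX; B keeps a backup of the coordinator's lock table.
        a.lockTable.put("lockX", "C");
        Map<String, String> backupOnB = new HashMap<>(a.lockTable);

        // A leaves and B rebuilds the lock table from its backup.
        a.alive = false;
        b.lockTable.putAll(backupOnB);

        // C still has the old view, so its unlock goes to A and is lost.
        Node coordinatorSeenByC = a;
        boolean delivered = coordinatorSeenByC.handleUnlock("lockX", "C");

        // Contrast case only: C repeats the unlock once it sees the new view.
        if (!delivered && resendAfterViewChange) {
            b.handleUnlock("lockX", "C");
        }
        return b.lockTable;
    }

    public static void main(String[] args) {
        System.out.println("no resend:   " + simulate(false));
        System.out.println("with resend: " + simulate(true));
    }
}
```

Without the resend, B's table still lists 'lockX' as owned by C, which is the stuck state described above. The real failure depends on timing, which is why the attached test needs many runs; the model fixes the ordering to make the outcome repeatable.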
--