[JBoss JIRA] (JGRP-2402) SOS report
by Bela Ban (Jira)
[ https://issues.redhat.com/browse/JGRP-2402?page=com.atlassian.jira.plugin... ]
Bela Ban updated JGRP-2402:
---------------------------
Description:
Add the ability to dump the most important attributes to a file / log (configurable), e.g.:
* Max threads
* Number of rejected messages
* Size of retransmission tables (NAKACK2, UNICAST3)
This should be done periodically (e.g. every 15 minutes). This log should be enabled by default, e.g. in Infinispan.
These files can then be sent to support via an SOS report.
This can be used to diagnose issues by telling the customer to invoke this command and attach the resulting file to a ticket.
was:
Add the ability to dump the most important attributes to a file / log (configurable), e.g.:
* Max threads
* Number of rejected messages
* Size of retransmission tables (NAKACK2, UNICAST3)
This can be used to diagnose issues by telling the customer to invoke this command and attach the resulting file to a ticket
> SOS report
> ----------
>
> Key: JGRP-2402
> URL: https://issues.redhat.com/browse/JGRP-2402
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 4.2.2, 5.0.0.Alpha4
>
>
> Add the ability to dump the most important attributes to a file / log (configurable), e.g.:
> * Max threads
> * Number of rejected messages
> * Size of retransmission tables (NAKACK2, UNICAST3)
> This should be done periodically (e.g. every 15 minutes). This log should be enabled by default, e.g. in Infinispan.
> These files can then be sent to support via an SOS report.
> This can be used to diagnose issues by telling the customer to invoke this command and attach the resulting file to a ticket.
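
A minimal sketch of the periodic dump described above, assuming a hypothetical {{AttributeDumper}} helper that is not part of JGroups. The attribute names and the suppliers that would read them from the thread pool, NAKACK2 or UNICAST3 are placeholders; a real implementation would presumably go through the configured logger rather than write the file directly.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper (not a JGroups class): appends selected attributes to a file periodically. */
class AttributeDumper implements AutoCloseable {
    private final Map<String, Supplier<Object>> attributes = new LinkedHashMap<>();
    private final Path file;
    private final ScheduledExecutorService timer =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "attribute-dumper");
            t.setDaemon(true); // never keeps the JVM alive
            return t;
        });

    AttributeDumper(Path file) {
        this.file = file;
    }

    /** Registers an attribute to dump; call this before start(). */
    AttributeDumper register(String name, Supplier<Object> value) {
        attributes.put(name, value);
        return this;
    }

    /** Formats one snapshot as a single line: timestamp followed by name=value pairs. */
    String snapshot() {
        StringBuilder sb = new StringBuilder(Instant.now().toString());
        attributes.forEach((name, value) ->
            sb.append(' ').append(name).append('=').append(value.get()));
        return sb.append(System.lineSeparator()).toString();
    }

    /** Appends one snapshot to the file; failures are reported but never propagated. */
    void dumpOnce() {
        try {
            Files.write(file, snapshot().getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException | RuntimeException e) {
            // Swallow the error: an exception escaping the task would cancel the schedule.
            System.err.println("attribute dump failed: " + e);
        }
    }

    /** Dumps every interval, e.g. start(15, TimeUnit.MINUTES). */
    void start(long interval, TimeUnit unit) {
        timer.scheduleAtFixedRate(this::dumpOnce, interval, interval, unit);
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

A caller would register one supplier per attribute and then call {{start(15, TimeUnit.MINUTES)}}; {{dumpOnce()}} could also back an on-demand command for customers.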
--
This message was sent by Atlassian Jira
(v7.13.8#713008)
6 years, 1 month
[JBoss JIRA] (JGRP-2403) Dump information in panic scenarios
by Bela Ban (Jira)
[ https://issues.redhat.com/browse/JGRP-2403?page=com.atlassian.jira.plugin... ]
Bela Ban updated JGRP-2403:
---------------------------
Description:
When there is a panic situation (e.g. thread pool exhausted), dump information (e.g. including a thread dump) to a file at FATAL level.
was:
When there is a panic situation (e.g. thread pool exhausted), dump information (e.g. including a thread dump) to a file at FATAL level.
Also dump the most important information to another log (file) periodically (e.g. every 15 minutes). This log should be enabled by default, e.g. in Infinispan.
These files can be sent to support via an SOS report.
> Dump information in panic scenarios
> -----------------------------------
>
> Key: JGRP-2403
> URL: https://issues.redhat.com/browse/JGRP-2403
> Project: JGroups
> Issue Type: Feature Request
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 4.2.2, 5.0.0.Alpha4
>
>
> When there is a panic situation (e.g. thread pool exhausted), dump information (e.g. including a thread dump) to a file at FATAL level.
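
One way this could look, as a sketch only: a hypothetical {{PanicDump}} class (not a JGroups API) that builds a report containing a thread dump, plus a {{RejectedExecutionHandler}} that writes the report the first time the pool rejects a task. The "FATAL" prefix stands in for a real logger call at FATAL level, and dumping only once is an assumption made here to avoid flooding the disk.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper (not a JGroups class): writes a thread dump when a panic is detected. */
class PanicDump {

    /** Renders the stack traces of all live threads as text. */
    static String threadDump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue())
                sb.append("    at ").append(frame).append('\n');
        }
        return sb.toString();
    }

    /** Builds the full report: a FATAL header line followed by the thread dump. */
    static String report(String reason) {
        return "FATAL " + Instant.now() + " " + reason + "\n" + threadDump();
    }

    /** Returns a handler that dumps to the file on the first rejection, then rejects the task. */
    static RejectedExecutionHandler dumpingHandler(Path file) {
        AtomicBoolean dumped = new AtomicBoolean();
        return (task, pool) -> {
            if (dumped.compareAndSet(false, true)) { // dump only once per handler
                try {
                    byte[] bytes = report("thread pool exhausted: " + pool)
                        .getBytes(StandardCharsets.UTF_8);
                    Files.write(file, bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
                } catch (IOException e) {
                    System.err.println("panic dump failed: " + e);
                }
            }
            throw new RejectedExecutionException("task rejected by " + pool);
        };
    }
}
```

The handler would be passed to a {{ThreadPoolExecutor}} constructor or set with {{setRejectedExecutionHandler()}}.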
6 years, 1 month
[JBoss JIRA] (WFLY-11442) Remove unused dependencies from org.jboss.as.ejb3
by Ranabir Chakraborty (Jira)
[ https://issues.redhat.com/browse/WFLY-11442?page=com.atlassian.jira.plugi... ]
Ranabir Chakraborty reassigned WFLY-11442:
------------------------------------------
Assignee: Ranabir Chakraborty (was: Yeray Borges Santana)
> Remove unused dependencies from org.jboss.as.ejb3
> -------------------------------------------------
>
> Key: WFLY-11442
> URL: https://issues.redhat.com/browse/WFLY-11442
> Project: WildFly
> Issue Type: Bug
> Components: Server
> Reporter: Yeray Borges Santana
> Assignee: Ranabir Chakraborty
> Priority: Major
>
> Initial analysis, checking only first-level dependencies of the resource exposed by {{org.jboss.as.ejb3}}, shows that these dependencies are unused:
> * org.jboss.jts
> * org.wildfly.security.elytron-web.undertow-server
> * org.jboss.as.weld
> * org.wildfly.clustering.marshalling.spi
> * org.wildfly.clustering.marshalling.api
> * org.wildfly.client.config
> * org.hibernate
> The task here is to verify that they are not used by any other mechanism besides being a first-level dependency.
6 years, 1 month
[JBoss JIRA] (JGRP-2463) TransferQueueBundler: Message to stopped node blocks the bundler thread
by Bela Ban (Jira)
[ https://issues.redhat.com/browse/JGRP-2463?page=com.atlassian.jira.plugin... ]
Bela Ban commented on JGRP-2463:
--------------------------------
{quote}
I now have another theory: each TransferQueueBundler.run() iteration drains the entire contents of the queue into remove_queue, then tries to send the messages one by one. If there's an exception (e.g. java.net.ConnectException) sending any of those messages, it's only caught at the end of the iteration, and the next iteration drops all the unsent messages with remove_queue.clear().
{quote}
No, that's not true: {{sendBundledMessages()}} will never throw an exception, as {{sendSingleMessage()}} and {{sendMessageList()}} catch (and log) all exceptions.
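
To illustrate the point, here is a simplified sketch of the pattern, not the actual JGroups code: the class and method names are made up, and the transport is reduced to a {{Consumer}}. Because each send is wrapped in its own try/catch, a failure for one message is logged and the remaining drained messages are still sent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Illustrative only: drains a queue and sends every message, isolating failures per message. */
class DrainAndSend {

    /** Drains everything currently queued, sends each message, and returns the number of failed sends. */
    static <M> int drainAndSend(BlockingQueue<M> queue, Consumer<M> transport) {
        List<M> batch = new ArrayList<>();
        queue.drainTo(batch);
        int failures = 0;
        for (M msg : batch) {
            try {
                transport.accept(msg);
            } catch (Exception e) {
                // Caught per message, so the loop continues with the next one.
                failures++;
                System.err.println("failed sending " + msg + ": " + e);
            }
        }
        return failures;
    }
}
```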
> TransferQueueBundler: Message to stopped node blocks the bundler thread
> -----------------------------------------------------------------------
>
> Key: JGRP-2463
> URL: https://issues.redhat.com/browse/JGRP-2463
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.2.1
> Reporter: Dan Berindei
> Assignee: Bela Ban
> Priority: Major
> Fix For: 4.2.2, 5.0.0.Alpha4
>
>
> {{TransferQueueBundler}} sends all the messages from a single thread. When one of the {{TP.doSend()}} calls blocks, the bundler thread no longer makes any progress, and it doesn't send messages to any destination, even if {{TP.doSend()}} is only slow for one particular destination.
> One example is when sending a message to a stopped node, e.g. the coordinator sending a {{LEAVE_RSP}} after the leaver has already stopped. The bundler thread calls {{TP.doSend()}}, the connection no longer exists, so it ends up calling {{BaseServer.createConnection()}}. If the stopped node's machine is no longer up or it is configured to drop messages to closed ports, opening the connection blocks the bundler thread for {{TCP.sock_conn_timeout}} (default: 2s).
> {{UNICAST3}} also retransmits the highest sent message every {{UNICAST3.xmit_interval}} (default: 500ms), for {{UNICAST3.max_retransmit_time}} (default: 1 min), so the bundler thread will block more than once for the same message.
> I assume the bundler thread will also block if the transport is {{TCP}}, one of the destinations is overloaded, and the TCP connection's send buffer is full. Normally applications try to spread the workload evenly among members, but e.g. with RELAY2 not all the members will be site masters.
6 years, 1 month
[JBoss JIRA] (JGRP-2463) TransferQueueBundler: Message to stopped node blocks the bundler thread
by Dan Berindei (Jira)
[ https://issues.redhat.com/browse/JGRP-2463?page=com.atlassian.jira.plugin... ]
Dan Berindei commented on JGRP-2463:
------------------------------------
> The log snippet below shows that the connection attempt to the left member takes 4ms, so this should not be an issue:
Oops, I should have checked the logs! I was convinced there was no ICMP error message in their test because they're killing the container, not just the server process.
I now have another theory: each {{TransferQueueBundler.run()}} iteration drains the entire contents of the queue into {{remove_queue}}, then tries to send the messages one by one. If there's an exception (e.g. {{java.net.ConnectException}}) sending any of those messages, it's only caught at the end of the iteration, and the next iteration drops all the unsent messages with {{remove_queue.clear()}}.
Since {{UNICAST3}} resends the last message to the missing node every {{UNICAST3.xmit_interval}} ms, some messages could be dropped more than once, leading to total latencies much higher than {{UNICAST3.xmit_interval}}.
> Can this be reproduced?
I assume the failure can be reproduced by the Keycloak team, although they haven't added any more comments to KEYCLOAK-13310.
> We could experiment with a bundler that has 1 queue per destination (and 1 associated thread dequeuing), and RED dropping messages before/when the queue gets full. However, this is too complicated a change...
That's what I had in mind, in fact adding a comment to JGRP-2462 was my main motivation to open this JIRA :)
> I think we should use TCP_NIO2 for scenarios in which TCP writes can block. I guess I should move JGRP-2108 up... wdyt?
+100
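
For reference, the per-destination bundler idea quoted above could be sketched roughly as follows. This is a toy model, not a proposal for the actual JGroups bundler API: {{PerDestinationSender}} and its transport callback are hypothetical, and plain tail drop ({{offer()}} on a bounded queue) stands in for RED, which would start dropping probabilistically before the queue is full.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

/** Hypothetical sketch: one bounded queue and one sender thread per destination. */
class PerDestinationSender<D, M> implements AutoCloseable {
    private final Map<D, BlockingQueue<M>> queues = new ConcurrentHashMap<>();
    private final List<Thread> workers = new CopyOnWriteArrayList<>();
    private final BiConsumer<D, M> transport; // may block, e.g. on a TCP connect or write
    private final int capacity;
    private volatile boolean running = true;

    PerDestinationSender(BiConsumer<D, M> transport, int capacity) {
        this.transport = transport;
        this.capacity = capacity;
    }

    /** Queues msg for dest; returns false (message dropped) if that destination's queue is full. */
    boolean send(D dest, M msg) {
        return queues.computeIfAbsent(dest, this::startWorker).offer(msg);
    }

    /** Creates the queue for dest and starts the thread that dequeues from it. */
    private BlockingQueue<M> startWorker(D dest) {
        BlockingQueue<M> queue = new ArrayBlockingQueue<>(capacity);
        Thread worker = new Thread(() -> {
            while (running) {
                try {
                    // A slow or blocked destination only stalls its own thread.
                    transport.accept(dest, queue.take());
                } catch (InterruptedException e) {
                    return;
                } catch (RuntimeException e) {
                    System.err.println("failed sending to " + dest + ": " + e);
                }
            }
        }, "sender-" + dest);
        worker.setDaemon(true);
        worker.start();
        workers.add(worker);
        return queue;
    }

    @Override
    public void close() {
        running = false;
        workers.forEach(Thread::interrupt);
    }
}
```

The obvious cost is one thread per destination, which is part of why this is a larger change than it looks.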
> TransferQueueBundler: Message to stopped node blocks the bundler thread
> -----------------------------------------------------------------------
>
> Key: JGRP-2463
> URL: https://issues.redhat.com/browse/JGRP-2463
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.2.1
> Reporter: Dan Berindei
> Assignee: Bela Ban
> Priority: Major
> Fix For: 4.2.2, 5.0.0.Alpha4
>
>
> {{TransferQueueBundler}} sends all the messages from a single thread. When one of the {{TP.doSend()}} calls blocks, the bundler thread no longer makes any progress, and it doesn't send messages to any destination, even if {{TP.doSend()}} is only slow for one particular destination.
> One example is when sending a message to a stopped node, e.g. the coordinator sending a {{LEAVE_RSP}} after the leaver has already stopped. The bundler thread calls {{TP.doSend()}}, the connection no longer exists, so it ends up calling {{BaseServer.createConnection()}}. If the stopped node's machine is no longer up or it is configured to drop messages to closed ports, opening the connection blocks the bundler thread for {{TCP.sock_conn_timeout}} (default: 2s).
> {{UNICAST3}} also retransmits the highest sent message every {{UNICAST3.xmit_interval}} (default: 500ms), for {{UNICAST3.max_retransmit_time}} (default: 1 min), so the bundler thread will block more than once for the same message.
> I assume the bundler thread will also block if the transport is {{TCP}}, one of the destinations is overloaded, and the TCP connection's send buffer is full. Normally applications try to spread the workload evenly among members, but e.g. with RELAY2 not all the members will be site masters.
6 years, 1 month
[JBoss JIRA] (SWSQE-1107) Update prepare-e2e-tests-env.sh and prepare-ui-tests-env.sh to be less verbose
by Filip Brychta (Jira)
Filip Brychta created SWSQE-1107:
------------------------------------
Summary: Update prepare-e2e-tests-env.sh and prepare-ui-tests-env.sh to be less verbose
Key: SWSQE-1107
URL: https://issues.redhat.com/browse/SWSQE-1107
Project: Kiali QE
Issue Type: QE Task
Reporter: Filip Brychta
Assignee: Filip Brychta
It produces very long output containing a lot of:
Cloning into '/home/jenkins/agent/workspace/run-kiali-e2e-tests/kiali'...
Checking out files: 42% (3633/8645)
Checking out files: 43% (3718/8645)
Checking out files: 44% (3804/8645)
Checking out files: 45% (3891/8645)
Checking out files: 46% (3977/8645)
Checking out files: 47% (4064/8645)
Checking out files: 48% (4150/8645)
From https://github.com/kiali/kiali
* [new ref] refs/pull/1/head -> origin/pr/1/head
* [new ref] refs/pull/1/merge -> origin/pr/1/merge
* [new ref] refs/pull/10/head -> origin/pr/10/head
* [new ref] refs/pull/10/merge -> origin/pr/10/merge
* [new ref] refs/pull/100/head -> origin/pr/100/head
* [new ref] refs/pull/1000/head -> origin/pr/1000/head
* [new ref] refs/pull/1001/head -> origin/pr/1001/head
* [new ref] refs/pull/1002/head -> origin/pr/1002/head
* [new ref] refs/pull/1003/head -> origin/pr/1003/head
* [new ref] refs/pull/1004/head -> origin/pr/1004/head
* [new ref] refs/pull/1005/head -> origin/pr/1005/head
* [new ref] refs/pull/1009/head -> origin/pr/1009/head
* [new ref] refs/pull/101/head -> origin/pr/101/head
* [new ref] refs/pull/1010/head -> origin/pr/1010/head
* [new ref] refs/pull/1011/head -> origin/pr/1011/head
* [new ref] refs/pull/1012/head -> origin/pr/1012/head
* [new ref] refs/pull/1013/head -> origin/pr/1013/head
* [new ref] refs/pull/1014/head -> origin/pr/1014/head
* [new ref] refs/pull/1015/head -> origin/pr/1015/head
* [new ref] refs/pull/1016/head -> origin/pr/1016/head
6 years, 1 month