[JBoss JIRA] Created: (JGRP-1361) NAKACK: use bigger timeouts for big retransmission tasks
by Bela Ban (JIRA)
NAKACK: use bigger timeouts for big retransmission tasks
--------------------------------------------------------
Key: JGRP-1361
URL: https://issues.jboss.org/browse/JGRP-1361
Project: JGroups
Issue Type: Enhancement
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 3.1
Oftentimes we receive messages out of order, e.g. because an OOB message follows a range of regular messages, but since the regular messages are bundled, they may arrive later than the OOB message. Say the regular messages are [10-30] and the OOB message is 31. Receiving #31 before #10-30 triggers the addition of a retransmission task for [10..30] to the retransmitter. The task usually goes off after some initial delay, say 500ms. If we receive [10..30] before the task goes off, it is cancelled. If we receive most of the messages in [10..30], only the still-missing messages are retransmitted when the task fires.
So in most cases, retransmission tasks never fire; they are cancelled beforehand.
However, sometimes we can receive a seqno which is far larger than the current highest_received seqno, e.g. highest_received=15000, seqno=20000. This means we now add a retransmission request for [15000-20000]. It is likely that the seqno was just received out of order, but it may trigger the (unneeded) retransmission of 5000 messages!
The suggested solution is therefore to increase the initial delay for large retransmission tasks, such that they execute a bit later. Of course, the underlying assumption is that most of the missing messages will arrive before the timeout goes off. If the 5000 messages are really lost, e.g. dropped by the IP stack or a switch, they will need to be retransmitted.
If we have an exponential_backoff of 500, the initial delay is 500ms. We could define a 'large' retransmission task as any task which asks for retransmission of more than 10% of the current retransmission table's size. From the delta and the current delay we could then compute an offset which is added to the initial delay (only the first time).
Alternatively, we could add this delta not just the first time a retransmission task is scheduled, but to every scheduling of the task, as long as the task is large.
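The scheme above can be sketched in Java. This is a minimal illustration, not the actual JGroups code; the class and parameter names (XmitDelayCalculator, largeThreshold) are hypothetical, and the exact formula for the offset is one plausible choice among several:

```java
// Hypothetical sketch: scale the initial retransmission delay by the size of
// the requested range relative to the current retransmission table's size.
public class XmitDelayCalculator {
    private final long initialDelay;      // e.g. 500 ms (first exponential_backoff stage)
    private final double largeThreshold;  // fraction of table size above which a task is 'large', e.g. 0.1

    public XmitDelayCalculator(long initialDelay, double largeThreshold) {
        this.initialDelay = initialDelay;
        this.largeThreshold = largeThreshold;
    }

    /** Delay for the first scheduling of a task covering rangeSize seqnos. */
    public long firstDelay(int rangeSize, int tableSize) {
        if (tableSize == 0 || rangeSize <= largeThreshold * tableSize)
            return initialDelay; // small task: normal initial delay
        // Large task: add an offset proportional to how far the requested
        // range exceeds the 'large' threshold, so the task fires later.
        double delta = rangeSize / (largeThreshold * tableSize); // > 1 for large tasks
        return (long) (initialDelay * (1 + delta));
    }
}
```

With these numbers, a request for 5000 seqnos against a 15000-entry table would get roughly a 4x initial delay, giving the out-of-order messages more time to arrive before any retransmission is requested.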
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[JBoss JIRA] Created: (JGRP-1362) NAKACK: second line of defense for requested retransmissions that are not found
by Bela Ban (JIRA)
NAKACK: second line of defense for requested retransmissions that are not found
-------------------------------------------------------------------------------
Key: JGRP-1362
URL: https://issues.jboss.org/browse/JGRP-1362
Project: JGroups
Issue Type: Enhancement
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 2.12.2, 3.1
When the original sender B is asked by A to retransmit message M, but doesn't have M in its retransmission table anymore, it should tell A; otherwise A will keep sending retransmission requests to B until A or B leaves.
This problem should have been fixed by JGRP-1251, but if it turns out it wasn't, then this JIRA is (1) a second line of defense to stop the endless retransmission requests and (2) will give us valuable diagnostic information to fix the underlying problem (should there still be one).
Problem:
- A has a NakReceiverWindow (NRW) of 50 (highest_delivered seqno) for B
- B's NRW, however, is 200. B garbage collected messages up to 150.
- When B sends message 201, A will ask B for retransmission of [51-200]
- B will retransmit messages [150-200], but it cannot send messages 51-149, as it doesn't have them anymore!
- A will add messages [150-200], but its NRW is still 50 (highest_delivered)
- A will continue asking B for messages [51-149] (it does have [150-201])
- This will go on forever, or until B or A leaves
SOLUTION:
- When the *original sender* B of message M receives a retransmission request for M (from A), and it doesn't have M in its retransmission table, it should send back a MSG_NOT_FOUND message to A including B's digest
- When A receives the MSG_NOT_FOUND message, it does the following:
- It logs its own NRW for B
- It logs B's digest
- It logs its digest history
(This information is valuable for investigating the underlying issue)
- Then A's NRW for B is adjusted:
- The highest_delivered seqno is set to B.digest.highest_delivered
- All messages in xmit_table below B.digest.highest_delivered are removed
- All retransmission tasks in the retransmitter <= B.digest.highest_delivered are cancelled and removed
(This will stop the retransmission)
Again, this is a second line of defense, which should never be used. If the underlying problem does occur, however, we'll have valuable information in the logs to diagnose what went wrong.
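The requester-side adjustment described above can be sketched as follows. All names here are hypothetical (this is not the actual NakReceiverWindow code), the logging of digests is omitted, and cancelling the retransmission tasks is left out because it depends on the Retransmitter implementation:

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of A's window for B, adjusted on MSG_NOT_FOUND.
public class NakReceiverWindowSketch {
    private final ConcurrentSkipListMap<Long,Object> xmitTable = new ConcurrentSkipListMap<>();
    private long highestDelivered; // highest seqno delivered to the application

    public synchronized void add(long seqno, Object msg) {
        xmitTable.put(seqno, msg);
    }

    /** Called when the original sender replied MSG_NOT_FOUND with its digest. */
    public synchronized void handleMsgNotFound(long senderHighestDelivered) {
        if (senderHighestDelivered <= highestDelivered)
            return; // nothing to adjust
        // 1. set highest_delivered to the sender's value
        highestDelivered = senderHighestDelivered;
        // 2. remove all messages in xmit_table below that seqno
        //    (headMap() is exclusive, i.e. keys strictly below the bound)
        xmitTable.headMap(senderHighestDelivered).clear();
        // 3. cancelling retransmission tasks <= senderHighestDelivered is
        //    omitted here; it depends on the Retransmitter implementation
    }

    public synchronized long highestDelivered() { return highestDelivered; }
    public synchronized int size()             { return xmitTable.size(); }
}
```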
[JBoss JIRA] (JGRP-1396) Merge NakReceiverWindow and Retransmitter
by Bela Ban (Created) (JIRA)
Merge NakReceiverWindow and Retransmitter
-----------------------------------------
Key: JGRP-1396
URL: https://issues.jboss.org/browse/JGRP-1396
Project: JGroups
Issue Type: Enhancement
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 3.2
Both NakReceiverWindow and Retransmitter use their own data structures to keep a list of received messages (NRW) and of seqnos to be retransmitted (Retransmitter). This is redundant and costly memory-wise.
I suggest we merge the two classes, or at least let them share the data structure which keeps track of received messages.
Suggestion II: create a ring buffer with a (changeable) capacity that keeps track of received messages and messages to be retransmitted.
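Suggestion II might look roughly like the following. This is a hypothetical, simplified sketch (not the actual implementation, fixed capacity instead of a changeable one, no locking): one seqno-indexed array serves both as the receive window and as the source of missing seqnos for retransmission, so no second data structure is needed:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a seqno-indexed ring buffer combining the roles of
// NakReceiverWindow (store received messages) and Retransmitter (find gaps).
public class SeqnoRingBuffer<T> {
    private final Object[] buf;
    private final long low; // lowest seqno in the window; slot = seqno % capacity

    public SeqnoRingBuffer(int capacity, long low) {
        this.buf = new Object[capacity];
        this.low = low;
    }

    /** Stores msg at seqno; returns false if seqno is outside the window. */
    public boolean add(long seqno, T msg) {
        if (seqno < low || seqno >= low + buf.length)
            return false;
        buf[(int) (seqno % buf.length)] = msg;
        return true;
    }

    @SuppressWarnings("unchecked")
    public T get(long seqno) {
        if (seqno < low || seqno >= low + buf.length)
            return null;
        return (T) buf[(int) (seqno % buf.length)];
    }

    /** Seqnos in [from..to] with no message yet: these need retransmission. */
    public List<Long> missing(long from, long to) {
        List<Long> result = new ArrayList<>();
        for (long s = from; s <= to; s++)
            if (get(s) == null)
                result.add(s);
        return result;
    }
}
```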
[JBoss JIRA] (JGRP-1402) NAKACK: too much lock contention between sending and receiving messages
by Bela Ban (Created) (JIRA)
NAKACK: too much lock contention between sending and receiving messages
-----------------------------------------------------------------------
Key: JGRP-1402
URL: https://issues.jboss.org/browse/JGRP-1402
Project: JGroups
Issue Type: Enhancement
Reporter: Bela Ban
Assignee: Bela Ban
Fix For: 3.2
When we have only 1 node in a cluster, sending and receiving messages creates a lot of contention in NakReceiverWindow (NRW). To reproduce:
- Start MPerf
- Press '1' to send 1 million messages
- The throughput is ca. 20-30 MB/sec, compared to ca. 140 MB/sec when running multiple instances of MPerf on the same box!
In the profiler, we can see that the write lock in NRW accounts for ca. 99% of all blocking! Roughly half is caused by NRW.add(), the other half by NRW.removeMany().
The reason is that, when we send a message, it is added to the NRW (add()). The incoming thread then tries to remove as many messages as possible (removeMany()) and thereby blocks the sender's add() calls; vice versa, removeMany() is blocked from accessing the NRW by the many add()s.
SOLUTION 1:
- If we only have 1 member in the cluster, call removeMany() immediately after NRW.add() on the sender. There is no need for a message to be processed by the incoming thread pool if we're the only member in the cluster
- The downside is that we don't reduce the contention on NRW if we have more than 1 member: the lock contention may even slow down clusters with more than 1 member!
SOLUTION 2:
- Make NRW.add() and remove() more efficient, and contend less on the same lock.
- [1] should help.
[1] https://issues.jboss.org/browse/JGRP-1396
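SOLUTION 1 can be sketched as below. This is a hypothetical, toy illustration (not the actual NAKACK code): the window is a plain sorted map standing in for the NRW, and with a single member the sender drains the window itself right after add(), so no incoming thread ever contends for the lock:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of the single-member fast path (SOLUTION 1).
public class SingleMemberFastPath {
    private final ConcurrentSkipListMap<Long,String> window = new ConcurrentSkipListMap<>();
    private final List<String> delivered = new ArrayList<>();
    private final int members;
    private long highestDelivered;
    private long seqno;

    public SingleMemberFastPath(int members) { this.members = members; }

    public synchronized void send(String msg) {
        window.put(++seqno, msg); // NRW.add()
        if (members == 1)
            removeMany(); // fast path: no contention with an incoming thread
        // else: the message goes on the wire and the incoming thread calls
        // removeMany() on receipt (omitted here)
    }

    /** Removes and delivers as many consecutive messages as possible. */
    private void removeMany() {
        String m;
        while ((m = window.remove(highestDelivered + 1)) != null) {
            highestDelivered++;
            delivered.add(m);
        }
    }

    public synchronized List<String> delivered() { return delivered; }
}
```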
[JBoss JIRA] (AS7-3410) CLONE - Classloading issue with HornetQLoggerFactory - problem with failback
by Miroslav Novak (JIRA)
Miroslav Novak created AS7-3410:
-----------------------------------
Summary: CLONE - Classloading issue with HornetQLoggerFactory - problem with failback
Key: AS7-3410
URL: https://issues.jboss.org/browse/AS7-3410
Project: Application Server 7
Issue Type: Bug
Components: JMS
Affects Versions: 7.1.0.CR1
Reporter: Miroslav Novak
Assignee: Andy Taylor
Priority: Critical
Fix For: 7.1.0.Final
Attachments: console-log-backup-server.txt, console-log-live-server.txt, reproducer.zip
Test scenario:
1. Start two AS7/EAP6 servers - a live server and its backup in dedicated topology - each on a different machine
2. Kill live server using "kill -9 ..."
3. Start live server again
In step 3 there are unexpected messages in the console logs of the live and backup servers.
From the backup server:
{code}
12:07:26,165 INFO [org.hornetq.core.server.impl.HornetQServerImpl] (Thread-78) HornetQ Server version 2.2.7.Final (HQ_2_2_7_FINAL_AS7, 121) [17700d86-45b2-11e1-a575-d48564b8e1e7] stopped
12:07:26,165 INFO [org.hornetq.core.server.impl.HornetQServerImpl] (Thread-78) unable to restart server, please kill and restart manually: java.lang.IllegalArgumentException: Could not find class org.jboss.as.messaging.HornetQLoggerFactory
at org.hornetq.utils.ClassloadingUtil$1.run(ClassloadingUtil.java:42) [hornetq-core-2.2.7.Final.jar:]
at java.security.AccessController.doPrivileged(Native Method) [:1.6.0_22]
at org.hornetq.utils.ClassloadingUtil.safeInitNewInstance(ClassloadingUtil.java:16) [hornetq-core-2.2.7.Final.jar:]
at org.hornetq.core.server.impl.HornetQServerImpl.instantiateInstance(HornetQServerImpl.java:1868) [hornetq-core-2.2.7.Final.jar:]
at org.hornetq.core.server.impl.HornetQServerImpl.initialiseLogging(HornetQServerImpl.java:1301) [hornetq-core-2.2.7.Final.jar:]
at org.hornetq.core.server.impl.HornetQServerImpl.start(HornetQServerImpl.java:541) [hornetq-core-2.2.7.Final.jar:]
at org.hornetq.core.server.impl.HornetQServerImpl$SharedStoreBackupActivation$1FailbackChecker$1.run(HornetQServerImpl.java:430) [hornetq-core-2.2.7.Final.jar:]
at java.lang.Thread.run(Thread.java:679) [:1.6.0_22]
{code}
From the live server:
{code}
12:07:51,993 INFO [org.jboss.as.messaging] (MSC service thread 1-3) JBAS011601: Bound messaging object to jndi name java:/topic/test
12:07:52,007 INFO [org.jboss.as] (Controller Boot Thread) JBoss EAP 6.0.0.Alpha2 (AS 7.1.0.CR1-redhat-1) started in 31092ms - Started 155 of 263 services (103 services are passive or on-demand)
12:07:53,292 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Thread-1 (group:HornetQ-client-global-threads-1954829789)) Trying reconnection attempt 1
12:07:53,292 DEBUG [org.hornetq.core.remoting.impl.netty.NettyConnector] (Thread-1 (group:HornetQ-client-global-threads-1954829789)) Started Netty Connector version 3.2.3.Final-r${buildNumber}
12:07:53,292 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Thread-1 (group:HornetQ-client-global-threads-1954829789)) Trying to connect at the main server using connector :org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5446&host=192-168-10-4
12:07:53,293 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Thread-1 (group:HornetQ-client-global-threads-1954829789)) Main server is not up. Hopefully there's a backup configured now!
12:07:55,293 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Thread-1 (group:HornetQ-client-global-threads-1954829789)) Trying reconnection attempt 2
12:07:55,293 DEBUG [org.hornetq.core.remoting.impl.netty.NettyConnector] (Thread-1 (group:HornetQ-client-global-threads-1954829789)) Started Netty Connector version 3.2.3.Final-r${buildNumber}
12:07:55,293 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Thread-1 (group:HornetQ-client-global-threads-1954829789)) Trying to connect at the main server using connector :org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5446&host=192-168-10-4
12:07:55,294 DEBUG [org.hornetq.core.client.impl.ClientSessionFactoryImpl] (Thread-1 (group:HornetQ-client-global-threads-1954829789)) Main server is not up. Hopefully there's a backup
{code}
I'm not entirely sure, but it looks like the backup server did not manage to reach the "waiting for live to fail" state, and the live server is not able to detect the backup.
DEBUG-level logging is set for "org.hornetq" in the attached logs.
I've attached reproducer.zip - steps to use:
1. Download and unzip "reproducer.zip"
2. Prepare live and backup server - "sh prepare.sh"
3. Start live - "sh start-server1.sh server1_hostname"
4. Start backup - "sh start-server2.sh server2_hostname"
5. Kill live server using "kill -9 server1_process_id"
6. Start live server again - "sh start-server1.sh server1_hostname"
Note:
reproducer.zip contains the configuration files standalone-ha-A.xml and standalone-ha-B.xml (A for the live server, B for the backup).
[JBoss JIRA] (AS7-3368) jboss-admin tool doesn't return non zero code when command fails to execute
by Rostislav Svoboda (JIRA)
Rostislav Svoboda created AS7-3368:
--------------------------------------
Summary: jboss-admin tool doesn't return non zero code when command fails to execute
Key: AS7-3368
URL: https://issues.jboss.org/browse/AS7-3368
Project: Application Server 7
Issue Type: Feature Request
Components: Domain Management, Scripts
Affects Versions: 7.1.0.CR1b
Reporter: Rostislav Svoboda
Assignee: Brian Stansberry
Priority: Critical
Fix For: 7.1.0.Final
The jboss-admin tool doesn't return a non-zero exit code when a command fails to execute.
We need to know whether a command was executed properly or not; parsing the output is unacceptable.
{code}
[rsvoboda@rosta-ntb ~]$ TESTING/jboss-as7/bin/jboss-admin.sh --connect command=:shutdown
The controller is not available at localhost:9999
You are disconnected at the moment. Type 'connect' to connect to the server or 'help' for the list of supported commands.
[rsvoboda@rosta-ntb ~]$ echo $?
0
{code}
[JBoss JIRA] (AS7-3400) Description for replace-deployment shows deploy
by Erhard Siegl (JIRA)
Erhard Siegl created AS7-3400:
---------------------------------
Summary: Description for replace-deployment shows deploy
Key: AS7-3400
URL: https://issues.jboss.org/browse/AS7-3400
Project: Application Server 7
Issue Type: Bug
Components: ConfigAdmin
Affects Versions: 7.1.0.CR1b
Reporter: Erhard Siegl
Assignee: Thomas Diesler
In jboss-admin:
[domain@localhost:9999 server-group=other-server-group] :read-operation-description(name="replace-deployment")
{
"outcome" => "success",
"result" => {
"operation-name" => "deploy",
"description" => "Deploy the specified deployment content into the runtime, optionally replacing existing content.",
"reply-properties" => {},
"read-only" => false
}
}
This is the description for the "deploy" operation, not for "replace-deployment".