Jeff Mesnil updated WFLY-8632:
------------------------------
Summary: Add (was: Artemis doesn't handle JDBC network problems)
Add
---
Key: WFLY-8632
URL: https://issues.jboss.org/browse/WFLY-8632
Project: WildFly
Issue Type: Bug
Components: JMS
Reporter: Jeff Mesnil
Assignee: Martyn Taylor
Priority: Critical
Labels: KK-DR17, eap7.1-rfe-failure
If the network goes down between Artemis and the DB, Artemis should behave the same way
as it does when journal storage is used and the underlying network file system is
disconnected: it should throw a critical IO error and stop itself.
Currently, if the network is down, JDBC calls hang until the OS TCP timeout expires
(typically 10 minutes). This contradicts the fail-fast pattern.
This behavior can be changed by setting the networkTimeout \[1\] property to a non-zero
value. I think this timeout should be configurable, and its default value should be less
than 30 seconds, which is the default timeout for the client's blocking operations.
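As a rough illustration of the proposal, the sketch below applies a network timeout to a JDBC connection through the standard java.sql.Connection.setNetworkTimeout(Executor, int) call. The class name, method name, and the 20-second default are hypothetical and chosen only to stay below the 30-second client timeout mentioned above; this is not how Artemis or WildFly actually wires the setting.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.Executor;

// Illustrative helper only; the class, method, and constant names are hypothetical.
final class JdbcNetworkTimeout {

    // Assumed default: 20 s, picked only because it is below the 30 s
    // timeout of the client's blocking operations.
    static final int DEFAULT_TIMEOUT_MILLIS = 20_000;

    private JdbcNetworkTimeout() {
    }

    // Applies a network timeout (in milliseconds) to the given connection.
    // The executor is handed to the JDBC driver, which uses it for its own
    // timeout handling. A value of 0 means "no limit", which is the
    // behavior that lets calls hang until the OS TCP timeout expires.
    static void apply(Connection connection, Executor executor, int timeoutMillis)
            throws SQLException {
        if (timeoutMillis < 0) {
            throw new IllegalArgumentException("timeoutMillis must be >= 0");
        }
        connection.setNetworkTimeout(executor, timeoutMillis);
    }
}
```

A caller would typically invoke JdbcNetworkTimeout.apply(connection, executor, JdbcNetworkTimeout.DEFAULT_TIMEOUT_MILLIS) right after obtaining the connection, with the value read from configuration rather than hard-coded.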
If the JDBC connection is closed for any reason (expiration of the TCP timeout or of
networkTimeout), Artemis should throw a critical IO error and stop itself.
Currently, even if the JDBC connection is closed, Artemis keeps trying to execute DB
operations on it, which causes exceptions to be thrown. Artemis cannot recover from this
state and must be restarted.
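The requested fail-fast behavior could look roughly like the sketch below: before using the connection, check it with the standard Connection.isClosed() and Connection.isValid(int) calls, and on failure fire a critical-error callback exactly once instead of continuing to issue DB operations. The JdbcConnectionGuard class and its onCriticalError callback are hypothetical names for illustration and do not reflect the actual Artemis implementation.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative guard only; the class and callback names are hypothetical.
final class JdbcConnectionGuard {

    private final Connection connection;
    // Hypothetical hook standing in for "throw a critical IO error and stop".
    private final Runnable onCriticalError;
    private final AtomicBoolean failed = new AtomicBoolean(false);

    JdbcConnectionGuard(Connection connection, Runnable onCriticalError) {
        this.connection = connection;
        this.onCriticalError = onCriticalError;
    }

    // Returns true if the connection is usable. On the first detected
    // failure the critical-error callback runs once; every later call
    // returns false without touching the connection again.
    boolean ensureUsable(int validationTimeoutSeconds) {
        if (failed.get()) {
            return false;
        }
        boolean usable;
        try {
            usable = !connection.isClosed()
                    && connection.isValid(validationTimeoutSeconds);
        } catch (SQLException e) {
            // Treat any error during validation as an unusable connection.
            usable = false;
        }
        if (!usable && failed.compareAndSet(false, true)) {
            onCriticalError.run();
        }
        return usable;
    }
}
```

The AtomicBoolean keeps the shutdown callback from running more than once when several threads notice the broken connection at the same time.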
*Customer impact:* If the network goes down between Artemis and the DB, there is no error
in the server log for 10 minutes. During this time, clients are blocked without any
explanatory exception. This contradicts the fail-fast pattern and makes it difficult to
find out what is wrong. Once the JDBC connection is closed after 10 minutes, clients are
still connected to Artemis, but every operation fails with an exception. Since their
connections are still active, they do not reconnect to another Artemis instance.
\[1\] https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#setNet...