[JBoss JIRA] (JGRP-2075) SYM/ASYM_ENCRYPT: don't use WeakHashMap for old ciphers
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-2075?page=com.atlassian.jira.plugin.... ]
Bela Ban updated JGRP-2075:
---------------------------
Description:
Currently we use WeakHashMap, but should not, reasons outlined below. We could replace it with a LazyRemovalCache.
Andrew Haley's email:
{quote}
TL/DR: Please don't use WeakReferences, SoftReferences, etc. to cache
any data which might point to native memory. In particular, never do
this with instances of java.security.Key. Instead, implement either
some kind of ageing strategy or a fixed-size cache.
...
This is a warning to anybody who might cache crypto keys.
A customer has been having problems with the exhaustion of native
memory before the Java heap is full. It was fun trying to track down
the cause, but it's now happened several times to several customers,
and it's a serious problem for real-world usage in app servers.
PKCS#11 is a standard way to communicate between applications and
crypto libraries. There is a Java crypto provider which supports
PKCS#11. Some of our customers must use this provider in order to get
FIPS certification.
The problem is this:
A crypto key is a buffer in memory, allocated by the PKCS#11 native
library. It's accessed via a handle which is stored as an integer
field in a Java object. This Java object is a PhantomReference, so
when the garbage collector detects that a crypto key is no longer
reachable it is closed and the associated native memory is freed.
Modern garbage collectors don't much bother to process objects in the
old generation because it's not usually worthwhile. Thus, crypto keys
don't get recycled very quickly. They can pile up in the old
generation. This isn't a problem for the Java heap because the
objects containing the references to crypto keys are very small.
Unfortunately, the native side of a crypto key is much bigger, maybe
up to a thousand times bigger. So if we have 4000 stale crypto keys
in the heap that's not a problem, a few kbytes. But the native memory
may be a megabyte.
This problem is made even worse by Tomcat because it uses
SoftReferences to cache crypto keys. SoftReferences are processed
lazily, and maybe not at all until the Java heap runs out of memory.
Unfortunately it doesn't, but the machine runs out of native memory
instead.
We could solve this simply by making instances of PKCS#11 keys really
big Java objects by padding with dummy fields. Then, the GC would
collect them quickly. This does work but it seriously impacts
performance. Also, we could tweak the garbage collectors to clear out
stale references more enthusiastically, but this impacts performance
even more. There are some controls with the G1 collector which
process SoftReferences more aggressively and these help, but again at
the cost of performance.
Finally: the Shenandoah collector we're working on handles this
problem much better than the older collectors, but it's some
way off.
{quote}
was:
Currently we use WeakHashMap, but should not, reasons outlined below. We could replace it with a LazyRemovalCache.
Andrew Haley's email:
{quote}
TL/DR: Please don't use WeakReferences, SoftReferences, etc. to cache
any data which might point to native memory. In particular, never do
this with instances of java.security.Key. Instead, implement either
some kind of ageing strategy or a fixed-size cache.
{quote}
> SYM/ASYM_ENCRYPT: don't use WeakHashMap for old ciphers
> -------------------------------------------------------
>
> Key: JGRP-2075
> URL: https://issues.jboss.org/browse/JGRP-2075
> Project: JGroups
> Issue Type: Task
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 3.6.10, 4.0
>
>
--
This message was sent by Atlassian JIRA
(v6.4.11#64026)
[JBoss JIRA] (JGRP-2075) SYM/ASYM_ENCRYPT: don't use WeakHashMap for old ciphers
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-2075?page=com.atlassian.jira.plugin.... ]
Bela Ban updated JGRP-2075:
---------------------------
Description:
Currently we use WeakHashMap, but should not, reasons outlined below. We could replace it with a LazyRemovalCache. Andrew's email refers to SecretKeys but this probably also applies to Ciphers.
Andrew Haley's email:
{quote}
TL/DR: Please don't use WeakReferences, SoftReferences, etc. to cache
any data which might point to native memory. In particular, never do
this with instances of java.security.Key. Instead, implement either
some kind of ageing strategy or a fixed-size cache.
...
This is a warning to anybody who might cache crypto keys.
A customer has been having problems with the exhaustion of native
memory before the Java heap is full. It was fun trying to track down
the cause, but it's now happened several times to several customers,
and it's a serious problem for real-world usage in app servers.
PKCS#11 is a standard way to communicate between applications and
crypto libraries. There is a Java crypto provider which supports
PKCS#11. Some of our customers must use this provider in order to get
FIPS certification.
The problem is this:
A crypto key is a buffer in memory, allocated by the PKCS#11 native
library. It's accessed via a handle which is stored as an integer
field in a Java object. This Java object is a PhantomReference, so
when the garbage collector detects that a crypto key is no longer
reachable it is closed and the associated native memory is freed.
Modern garbage collectors don't much bother to process objects in the
old generation because it's not usually worthwhile. Thus, crypto keys
don't get recycled very quickly. They can pile up in the old
generation. This isn't a problem for the Java heap because the
objects containing the references to crypto keys are very small.
Unfortunately, the native side of a crypto key is much bigger, maybe
up to a thousand times bigger. So if we have 4000 stale crypto keys
in the heap that's not a problem, a few kbytes. But the native memory
may be a megabyte.
This problem is made even worse by Tomcat because it uses
SoftReferences to cache crypto keys. SoftReferences are processed
lazily, and maybe not at all until the Java heap runs out of memory.
Unfortunately it doesn't, but the machine runs out of native memory
instead.
We could solve this simply by making instances of PKCS#11 keys really
big Java objects by padding with dummy fields. Then, the GC would
collect them quickly. This does work but it seriously impacts
performance. Also, we could tweak the garbage collectors to clear out
stale references more enthusiastically, but this impacts performance
even more. There are some controls with the G1 collector which
process SoftReferences more aggressively and these help, but again at
the cost of performance.
Finally: the Shenandoah collector we're working on handles this
problem much better than the older collectors, but it's some
way off.
{quote}
was:
Currently we use WeakHashMap, but should not, reasons outlined below. We could replace it with a LazyRemovalCache.
Andrew Haley's email:
{quote}
TL/DR: Please don't use WeakReferences, SoftReferences, etc. to cache
any data which might point to native memory. In particular, never do
this with instances of java.security.Key. Instead, implement either
some kind of ageing strategy or a fixed-size cache.
...
This is a warning to anybody who might cache crypto keys.
A customer has been having problems with the exhaustion of native
memory before the Java heap is full. It was fun trying to track down
the cause, but it's now happened several times to several customers,
and it's a serious problem for real-world usage in app servers.
PKCS#11 is a standard way to communicate between applications and
crypto libraries. There is a Java crypto provider which supports
PKCS#11. Some of our customers must use this provider in order to get
FIPS certification.
The problem is this:
A crypto key is a buffer in memory, allocated by the PKCS#11 native
library. It's accessed via a handle which is stored as an integer
field in a Java object. This Java object is a PhantomReference, so
when the garbage collector detects that a crypto key is no longer
reachable it is closed and the associated native memory is freed.
Modern garbage collectors don't much bother to process objects in the
old generation because it's not usually worthwhile. Thus, crypto keys
don't get recycled very quickly. They can pile up in the old
generation. This isn't a problem for the Java heap because the
objects containing the references to crypto keys are very small.
Unfortunately, the native side of a crypto key is much bigger, maybe
up to a thousand times bigger. So if we have 4000 stale crypto keys
in the heap that's not a problem, a few kbytes. But the native memory
may be a megabyte.
This problem is made even worse by Tomcat because it uses
SoftReferences to cache crypto keys. SoftReferences are processed
lazily, and maybe not at all until the Java heap runs out of memory.
Unfortunately it doesn't, but the machine runs out of native memory
instead.
We could solve this simply by making instances of PKCS#11 keys really
big Java objects by padding with dummy fields. Then, the GC would
collect them quickly. This does work but it seriously impacts
performance. Also, we could tweak the garbage collectors to clear out
stale references more enthusiastically, but this impacts performance
even more. There are some controls with the G1 collector which
process SoftReferences more aggressively and these help, but again at
the cost of performance.
Finally: the Shenandoah collector we're working on handles this
problem much better than the older collectors, but it's some
way off.
{quote}
> SYM/ASYM_ENCRYPT: don't use WeakHashMap for old ciphers
> -------------------------------------------------------
>
> Key: JGRP-2075
> URL: https://issues.jboss.org/browse/JGRP-2075
> Project: JGroups
> Issue Type: Task
> Reporter: Bela Ban
> Assignee: Bela Ban
> Priority: Minor
> Fix For: 3.6.10, 4.0
>
>
[JBoss JIRA] (JGRP-2075) SYM/ASYM_ENCRYPT: don't use WeakHashMap for old ciphers
by Bela Ban (JIRA)
Bela Ban created JGRP-2075:
------------------------------
Summary: SYM/ASYM_ENCRYPT: don't use WeakHashMap for old ciphers
Key: JGRP-2075
URL: https://issues.jboss.org/browse/JGRP-2075
Project: JGroups
Issue Type: Task
Reporter: Bela Ban
Assignee: Bela Ban
Priority: Minor
Fix For: 3.6.10, 4.0
Currently we use WeakHashMap, but should not, reasons outlined below. We could replace it with a LazyRemovalCache.
Andrew Haley's email:
{quote}
TL/DR: Please don't use WeakReferences, SoftReferences, etc. to cache
any data which might point to native memory. In particular, never do
this with instances of java.security.Key. Instead, implement either
some kind of ageing strategy or a fixed-size cache.
{quote}
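The fixed-size cache that the email recommends could look roughly like the sketch below. It is an illustration built only on JDK classes. It is not the JGroups LazyRemovalCache mentioned above, and the class name BoundedCache and its methods are hypothetical. The idea is that the cache holds strong references and evicts by an explicit policy (least recently used here), so how many keys or ciphers it keeps reachable does not depend on when the garbage collector processes weak or soft references.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded LRU cache; an illustration, not the JGroups LazyRemovalCache. */
public class BoundedCache<K, V> {
    private final Map<K, V> entries;

    public BoundedCache(final int maxEntries) {
        // accessOrder=true: iteration order runs from least to most recently used
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Called after each put; evict the least recently used entry when over capacity
                return size() > maxEntries;
            }
        };
    }

    public synchronized V get(K key) {
        return entries.get(key);
    }

    public synchronized void put(K key, V value) {
        entries.put(key, value);
    }

    public synchronized int size() {
        return entries.size();
    }
}
```

An ageing strategy would follow the same pattern, with entries carrying a timestamp and being removed once they exceed a maximum age.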
[JBoss JIRA] (WFLY-6672) EJB's async methods' Future#get should treat zero timeout as "don't wait"
by Stuart Douglas (JIRA)
[ https://issues.jboss.org/browse/WFLY-6672?page=com.atlassian.jira.plugin.... ]
Stuart Douglas reassigned WFLY-6672:
------------------------------------
Assignee: Stuart Douglas
> EJB's async methods' Future#get should treat zero timeout as "don't wait"
> -------------------------------------------------------------------------
>
> Key: WFLY-6672
> URL: https://issues.jboss.org/browse/WFLY-6672
> Project: WildFly
> Issue Type: Bug
> Components: EJB
> Affects Versions: 8.2.0.Final
> Reporter: Vsevolod Golovanov
> Assignee: Stuart Douglas
>
> Say there is an asynchronous EJB method:
> {code}
> @Asynchronous
> public Future<Boolean> method() {
> ...
> }
> {code}
> Calling Future#get with a zero timeout:
> {code}
> if (bean.method().get(0, TimeUnit.MILLISECONDS))
> {code}
> results in a block until the task is finished.
> Instead I expected zero waiting, because {{java.util.concurrent.Future.get(long, TimeUnit)}} doesn't specify any special treatment for a zero timeout value.
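For comparison, the sketch below shows how a plain JDK future treats a zero timeout. It uses java.util.concurrent.CompletableFuture only as a stand-in; it is not the future returned by the EJB container, and the class name ZeroTimeoutDemo is hypothetical. With this JDK implementation, get(0, ...) on an unfinished future throws TimeoutException instead of blocking, and it still returns the value if the result is already available.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ZeroTimeoutDemo {

    /** Returns true if get(0, ...) on an unfinished future times out instead of blocking. */
    public static boolean timesOutOnPendingFuture() throws Exception {
        Future<Boolean> pending = new CompletableFuture<>(); // never completed
        try {
            pending.get(0, TimeUnit.MILLISECONDS);
            return false; // not reached: the future has no result
        } catch (TimeoutException expected) {
            return true;
        }
    }

    /** A zero timeout still yields the value when the result is already there. */
    public static boolean returnsCompletedValue() throws Exception {
        Future<Boolean> done = CompletableFuture.completedFuture(Boolean.TRUE);
        return done.get(0, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(timesOutOnPendingFuture());
        System.out.println(returnsCompletedValue());
    }
}
```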
[JBoss JIRA] (WFCORE-1418) Reloading host-controller via http-api puts the HC into unresponsive state
by ebuzer taha kanat (JIRA)
[ https://issues.jboss.org/browse/WFCORE-1418?page=com.atlassian.jira.plugi... ]
ebuzer taha kanat edited comment on WFCORE-1418 at 6/5/16 2:36 PM:
-------------------------------------------------------------------
The same thing happens intermittently after multiple deploy-reload cycles; the cause is unknown and the timing is unpredictable. The only temporary workaround is restarting the server.
The main problem is that Undertow sometimes does not stop when reload is called.
Environment: wildfly-10.0.0.Final on Ubuntu 14.04.4 LTS, standalone-full mode, started with the Debian init script.
was (Author: urbandroid):
The same thing happens intermittently after multiple deploy-reload cycles; the cause is unknown and the timing is unpredictable. The only temporary workaround is restarting the server.
> Reloading host-controller via http-api puts the HC into unresponsive state
> --------------------------------------------------------------------------
>
> Key: WFCORE-1418
> URL: https://issues.jboss.org/browse/WFCORE-1418
> Project: WildFly Core
> Issue Type: Bug
> Components: Domain Management
> Affects Versions: 2.0.10.Final
> Reporter: Tomaz Cerar
> Assignee: Tomaz Cerar
> Priority: Blocker
> Fix For: 2.1.0.Final
>
>
> Reloading host-controller via http-api puts the HC into unresponsive state.
> *reproduce*
> \- create an administrative user admin:asdasd@2
> \- start a domain
> \- reload a server via http api
> {noformat}
> curl --digest -L -D - http://localhost:9990/management --header "Content-Type: application/json" -d '{"operation":"reload","name":"", "address":{"host" : "master"},"json.pretty":1}' -u admin:asdasd@2
> {noformat}
> *actual*
> Default server instances are stopped, HC is left in unresponsive state.
> If the domain is kept alive, the following message appears after 5 minutes, and the domain becomes responsive again after that.
> {noformat}
> [Host Controller] 04:47:23,966 ERROR [org.jboss.as.controller.management-operation] (management task-7) WFLYCTL0349: Timeout after [300] seconds waiting for service container stability while finalizing an operation. Process must be restarted. Step that first updated the service container was 'reload' at address '[("host" => "master")]'
> {noformat}
> *expected*
> Domain is reloaded
> *additional info*
> The issue was introduced by fix for JBEAP-2751 - https://github.com/jbossas/wildfly-core-eap/commit/4986773a51fbf43ad911ae...
> thread dump of unresponsive HC
> http://pastebin.test.redhat.com/348732
> I am unable to reproduce locally, but the issue can be easily reproduced on slower servers in the MWQE lab. SSLMasterSlave*WayTestCase using reload via http-api is causing failures in domain modules of the wf-core testsuite (e.g. [eap-7x-as-testsuite-test-core-rhel|https://url.corp.redhat.com/9f1f544] )
> Regression against 7.0.0.ER4. I was able to reproduce with the latest wildfly-core bits as well (1be598e)
[JBoss JIRA] (WFCORE-1418) Reloading host-controller via http-api puts the HC into unresponsive state
by ebuzer taha kanat (JIRA)
[ https://issues.jboss.org/browse/WFCORE-1418?page=com.atlassian.jira.plugi... ]
ebuzer taha kanat commented on WFCORE-1418:
-------------------------------------------
The same thing happens intermittently after multiple deploy-reload cycles; the cause is unknown and the timing is unpredictable. The only temporary workaround is restarting the server.
> Reloading host-controller via http-api puts the HC into unresponsive state
> --------------------------------------------------------------------------
>
> Key: WFCORE-1418
> URL: https://issues.jboss.org/browse/WFCORE-1418
> Project: WildFly Core
> Issue Type: Bug
> Components: Domain Management
> Affects Versions: 2.0.10.Final
> Reporter: Tomaz Cerar
> Assignee: Tomaz Cerar
> Priority: Blocker
> Fix For: 2.1.0.Final
>
>
> Reloading host-controller via http-api puts the HC into unresponsive state.
> *reproduce*
> \- create an administrative user admin:asdasd@2
> \- start a domain
> \- reload a server via http api
> {noformat}
> curl --digest -L -D - http://localhost:9990/management --header "Content-Type: application/json" -d '{"operation":"reload","name":"", "address":{"host" : "master"},"json.pretty":1}' -u admin:asdasd@2
> {noformat}
> *actual*
> Default server instances are stopped, HC is left in unresponsive state.
> If the domain is kept alive, the following message appears after 5 minutes, and the domain becomes responsive again after that.
> {noformat}
> [Host Controller] 04:47:23,966 ERROR [org.jboss.as.controller.management-operation] (management task-7) WFLYCTL0349: Timeout after [300] seconds waiting for service container stability while finalizing an operation. Process must be restarted. Step that first updated the service container was 'reload' at address '[("host" => "master")]'
> {noformat}
> *expected*
> Domain is reloaded
> *additional info*
> The issue was introduced by fix for JBEAP-2751 - https://github.com/jbossas/wildfly-core-eap/commit/4986773a51fbf43ad911ae...
> thread dump of unresponsive HC
> http://pastebin.test.redhat.com/348732
> I am unable to reproduce locally, but the issue can be easily reproduced on slower servers in the MWQE lab. SSLMasterSlave*WayTestCase using reload via http-api is causing failures in domain modules of the wf-core testsuite (e.g. [eap-7x-as-testsuite-test-core-rhel|https://url.corp.redhat.com/9f1f544] )
> Regression against 7.0.0.ER4. I was able to reproduce with the latest wildfly-core bits as well (1be598e)
[JBoss JIRA] (WFCORE-1427) Add a timeout param to reload op and use it for "graceful reload"
by Yeray Santana Borges (JIRA)
[ https://issues.jboss.org/browse/WFCORE-1427?page=com.atlassian.jira.plugi... ]
Yeray Santana Borges commented on WFCORE-1427:
----------------------------------------------
[~brian.stansberry] Just a note: when the reload command is used via the CLI, it applies a default connection timeout while waiting for servers in "starting" status (org.jboss.as.cli.handlers.ReloadHandler L273). Using the CLI reload together with --timeout would require a different connection timeout than the default configuration.
I haven't changed it yet. Maybe take the default configured connection timeout plus the --timeout passed via the CLI and use that at L273 instead? Or maybe check whether the servers are suspended and simply wait before applying the default connection timeout?
> Add a timeout param to reload op and use it for "graceful reload"
> -----------------------------------------------------------------
>
> Key: WFCORE-1427
> URL: https://issues.jboss.org/browse/WFCORE-1427
> Project: WildFly Core
> Issue Type: Enhancement
> Components: CLI, Domain Management
> Reporter: Brian Stansberry
> Assignee: Yeray Santana Borges
>
> So instead of
> {code}
> :suspend(20)
> :reload
> {code}
> It's just
> {code}
> :reload(20)
> {code}
> The high level 'reload' command in the CLI should take a --timeout param as well.
> If doing the graceful suspend as part of server side ":reload" handling proves problematic (I haven't looked into it at all before filing this) then a simpler alternative is to only go with the --timeout param on the CLI reload command, and have the CLI implement the graceful behavior internally by first calling :suspend and then :reload. Web console could do the same thing.
[JBoss JIRA] (DROOLS-1198) NoClassDefFoundError when using str[endsWith] on a field that matches an imported class name
by Chris Austin (JIRA)
[ https://issues.jboss.org/browse/DROOLS-1198?page=com.atlassian.jira.plugi... ]
Chris Austin updated DROOLS-1198:
---------------------------------
Issue Type: Bug (was: Feature Request)
> NoClassDefFoundError when using str[endsWith] on a field that matches an imported class name
> --------------------------------------------------------------------------------------------
>
> Key: DROOLS-1198
> URL: https://issues.jboss.org/browse/DROOLS-1198
> Project: Drools
> Issue Type: Bug
> Components: core engine
> Affects Versions: 6.4.0.Final
> Reporter: Chris Austin
> Assignee: Mario Fusco
> Priority: Minor
>
> This error was triggered when accessing the user field with str[endsWith] when user is null:
> java.lang.NoClassDefFoundError: mssp/io/model/user (wrong name: mssp/io/model/User)
> Condition that triggers the error:
> $a : Alert(user!=null, user str[endsWith] "$")
> This does not trigger the error:
> $a : Alert(user!=null, user matches ".+\\$")
> If I remove the import for the User class the error will not be triggered. As a workaround I've switched to using matches instead of str[endsWith], but I'd prefer to switch back.
> I did not experience this issue in 6.3.0-Final, so it appears to be a regression in 6.4.0-Final
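As a side note on the workaround, the sketch below compares the two conditions using plain Java String methods. It assumes the Drools operators map onto String.endsWith and String.matches and that the DRL string literal is unescaped the way a Java literal is; neither is confirmed by the report. The class name EndsWithVsMatches is hypothetical. Under those assumptions the two conditions agree except for a value that is exactly "$", which the regex rejects because .+ requires at least one character before the dollar sign.

```java
public class EndsWithVsMatches {

    /** Mirrors: user != null, user str[endsWith] "$" (assumed mapping). */
    public static boolean viaEndsWith(String user) {
        return user != null && user.endsWith("$");
    }

    /** Mirrors: user != null, user matches ".+\\$" (assumed mapping). */
    public static boolean viaMatches(String user) {
        // The Java literal ".+\\$" is the regex .+\$ : one or more characters, then a literal '$'
        return user != null && user.matches(".+\\$");
    }

    public static void main(String[] args) {
        for (String user : new String[] {"host$", "host", "$"}) {
            System.out.println(user + " endsWith=" + viaEndsWith(user)
                    + " matches=" + viaMatches(user));
        }
    }
}
```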