JBoss development,
A new message was posted in the thread "Problems with proxy classes and a proposed
resolution":
http://community.jboss.org/message/532305#532305
Author : Andrew Dinn
Profile :
http://community.jboss.org/people/adinn
Message:
--------------------------------------------------------------
This discussion is prompted in part by
http://community.jboss.org/thread/148815?tstart=0
and in part by looking at
https://jira.jboss.org/jira/browse/JASSIST-42 which identifies
various problems with the current use of writeRepace to support proxy serialization. The
comments on JASSIST-42 contain a prescription for how to solve the serialization problems
provided by David Lloyd. I impemented tis solution and, unfortunately, although it almost
works is still suffers from a couple of serious problems. Although it might be possible to
get round these the solution requires javassist clients to do a lot of extra work and is
likely to be error-prone.
I have identifed an alternative solution which should also ensure that cached proxy
classes can be reused much more often +both+ when performing serialization +and+ in normal
use of proxies. The only drawback is a very small change to the proxy factory/class API
which will require existinng client code to be modified very slightly. The proposed fix
can be used with the writeReplace style serialization solution to ensure that deserialized
proxies can share the same cached class and do not fail to find the correct class. It can
also be used to support serialization/eserialization using David Lloyd's proposal
which provides much better quality serialiation (e.g. superclass fields are not lost) but
is a tiny bit harder for clients to use. I have nearly finished a prototype which enables
both alternatives to be made available so, modulo the small factory API change, we can
retain compatibility for code which currently uses writeReplace based serialization.
There are two related problems with the current proxy cache. The first is to do with
identity, the second with naming.. The identity of a cache entry (a proxy class)
associated wiht some given classloader is determined from four sets of data:
* the super class s
* the interface set i1,...,ik
* the proxy filter f
* the proxy handler h
Following the fix for
https://jira.jboss.org/jira/browse/JASSIST-104 the per-classloader
cache is a hashmap keyed by a string derived from the names of the classes in the first
two data sets: i.e. of the form sn$i1n$...ikn$. where sn is the (String) name of s, i1n is
the name of i1 etc All proxy classes with a given super and interface set are stored in a
list under this key. The list contains entries each of which stores a filter f1, handler
h1 and weak reference to the associated class pc1. So, when creating a class with the
specified data set s, i1...,ik, f, h the key is constructed and the relevant list located
and traversed looking for an entry where f1 == f and h1 == h.
This already wastes an opportunity for reuse of proxy classes. If f1 and f2 are two
filters which behave exactly the same then a class created with filter f1 will not be
reused when creating a class withh filter f2 even if all the other elements of the
iidentifying set are the same.
The problem with serialization is more severe. The current scheme based on use of method
writeReplace serializes a proxy object by substituting an instance of SerializedProxy
containing the identifying details sn, i1n..,ikn, f and h. At the deserializing end this
results in objects sn', i1n'...,ikn', f' anf h'. The deserialized
instance of SerializedProxy uses these details to lookup or craeate a proxy class pc'
and create an instance p' which it then initialises with the correct handler.
Unfortunately, this also loses all data locate in instance fields iniherited from
superclass s.
Even if the deserialize is done in the same JVM as the serialize operation f' and
h' will not equal f and h so a cache lookup will fail to find the originating class pc
from which the proxy was serialized. If it is done in a different JVM the same problem
arises. The instances will be new every time they are read from the stream so they can
never match cache entries. So, each time that, say, a remoting call is made serializing
an instance of proxy class pc down the wire will cause a new proxy class pc' to be
created and the cache soon blows up.
The same problem affects the scheme suggested by David Lloyd which avoids using
writeReplace. Instead the serialize and deserialize operations must be done using a
subclass of ByteOutputStream and ByteInputStream defined by the javassist package. When a
ProxyObjectOutputStream detects that it is serializing a proxy instance p the stream
identifies the proxy's class pc, and writes sn, i1,... etc into the stream before
using the normal serialization code to serializethe field values stored in p. At the
other end the ProxyObjectOutputStream recognises that details of a proxy class have been
included. It reads sn', i1n' etc and tries to lookup and retturn a cached proxy
class pc' identified by these values.. The parent code in ObjectInpuStream creates an
instance p' from this class and initialises its fields using the data serialized from
p. Once again this can never reuse a cached class and will always end up creating a new
class pc' since f' and h' are new objects so will never identify a cache
entry
The second problem is to do with how proxy classes are named. Clearly, there can be more
than one class with super s and interface list i1,..,ik. So, the current code creates a
name of the form javassist.proxy$sn$i1n$...$ikn_37 whenever it creates a proxy class. The
number at the end is derived by incrementing a static counter in class ProxyFactory. So,
when an instance p of class pc is serialized and then deserialized the new class created
at the other end may have a name like javassist.proxy$sn$i1n$...$ikn_26 i.e. with a
different numeric tag at the end. It is very unlikely that class names will actually match
up. This change of classname is not fatal for the writeReplace based serialization scheme.
However, if kills david Lloyd's alternative. The class pc' returned by the
ProxyObjectOutputStream must have the same name as the original class pc. If the names
don't match the deserialize throws an exception.
One way to fix this would be to require filter and handler classes to +implement+ equality
and naming. If the classes fc and hc of filters f and h implemented method equals() then
the cache lookup could succeed if f'.equals(f) and h'.equals(h). The client would
have to ensure that for any two filters f and f' f.equals(f') if and only if any
method selected by f would be selected by f' and any method rejected by f would also
be rejected by f'. This is feasible but probably somewhat error prone. Handler classes
would also need to implement a coherent equals method.
It would also be necessary to implement a matching getName() method for each filter or
handler class which returned a String to be included in the proxy class name. So, if
f.getName() ==> fn, h.getName() ==> hn then the proxy class would have name
javassist.proxy$sn$i1n$...$ikn$fn$hn. These methods woudl need to esnure that where
f1.equals(f2) then f1.getName.equals(f2.getName()) and similarly for h1, h2. This would
ensure that both serialization schemes worked and were able to reuse cache entries albeit
at some cost in code and brainpower to clients.
Rather than impose this burden on the programmer I suggest a simpler solution. First,
I'll make a couple of important observations.
For a given pair of proxy classes p1 and p2 derived form clases pc1 and pc2 which differ
in their handlers h1, h2 but share the same super s, interfaces i1,...ik and filter f the
classes pc1 and pc2 are almost identical in behaviour. Since they share the same filter
pc1 and pc2 will call the instance's handler in exactly the same cases and will both
rely on the methods inherited from s in all other cases.The only point where the classes
idffer is that the static handler field in pc1 and pc2 may differ and this affects the
initial value of the handler assigned to their innstances So, if p1.handler and p2.handler
ar expiicitly set by the client p2 could be derived from pc1 and we could dispense with
pc2. If we remove this static default handler field and require the cleint to always call
setHandler on a new instance then we can remove any dependence of the proxy class on the
handler value. This means we can remove h from the set of identifying data used to ientify
a proxy class..
So, to achieve this we need to decommission Proxyfactory.setHandler(), remove the static
fields in the handler class and change the generator for the proxy lass constructors so
that the instance's handler is initialised using the global defaul in ProxyFactory
rather than looking for the proxy class default.
The second observation is that the identity of two filter instances f1 and f2 can be
ascertained automatically and can also be encoded automatically into a unique name. The
list of methods m1,...,mn impllemented by s, i1,...ik is determinate and of fixed length.
So, we can encode a bit vector of length n using a 1 at position i if f1 redirects calls
to mi and a 0 if f1 relies on the impllementation inherited from s. If f2 encodes the same
bit vector it will generate essentially the same class. We can also translate this bit
vector to a filter signaure key f1n, a hex String which will looksomething lik, say, fa08.
Any equivalent filter method f2 will also have the same signature key.
This solves our identity and naming problem. If we store the bit vector in the class as an
int[] and construct a name for the proxy class wich includes this filter name -- something
of the form javassist.proxy.$sn$i1n$...$ikn$fn -- then we can guarantee to be able to
identify and reuse equivalent proxy classes both under a call to createClass and also
during deserialization.
I would prefer if we were to drop method
ProxyFactory.setFilter()
and instead require clients to call
ProxyFactory.createClass(MethodFilter).
If instead the client calls
ProxyFactory.createClass().
then we generate a class which redirects all methods via the handler. Alternatively, we
can retain teh setFilter method on the factory and use it to filter when createClass is
called.
So, if we swallow these small API changes then we can fix the current proxy model and we
can also implement David LLoyd's alternative serialization model. Note that we
don't want to store the filter instance in the class as this may cause classloader
issues when we reuse a class whose filter belongs to a classloader which is not known at
the deserializing end . However, storing an int[] or the String filter signature will
cuase no such problems.
For the serialization fix I believe we can retain compatibility by providing a proxy
factpry settting. By default a factory will create classes which implement writeReplace
and can be serialized via normal ObjectOutput/InputStreams. We can provide a method
ProxyFactory.setUseWriteReplace(boolean) which can switch off generation of this method
when rerating new proxy classes. Instances of these class must be serialized using a
ProxyObjectOutputStream and deserialized using a ProxyObjectInputStream. However, this
will enable them to be fully serialized and reconstituted, avoidning the problems David
mentions in JASSIST-42.
I am very close to having a prototype implementation of tis behaviour (I got most of the
way there identifying the two critical failings). I'll complete this and post a patch.
However, I;'m interested to hear feedback form anyone who may be affected by the API
changes here.
regards,
Andrew Dinn
--------------------------------------------------------------
To reply to this message visit the message page:
http://community.jboss.org/message/532305#532305