[
http://opensource.atlassian.com/projects/hibernate/browse/HV-377?page=com...
]
Hardy Ferentschik updated HV-377:
---------------------------------
Description:
Hi,
I'm using Hibernate Validator on my current project. In some cases I need to save very
complex entities (trees of nodes associated with other sets of entities, etc.). Saving an
entity like this takes a considerable amount of time when validation is turned on, so
I've been doing some profiling to find out why.
The bottleneck is in ClassValidator.getClassValidator(). This method gets executed around
20000 times. With validation on, it takes about 10 seconds; without validation it takes
about 1.5 seconds. So what's going on?
getClassValidator() looks in a cache for a ClassValidator that was created in the
constructor. The constructor examines all of the class's relationships with related
classes and saves them in a cache. The point is that I'm getting a lot of misses for
entities of type Collection (PersistentCollection, Collection.$Unmodifiable, Set, etc.).
Since getClassValidator() doesn't find these entities in the cache (which makes
sense), a new ClassValidator is created each time, which is not cheap in terms of
execution time. Given the number of calls, that explains the bottleneck.
I noticed there's a comment in the code suggesting a second cache for storing the new
ClassValidators created when a miss happens. My first approach was to code this extra
cache, and things improved enormously (no noticeable difference between saving with and
without validation).
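To make the idea concrete, here is a minimal sketch of such a second cache. It is not the actual patch and not Hibernate Validator code: SketchValidator, SecondLevelCacheSketch, childValidators, missCache and createdCount() are hypothetical names, and the sketch assumes Java 8 or later for computeIfAbsent.

```java
import java.util.ArrayList;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for ClassValidator; NOT the Hibernate Validator class.
class SketchValidator {
    final Class<?> type;

    SketchValidator(Class<?> type) {
        this.type = type; // the real constructor is the expensive part
    }
}

class SecondLevelCacheSketch {
    // Assumed shape of the existing cache: validators built up front for known related classes.
    private final Map<Class<?>, SketchValidator> childValidators = new ConcurrentHashMap<>();
    // Proposed extra cache: validators created lazily after a miss in the first cache.
    private final Map<Class<?>, SketchValidator> missCache = new ConcurrentHashMap<>();
    // Counts how many validators were actually constructed after a miss.
    private final AtomicInteger created = new AtomicInteger();

    SketchValidator getClassValidator(Object value) {
        Class<?> type = value.getClass();
        SketchValidator known = childValidators.get(type);
        if (known != null) {
            return known;
        }
        // On a miss, build the validator once per runtime class and reuse it afterwards.
        return missCache.computeIfAbsent(type, t -> {
            created.incrementAndGet();
            return new SketchValidator(t);
        });
    }

    int createdCount() {
        return created.get();
    }

    public static void main(String[] args) {
        SecondLevelCacheSketch cache = new SecondLevelCacheSketch();
        for (int i = 0; i < 20000; i++) {
            cache.getClassValidator(new ArrayList<String>());
        }
        System.out.println("validators created: " + cache.createdCount()); // prints 1
    }
}
```

With this shape, 20000 lookups for the same collection class construct a single validator instead of 20000.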
But there's still something I don't fully understand. In the method protected
InvalidValue[] getInvalidValues(T bean, Set<Object> circularityState), there's a
loop that examines the class of an entity and does the actual validation. The body of this
loop is coded as:
{code}
if ( getter.isCollection() ) {
    // Validate collections
}
if ( getter.isArray() ) {
    // Validate arrays
} else {
    // Validate anything that is not an array
}
{code}
The point is that for entities of type Collection the validation is done twice: once
in the first branch (validate collections) and again in the "else" branch (validate
anything that is not an array), because a collection is not an array.
Imagine an entity of type PersistentCollection<Person>. The first branch validates all the
people in the collection, and the else branch creates a ClassValidator for
PersistentCollection itself and executes its validators (that doesn't make much sense to me).
Most of the misses I got in the "else" branch are for entities of type
Collection. I got some others for entities of type ValueObject, I think; those are OK.
So why aren't these checks coded in exclusive form, something like:
{code}
if ( getter.isCollection() ) {
    // Validate collections
} else if ( getter.isArray() ) {
    // Validate arrays
} else {
    // Validate anything that is neither an array nor a collection
}
{code}
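The difference between the two shapes can be checked in isolation. The sketch below is only an illustration, not the ClassValidator code: BranchSketch, nonExclusive() and exclusive() are hypothetical names, and instanceof Collection and Class.isArray() stand in for getter.isCollection() and getter.isArray().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class BranchSketch {
    // Current shape: an independent `if`, followed by a separate `if/else`.
    static List<String> nonExclusive(Object value) {
        List<String> visited = new ArrayList<>();
        if (value instanceof Collection) {
            visited.add("collection");
        }
        if (value.getClass().isArray()) {
            visited.add("array");
        } else {
            visited.add("other"); // also reached for collections
        }
        return visited;
    }

    // Proposed shape: a single if / else if / else chain.
    static List<String> exclusive(Object value) {
        List<String> visited = new ArrayList<>();
        if (value instanceof Collection) {
            visited.add("collection");
        } else if (value.getClass().isArray()) {
            visited.add("array");
        } else {
            visited.add("other");
        }
        return visited;
    }

    public static void main(String[] args) {
        Object people = new ArrayList<>(Arrays.asList("Ann", "Bob"));
        System.out.println(nonExclusive(people)); // prints [collection, other]
        System.out.println(exclusive(people));    // prints [collection]
    }
}
```

Arrays and plain objects take the same path in both versions; only collections behave differently.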
Before sending a patch that adds a second cache to getClassValidator(), I'd
like to know whether most of this could be fixed by making the checks exclusive in
getInvalidValues(). In any case, the second-cache patch would also be nice, but I guess
that could be the subject of another thread.
Regards,
Diego
Improve execution times on calling getClassValidator()
------------------------------------------------------
Key: HV-377
URL:
http://opensource.atlassian.com/projects/hibernate/browse/HV-377
Project: Hibernate Validator
Issue Type: Improvement
Components: engine
Affects Versions: 3.1.0.GA
Environment: Hibernate 3.0, PostgreSQL
Reporter: Diego Pino
Assignee: Hardy Ferentschik
Fix For: 3.2.0