On 03/29/2018 02:11 PM, Steve Ebersole wrote:
Thanks for the thoughts Radim.
But I think that there is a misunderstanding.
Today (pre-5.3) Hibernate has a hibernate-infinispan module that
integrates Infinispan into Hibernate ORM as a second-level cache. In
fact it provides 2 integrations: `InfinispanRegionFactory` and
`JndiInfinispanRegionFactory`. And then inside WildFly there is (at
least) a third one specific to running inside WildFly. Inside WildFly
is irrelevant... I say that because inside WildFly we are going to use
whatever the Infinispan team develops as infinispan-hibernate-cache.
Which leaves the other 2. The only thing (I believe) that
`InfinispanRegionFactory` and `JndiInfinispanRegionFactory` provide
beyond hibernate-jcache (and indeed the only way they differ from it)
is transactional access support, which you even say is going away.
I understand about not being able to fully leverage
`CacheTransactionSynchronization` if hibernate-jcache is used. IMO it
is the single drawback to using hibernate-jcache for Infinispan. All
of the other things you mentioned are IMO implementation details.
Well, that and missing out on the new `DomainDataRegionConfig` stuff.
And keep in mind that even hibernate-jcache offers some integration
points (thanks to Alex)... you can supply your own
`javax.cache.spi.CachingProvider` or `javax.cache.CacheManager`
directly to the RegionFactory for JCache. So you have total control
still over how caches are built (sans access to `DomainDataRegionConfig`).
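To make that integration point concrete, here is a minimal sketch of the settings one might hand to Hibernate's bootstrap to run hibernate-jcache on top of Infinispan. The class `JCacheSettingsSketch` is purely illustrative. The property names and the short name `jcache` are written as I understand the 5.3 hibernate-jcache module, and the provider class name is an assumption about Infinispan's embedded JCache artifact; verify all of them against the versions you actually use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: collects the settings into a plain map.
class JCacheSettingsSketch {

    static Map<String, Object> settings() {
        Map<String, Object> s = new LinkedHashMap<>();
        // Enable the second-level cache.
        s.put("hibernate.cache.use_second_level_cache", "true");
        // Select the JCache RegionFactory (short name assumed to be registered in 5.3).
        s.put("hibernate.cache.region.factory_class", "jcache");
        // Assumption: Infinispan's embedded JCache CachingProvider implementation.
        s.put("hibernate.javax.cache.provider",
              "org.infinispan.jcache.embedded.JCachingProvider");
        return s;
    }
}
```

Such a map could be passed to the usual bootstrap entry points (for example as the integration settings of a persistence unit). Supplying a pre-built `javax.cache.CacheManager` instance instead of a provider class name would follow the same pattern with a different setting.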
So all-in-all I am still not understanding why `hibernate-jcache` +
`infinispan` is not enough for most cases. Yes you miss out on some
possible performance improvements (leveraging
`CacheTransactionSynchronization` and `DomainDataRegionConfig` being
the 2 I can think of).
Maybe I misunderstand what hibernate-jcache is. Infinispan
exposes JCache API, and JCache dictates some behaviour. The behaviour
with transactions is not specified and not tested by the JCache TCK.
Hibernate-jcache consumes JCache API. I don't see any space for
'integration' besides dropping some configuration files (and that's the
point, no code to maintain).
My post was about the behaviour that can be optimized once we know the
intent. JCache API is implemented according to the spec and does not
express Hibernate's intent well enough.
However, that's why there is the possibility of a custom cache
provider - to leverage those things. But again, taking a step back
and looking at the big picture...
1) We need to be able to run outside of WildFly. The 2 legacy
`hibernate-infinispan` providers were intended for that.
2) We need to be able to run inside of WildFly. The provider supplied
with WildFly (Jipijapa) itself was intended for that.
So how do we do that moving to 5.3? As I mentioned above, I think we
can ignore "run inside of WildFly" for this discussion - that will
happen however Jipijapa says it will happen; if/when there is a
specific Infinispan-based 5.3 cache provider, Jipijapa will use that.
It's more the "run outside of WildFly" case that I am talking about.
So what are the things that we want to cover in Hibernate/Infinispan
caching combo? AFAIK:
1. Transactional access support
2. Cluster support (distributed versus replicated versus invalidated)
3. Anything I am missing?
So let's look at those one by one:
1. You just said that you plan to drop transactional access support
anyway, so nothing to see here...
I've never said that I am dropping support for transactional *access*!
`org.hibernate.cache.spi.access.AccessType.TRANSACTIONAL` will always be
supported. We're dropping support for transactional *caches*.
2. I may be wrong here, but I do not think that hibernate-infinispan
supported clustering (or at least not very well) out-of-the-box. At
least not based on the top Google hits I saw for "hibernate
infinispan cluster".
Clustered support is the default; you need to specify a non-clustered
configuration file via a property if you don't want it.
Given these points and the fact that Infinispan can (and would already
have to) implement JCache's `CachingProvider` / `CacheManager`, I
simply do not buy that `hibernate-jcache` + `infinispan` is any less
capable than the support we have today for Infinispan as a second
level cache. Now, if you say that you want to develop a 5.3 provider
to actually make cluster setup easier, leverage
`CacheTransactionSynchronization` and/or other specific reasons...
great, that is completely your prerogative and the reason this is all
still pluggable.
P.S. JTA is a spec. JTA components are accessible from a "well known
location" (aka, easily accessible by anyone). So I really just do not
get your argument that `hibernate-jcache` + `infinispan` somehow loses
the ability to leverage JTA. And btw this was absolutely not the
intent with `CacheTransactionSynchronization`, which was instead
intended to allow unified processing of "cache events" at the "end of
a transaction" (as known to Hibernate) regardless of whether JTA or
straight JDBC transactions are used.
I complained about JCache + JTA being undefined; I said nothing about
leveraging JTA in our impl.
P.P.S. I totally agree with Sanne. The cache should be as correct as
possible, however it is *always* possible to simply evict a piece of
data from the cache to avoid conflicts. The database is *always* the
"truth of the system". This is in fact exactly the principle on which
the collection cache works - any changes to that collection simply
invalidate (evict) the data from the cache.
When you simply evict a piece of data from the cache you can't be sure
that it won't end up back in there right away, because another
concurrent request holding stale data may put it into the cache again.
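The race described above can be sketched deterministically. This is only an illustration under stated assumptions, not the actual hibernate-infinispan code: the class `StalePutSketch`, its logical clock and all method names are hypothetical. A reader records a timestamp before its DB read; a put is rejected if the key was invalidated after that timestamp.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a cache that can reject puts made with stale data.
class StalePutSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, Long> invalidatedAt = new HashMap<>();
    private long clock = 0; // logical clock, stands in for real timestamps

    // A reader calls this BEFORE reading from the database.
    synchronized long beginLoad() {
        return ++clock;
    }

    // A writer evicts the entry and records when that happened.
    synchronized void evict(String key) {
        cache.remove(key);
        invalidatedAt.put(key, ++clock);
    }

    // Naive variant: plain putIfAbsent, happily stores stale data.
    synchronized void naivePutFromLoad(String key, String value) {
        cache.putIfAbsent(key, value);
    }

    // Guarded variant: refuses values loaded before the last invalidation.
    synchronized boolean guardedPutFromLoad(String key, String value, long loadStarted) {
        Long inv = invalidatedAt.get(key);
        if (inv != null && inv > loadStarted) {
            return false; // the value may predate the writer's change
        }
        return cache.putIfAbsent(key, value) == null;
    }

    synchronized String get(String key) {
        return cache.get(key);
    }
}
```

With the naive variant, the sequence "reader begins load, writer evicts, reader puts" leaves the old value in the cache; with the guarded variant the same sequence leaves the cache empty, and a later load that started after the eviction is accepted.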
R.
On Thu, Mar 29, 2018 at 3:03 AM Radim Vansa <rvansa@redhat.com> wrote:
Hi Steve,
on HipChat you've asked why hibernate-jcache + Infinispan's
implementation of JCache is not enough. It ultimately boils down to
1. performance
2. correctness
where (2) can be fine with some providers but then (1) suffers.
Infinispan offers transactional mode, but it's not serializable (gosh,
sometimes it's even read_uncommitted) and has its quirks. The
performance can't be as good as with non-tx mode, too. That's why the
native transactional caches support will be dropped in 5.3 and we'll
just emit a warning asking users to update their configuration (and
continue with non-tx config).
As a demonstration of this we can use putFromLoad. If you implement
this as an (ideally serializable) transactional cache putIfAbsent, the
provider must either
a) lock the key (pessimistic approach) - but we don't want to block
other nodes removing data from the cache (on write) or putFromLoading
(on read!)
b) resolve conflicts when the transaction is committing: you figure out
that there are two concurrent updates and roll back one of the
transactions - that's not acceptable to us either, as any failure in
the cache should not affect the DB transaction. And there's a risk of
blocking between the 2 phases of commit, too.
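For contrast with (a) and (b), a non-transactional putFromLoad can be sketched as a best-effort putIfAbsent that neither holds a lock across the transaction nor lets a cache failure reach the caller. This is a minimal sketch, assuming a local `ConcurrentHashMap` as the store; the class `PutFromLoadSketch` and its methods are hypothetical and say nothing about how a real provider replicates or invalidates across nodes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: best-effort, non-transactional putFromLoad.
class PutFromLoadSketch {
    private final ConcurrentMap<Object, Object> store = new ConcurrentHashMap<>();

    // Returns true only if the value was actually cached.
    boolean putFromLoad(Object key, Object value) {
        try {
            // Atomic per key; no lock is held beyond this single call.
            return store.putIfAbsent(key, value) == null;
        } catch (RuntimeException e) {
            // A cache failure must never fail the DB transaction.
            return false;
        }
    }

    // Writers remove the entry without waiting for readers.
    void invalidate(Object key) {
        store.remove(key);
    }

    Object get(Object key) {
        return store.get(key);
    }
}
```

The price of this simplicity is that correctness (not caching stale data) has to be handled separately, which is exactly where knowing Hibernate's intent helps.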
Theoretically you could just wipe any modified data on conflict - I
don't know if anyone does that, 'drop everything and proceed with
commit' is not something I'd expect from a general-purpose (NoSQL) DB.
I recall Alex's JCache implementation (for 5.2) storing some 'lock'
objects in the cache, and you probably don't want to wipe those.
Interaction with evictAll/removeAll could also be problematic: not sure
about the other providers but Infinispan's clear() operation is
non-transactional even on a tx cache (since Infinispan 7 or so) because
it's impractical to resolve all conflicts. I don't know the details of
how others provide that operation but there may be a hidden problem.
Last but not least, you assume that the provider is transactional and
it provides the JCache interface. JCache does not define interaction
with JTA, because it was hard to get agreement on non-tx behaviour (why
did it take 13 years to complete the JSR?) and it would be even harder
for JTA. So what you expect is just your extrapolation or wishful
thinking, and it's up to integrators to verify that the unwritten
contract is fulfilled within the scope of hibernate-jcache module use.
Not that SPI implementors would be in a better position, but at least
we are aware that (for us) it's not enough to implement those 3 classes
and job's done.
Of course the correctness aspect may be ignored with 'it's just a
cache', implying 'users expect stale/uncommitted data', as Sanne (who is
much closer to the customers than me) keeps repeating. However this is
not what 2LC promises as I understand it: the same results as the DB
would give.
I am really grateful that in 5.3 you've provided the
CacheTransactionSynchronization that will help us boost (1) even
further by allowing us to execute all operations in parallel. And it's
good that you've made the SPI more expressive with the intent; there'll
be a bunch of TODOs in the 5.3 implementation to cover use cases that
were not handled in previous versions but now are obvious.
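The "execute all operations in parallel" idea can be sketched as follows: cache operations are started asynchronously during the transaction and only awaited once, when the transaction is about to complete. This is a rough illustration, not the real SPI; the class `TxSyncSketch` and its method names are hypothetical and only loosely echo the end-of-transaction callback idea behind `CacheTransactionSynchronization`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: fire cache operations asynchronously, wait once at the end.
class TxSyncSketch {
    private final ConcurrentMap<String, String> cache;
    private final List<CompletableFuture<Void>> pending = new ArrayList<>();

    TxSyncSketch(ConcurrentMap<String, String> cache) {
        this.cache = cache;
    }

    // Called during the transaction; does not block the caller.
    void invalidateAsync(String key) {
        pending.add(CompletableFuture.runAsync(() -> cache.remove(key)));
    }

    // Called once when the transaction is completing: wait for everything.
    void transactionCompleting() {
        CompletableFuture<?>[] all = pending.toArray(new CompletableFuture<?>[0]);
        CompletableFuture.allOf(all).join();
        pending.clear();
    }

    int pendingCount() {
        return pending.size();
    }
}
```

With N remote invalidations the transaction then pays roughly one round-trip of latency at completion instead of N sequential ones, assuming the operations are independent of each other.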
Cheers
Radim
--
Radim Vansa <rvansa@redhat.com>
JBoss Performance Team
_______________________________________________
hibernate-dev mailing list
hibernate-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/hibernate-dev