[infinispan-dev] ISPN-3557: interactions between a clear() operation and a Transaction

Wed Oct 16 09:11:59 EDT 2013

On 16 October 2013 12:21, Galder Zamarreño <galder at redhat.com> wrote:
>
> Guys, there's some context to this topic that Sanne has left out.
>
> This discussion origins comes from Scott Marlow who's found a bug in a Hibernate 2LC test. The test itself can be found in [1].
>
> Basically, the test loads a couple of entities into the 2LC, and then calls evictAll in the cache that evicts all entities from the 2LC. Then, it expects the size of the cache to be 0 and entities to be gone from the 2LC.
>
> The 2LC has a peculiar way to deal with evictEntityRegions or evictAll, it doesn't just call clear() and hope for the best. Instead, it marks the cache as invalid, and next time someone queries the cache, it clears it in a separate transaction. This is done cluster wide too by sending messages to other node that the cache region is invalid.
>
> The problem here is that because it's done in a separate transaction, it doesn't have an impact on the on going transaction, so the assertions after evict fail. IOW, we implemented evict as a "eventually clear", where eventually all transactions will stop using the cached data, but any current transactions will still see data. We did this for performance reasons and to avoids the issues Sanne mentiones below.
>
> The problem is that we didn't envision this evict method to be used in such way. I always assumed this would be run outside of a transaction. IOW, you'd use this when you've updated the database separately (i.e. script, PL/SQL) and you want to clear the 2LC. I always thought that'd be something run via JMX or separate from another transaction.

That's indeed one of the use cases for a clear operation, but in this
case - which I'd classify as maintenance/manual control - I don't
think the user even expects a transaction to happen.
It could also happen by explicit invocation of user code, but I think
we're in the same use case.

> But, as Scott has found out, the TCK has some particular expectations of what happens after a evictEntityRegions or evictAll call within a transaction :(

To clarify, that's not just a TCK weirdness that we need to "hack
around" to pass it, it's an important use case, and as you say very
different than the "manual" control.
The scenario is that the user is invoking some batch update statement
on the database which potentially changes a set of entities, whose ids
might be unknown. To make a simple example, say the statement adds one
day to the age of all users and you want such a statement to run on
each midnight.
Such statements exist and are quite common (probably within the scope
of a better use case), and the point is that in such cases, and
evictAll(users) need to happen, but it should happen in the scope of
the transaction: isolation-wise, the current Session shouldn't be
loading any more stale data - and IF it loads from the DB fresh data,
this newly cached entries need to be confined to the current TX, and
the changes need to be made visible to other nodes and Sessions only
at commit: in other words, ideally you should roll back the eviction
from happening if the statement is rolled back.

Still considering it's a cache, with "ideally" I mean that it would
still be correct - although suboptimal - to evict all caches
immediately (out of transaction): worse case other sessions will have
slightly more cache misses, but they would still be correct.

So to summarise: Hibernate expects to be able to use it both in a
transactional mode and in a non-transactional mode. Steve Ebersole
pointed out that he would be open to discuss using two differently
named methods to make the expectation clear, as it currently just
relies on the fact that a transaction is active or not.

Dan has a great point about JCache, still we'd need to investigate
what the spec is saying about transactionality of such a method, it
would surprise me if it mandated transactionality?

I'll call myself out on how you're all thinking to implement it, as
long as the requirements described above are clear: I think they make
much sense for any general purpose cache and it's not just a weirdness
of the TCK designer or Hibernate. So the trouble for us is that
Infinispan no longer can operate on a cache mixing transactional and
non-transactional operations. That looks like something that needs to
be addressed, and considering the general use case for a clear I think
it would suffice to clearly mark clear as a non-transactional
operation.

BTW If I really need to clear it.. I don't care about it to be
consistently applied "all or nothing", it would be much more of a sane
decision to apply it at least where it worked fine. Being this an XA
operation seems nonsense.

That said, in the way a distributed key value store which aims at
bolting on eventual consistency & co, I would consider it damn
dangerous to embark in designing operations whose scope is not
restricted to a well defined set of keys. I tried to pitch my concern
at that level rather than jumping in the details of how to make the
TCK work. Conceptually, I think it's very tricky as you won't be able
to track any form of causality between this operation and subsequent
write operations. At some point you'll need to introduce tombstone
markers, and you need a to define what is your "keyset universe" to
implement such a method.

Cheers,
Sanne

>
> With this in mind, let me comment on the specifics below:
>
> [1] https://github.com/scottmarlow/wildfly/blob/b61dc1dddb1f70847d1d82206a366eedf6bc80db/testsuite/integration/basic/src/test/java/org/jboss/as/test/integration/jpa/secondlevelcache/SFSB2LC.java#L82
>
> On Oct 2, 2013, at 1:03 AM, Sanne Grinovero <sanne at infinispan.org> wrote:
>
> > I'd love to brainstorm about the clear() operation and what it means
> > on Infinispan.
> >
> > I'm not sure to what extent, but it seems that clear() is designed to
> > work in a TX, or even create an implicit transaction if needed, but
> > I'm not understanding how that can work.
> >
> > Obviously a clear() operation isn't listing all keys explicitly. Which
> > implies that it's undefined on which keys it's going to operate when
> > it's fired.. that seems like terribly wrong in a distributed key/value
> > store as we can't possibly freeze the global state and somehow define
> > a set of keys which are going to be affected, while an explicit
> > enumeration is needed to acquire the needed locks.
> >
> > It might give a nice safe feeling that, when invoking a clear()
> > operation in a transaction, I can still abort the transaction to make
> > it cancel the operation; that's the only good part I can think of: we
> > can cancel it.
> >
> > I don't think it has anything to do with consistency though? To make
> > sure you're effectively involving all replicas of all entries in a
> > consistent way, a lock would need to be acquired on each affected key,
> > which again implies a need to enumerate all keys, including the
> > unknown keys which might be hiding in a CacheStore: it's not enough to
> > broadcast the clear() operation to all nodes and have them simply wipe
> > their local state as that's never going to deal correctly
> > (consistently) with in-flight transactions working on different nodes
> > at different times (I guess enabling Total Order could help but you'd
> > need to make it mandatory).
> >
> > So let's step back a second and consider what is the use case for
> > clear() ? I suspect it's primarily a method needed during testing, or
> > maybe cleanup before a backup is restored (operations), maybe a
> > manually activated JMX operation to clear the cache in exceptional
> > cases.
>
> ^ Well, that's precisely the problem of second level cache evictEntityRegions or evictAll. In some scenarios, the clear is expected to affect the ongoing transaction too, see above use case from the TCK.
>
> > I don't think there would ever be a need for a clear() operation to
> > interact with other transactions, so I'd rather make it illegal to
> > invoke a clear() inside a transaction,
>
> -1, unless you have other ideas to deal with evict calls that I mentioned above that satisfy what the TCK is after.
>
> > or simply ignore the
> > transactional scope and have an immediate and distributed effect.
>
> That'd be more inline with what's needed above.
>
> > I'm likely missing something. What terrible consequences would this have?
>
> p.s. For the 2LC use case, I'm trying to see if the clear() could be done in the same transaction and see if that helps, but no luck yet as that causes other failures.
>
> >
> > Cheers,
> > Sanne
> > _______________________________________________
> > infinispan-dev mailing list
> > infinispan-dev at lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
> --
> Galder Zamarreño
> galder at redhat.com
> twitter.com/galderz
>
> Project Lead, Escalante
> http://escalante.io
>
> Engineer, Infinispan
> http://infinispan.org
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev