[infinispan-dev] ISPN-3557: interactions between a clear() operation and a Transaction

Thu Oct 3 06:57:56 EDT 2013

On 3 October 2013 10:29, Dan Berindei <dan.berindei at gmail.com> wrote:
>
> On Wed, Oct 2, 2013 at 12:48 PM, Pedro Ruivo <pedro at infinispan.org> wrote:
>>
>>
>>
>> On 10/02/2013 12:03 AM, Sanne Grinovero wrote:
>> > I'd love to brainstorm about the clear() operation and what it means
>> > on Infinispan.
>> >
>> > I'm not sure to what extent, but it seems that clear() is designed to
>> > work in a TX, or even create an implicit transaction if needed, but
>> > I'm not understanding how that can work.
>
>
> I think it's pretty straight-forward to define the semantics of operations
> after clear(): they should assume that every key in the cache was removed.
>
> True, like Pedro mentioned, the current implementation allows subsequent
> operations in the same tx to see keys inserted by other txs. But I don't
> think fixing that would be harder than ensuring that if a tx inserts key k
> owned by nodes A and B, clear() from outside the tx either deletes k on A
> and B or doesn't delete k on either node.
>
>
>>
>> >
>> > Obviously a clear() operation isn't listing all keys explicitly.
>>
>> (offtopic) like entrySet(), keySet(), values() and iterator() when
>> executed in a transactional context. however, I think that this
>> operations are more difficult to handle, because they have to return
>> consistent data...
>
>
> I agree, defining how entrySet() and the others should work in a tx seems
> harder than defining clear().

I agree, those other nasty beasts are next on my list: see below..
But we're not arguing about which method deserves the gold medal for
complex design ;-)

> Will, I'm guessing your fix for ISPN-3560 still allows entrySet() after
> clear() to see entries inserted by another tx in the meantime?
>
>>
>> > Which
>> > implies that it's undefined on which keys it's going to operate when
>> > it's fired.. that seems like terribly wrong in a distributed key/value
>> > store as we can't possibly freeze the global state and somehow define
>> > a set of keys which are going to be affected, while an explicit
>> > enumeration is needed to acquire the needed locks.
>>
>> you may not need to explicit acquire the lock for the affected key. you
>> can use a global lock that is share acquired for write transactions
>> without clear operation and exclusively acquire for transactions with
>> clear operation (exactly what I use for Total Order). also, the clear
>> transactions (i.e. the transactions with clear operation) can also only
>> acquired the global lock.
>
>
> +1 for the global lock. It may slow things down a bit in the general case,
> because every tx will have to acquire it in shared mode, but it will
> guarantee consistency for clear().

I had thought of that too, but it will slow down all operations, at
which benefit?
To have a nice transactional clear() operation which guarantees consistency?

The goal of this thread is to challenge if that makes sense for real use cases,
it seems in fact we would highly prefer a non transactional version,
and considering that we can't run a single Cache in both modes anymore
that's a problem.

Side note on the implementation complexity:
even the AsyncCacheStore decorator code is made extremely more tricky
just to handle clear() appropriately, to keep it in the correct order
of execution
compared to the other operations, while still allowing
"coaleshing" of other operations, which essentially implies we allow
reordering of operations as long as they affect independent keys.. where clear()
is of course a jolly which requires it's own execution path, its own global lock
and generally makes it far more complex.

>
> It still won't help entrySet() and its brethren...
>
>>
>> >
>> > It might give a nice safe feeling that, when invoking a clear()
>> > operation in a transaction, I can still abort the transaction to make
>> > it cancel the operation; that's the only good part I can think of: we
>> > can cancel it.
>>
>> yep
>
>
> Wouldn't it be harder to integrate it with Total Order if it didn't have a
> tx?
>
> But most of all, I think we have to keep it because it's a part of the Map
> interface and a part of JCache's Cache interface.
>
>>
>>
>> >
>> > I don't think it has anything to do with consistency though? To make
>> > sure you're effectively involving all replicas of all entries in a
>> > consistent way, a lock would need to be acquired on each affected key,
>> > which again implies a need to enumerate all keys, including the
>> > unknown keys which might be hiding in a CacheStore: it's not enough to
>> > broadcast the clear() operation to all nodes and have them simply wipe
>> > their local state as that's never going to deal correctly
>> > (consistently) with in-flight transactions working on different nodes
>> > at different times (I guess enabling Total Order could help but you'd
>> > need to make it mandatory).
>> >
>>
>> see the comment above about the locks... may not be the most efficient,
>> but handles the problem.
>
>
> Note that we need to iterate the keys anyway, in order to fire the
> CacheEntryRemoved listeners. Even if we optimized it skip the iteration when
> there are no registered listeners, it would still have to handle the general
> case.
>
>>
>> > So let's step back a second and consider what is the use case for
>> > clear() ? I suspect it's primarily a method needed during testing, or
>> > maybe cleanup before a backup is restored (operations), maybe a
>> > manually activated JMX operation to clear the cache in exceptional
>> > cases.
>>
>> I'm not aware of the uses cases of a clear() operation, but use it as
>> JMX operation can be a problem in transactional caches (depending the
>> semantic you have in mind).
>>
>> If it is going to clear all the caches (i.e. from all replicas/nodes),
>> then we have to find a way to set an order with the current transactions
>> or we will have something some nodes delivering a transaction (say tx1)
>> before the clear() and other nodes delivering tx1 after the clear(),
>> leading to an inconsistent state.
>>
>> (offtopic) I'm wondering if we have that problem in the current
>> implementation if tx1 is only insert new keys... in that case, we have
>> no lock to serialize an order between the clear() and tx1...
>
>
> I'm pretty sure we do have this problem. We definitely need to add some
> tests for clear concurrent with other write operations.
>
> In my view, as long as we're going to expose a clear() operation in some
> way, it's going to be easiest to include it in a tx in transactional caches.
> Exposing it in some other way is not going to make it any more consistent,
> although prohibiting other operations in the same tx may make the code a bit
> simpler.
>
>>
>>
>> >
>> > I don't think there would ever be a need for a clear() operation to
>> > interact with other transactions, so I'd rather make it illegal to
>> > invoke a clear() inside a transaction, or simply ignore the
>> > transactional scope and have an immediate and distributed effect.
>
>
> We can't control whether other transactions are invoked concurrently with
> clear(), so we can't make that illegal... If you mean interaction with other
> operations in the same transaction, I agree that it's not absolutely
> necessary to allow them, but I think most of that work has already been
> done.
>
>>
>> I think the second solution can make inconsistency data ("or simply
>> ignore the transactional scope and have an immediate and distributed
>> effect")
>
>
> Maybe that's the idea? Sanne, do you mean that if clear() were exposed via a
> different API, it wouldn't have to provide the same consistency guarantees
> as regular operations?

Yes that's the main point that we need to clarify.
Note that this discussion started because of a problem we're having in
the integration
of Infinispan as Hibernate 2nd level cache, which is a quite mature
contract with
some very clear ideas of how a cache needs to behave.

After more discussions yesterday, I think I'm understanding (Galder
will likely know more
but is currently away) that ideally a cache should be able to provide
both a transactional
clear() operation and an immediately-affecting one.
The only need for the transactional one is to be able to undo the
operation in case the
user's transaction is rolled back: we don't even support this case
today since it's suspending
the transaction to run the clear() operation in an independent transaction.

But the transactional clear() operation is far less important than
having a clear "evict all now"
operation. That's one of the basic operations a good mannered Cache
should provide:
remember that as a cache it's always a valid answer to return "null",
it's just suboptimal.
Returning the wrong answer instead, can have critical consequences, so
it's expected to
provide a way for the admin (JMX) or driving application to
immediately wipe the state;
I would argue that even a glocal lock's timing implications are
undesirable in such a case.

Focusing on the identify of Infinispan as a Key/Value store, I'd
consider it almost mandatory
for the client to know - to explicitly list - the keys it's interested
into, and be completely oblivious
to the entries which you're unable to list. For example, think of a
cache shared by multiple tenants
or multiple apps, which use keys which for some reason can't return
true to an equality check
(for example because they're coming from different classloaders).
In such a perspective, any operation which affects (or enumerates)
keys is a bad practice and
should imho be banned from the user API, unless for administration or
monitoring purposes.

I think this is quite clearly expressed already by all the warning in
javadoc on methods like
  org.infinispan.Cache.keySet()
  org.infinispan.Cache.size()
  ...
They all list a warning about needing to be used only for debug
purposes: I don't think we need
to waste time into making these transitionally consistent, as people
shouldn't be using them.

If a transactional clear() operation is possible without overhead,
then the 2nd level cache code
would benefit from it, but it's a borderline decision and its code
could be refactored to not need this,
at least that's my understanding so far and I guess we need Galder for
that to be clarified.

Sanne

>
>>
>>
>> >
>> > I'm likely missing something. What terrible consequences would this
>> > have?
>> >
>> > Cheers,
>> > Sanne
>> > _______________________________________________
>> > infinispan-dev mailing list
>> > infinispan-dev at lists.jboss.org
>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> >
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev