[infinispan-dev] PutForExternalRead and autoCommit

Galder Zamarreño galder at redhat.com
Mon Nov 21 04:34:43 EST 2011


On Nov 17, 2011, at 4:39 PM, Slorg1 wrote:

> Hi all,
> 
> See comment below,
> 
> On Thu, Nov 17, 2011 at 10:20, Galder Zamarreño <galder at redhat.com> wrote:
>> 
>> On Nov 17, 2011, at 2:54 PM, Manik Surtani wrote:
>> 
>>> On 17 Nov 2011, at 09:30, Galder Zamarreño wrote:
>>> 
>>>> Hi,
>>>> 
>>>> Forcing caches to be either transactional or non-transactional causes some issues with operations such as putForExternalRead under the default configuration options.
>>>> 
>>>> Assuming we have a transactional cache, if autoCommit is on (default), putForExternalRead will:
>>>> 1. Suspend the ongoing transaction
>>>> 2. Create a brand new transaction due to the implicit transaction creation logic in autoCommit.
>>>> 
>>>> This is not good.
>>> 
>>> What's not good about this?  1 is by design and is correct behaviour.  2 should not affect anything, since the new tx is completed at the end of the PFER invocation.
>> 
>> 2 does not affect anything but seems wasteful to me. Why start a transaction when I don't need one?
> 
> I do not understand the need for #1 to happen given that a running
> transaction already exists. In the case of a replicated cache, that
> transaction exists remotely on all other nodes.
> Thus, why not apply the put under the same scope?

It's an optimisation: since we're reading from the external source of data, suspending the transaction avoids the locking issues that would otherwise arise when different nodes concurrently try to read the same piece of data into the cache.

Read the putForExternalRead javadoc to find out more.
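
For illustration, what PFER does under the hood is roughly the following (just a sketch of the semantics, not the actual implementation; the Flag names are from org.infinispan.context.Flag, and cache/key/person are placeholders):

    import javax.transaction.Transaction;
    import javax.transaction.TransactionManager;
    import org.infinispan.AdvancedCache;
    import org.infinispan.context.Flag;

    // Roughly what cache.putForExternalRead(key, person) does:
    static void pferSketch(AdvancedCache<String, Person> cache,
                           String key, Person person) throws Exception {
       TransactionManager tm = cache.getTransactionManager();
       Transaction ongoing = tm.suspend();    // 1. suspend the ongoing tx
       try {
          // 2. best-effort put: never waits for locks, failures are swallowed
          cache.withFlags(Flag.FAIL_SILENTLY,
                          Flag.ZERO_LOCK_ACQUISITION_TIMEOUT)
               .put(key, person);
       } finally {
          if (ongoing != null)
             tm.resume(ongoing);              // 3. restore the caller's tx
       }
    }

So even if the put fails or can't acquire a lock, the caller's transaction is untouched.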

> It also raises the question about #2: the object being added to the
> cache belongs to a specific scope/transaction and is somewhat isolated
> from the rest (transaction isolation). If putting it into the cache
> causes it to be "committed" every time, does it not challenge the
> isolation and make the whole transactionality moot?

Again, we're reading from the source of data and putting it in the cache. 

The source of data is the database, which contains committed data, so no problem here.
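
In other words, the intended usage is the classic cache-aside read; something like this (dao.loadPerson() is just a made-up DAO call standing in for your own data access code):

    // Cache-aside read: try the cache first; on a miss, load the
    // committed row from the database and cache it with PFER.
    Person p = cache.get(id);
    if (p == null) {
       p = dao.loadPerson(id);              // reads committed data
       cache.putForExternalRead(id, p);     // best-effort cache population
    }

Since the value comes straight from committed database state, storing it outside the caller's transaction can't leak anything uncommitted.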

> Also, as pointed out by Galder, it is fairly costly: the new
> transaction being started and committed causes extra messages for
> replication AND new transactions to be created and committed in the
> DB. In my case it makes the whole process (non-replicated vs
> replicated) about 130% slower per application transaction (many more
> "auto transactions")…

It'd be interesting to know whether this drop in performance is purely down to this, or whether there are more factors involved.
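
For reference, the setup under discussion looks something like this with the new fluent configuration API (a sketch; check the 5.1 configuration docs for the exact attribute names):

    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.transaction.TransactionMode;

    // A transactional cache with autoCommit left at its default (true):
    // every non-transactional invocation gets an implicit transaction.
    Configuration cfg = new ConfigurationBuilder()
       .transaction()
          .transactionMode(TransactionMode.TRANSACTIONAL)
          .autoCommit(true)
       .build();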

> Finally, in case of failure to replicate or to put into the cache, it
> is my belief that the choice of handling the error/exception should be
> left to the application. E.g. if the replication fails (synchronous
> replication), or the put into the cache fails and no rollback is
> called on the application transaction, who can tell the state of the
> system? When testing, I had many cases where the replication failed to
> complete but my main transaction completed. It means objects in
> different states may end up in the system...
> 
> Let me know if I am making false assumptions here or if I am not clear.
> 
> Thank you,
> 
> Regards,
> 
> Slorg1
> 
> -- 
> Please consider the environment - do you really need to print this email ?
> 

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



