[infinispan-dev] Major version cleaning

Radim Vansa rvansa at redhat.com
Tue Feb 21 08:20:00 EST 2017


On 02/21/2017 09:39 AM, Sanne Grinovero wrote:
> On 21 February 2017 at 07:37, Dan Berindei <dan.berindei at gmail.com> wrote:
>> On Mon, Feb 20, 2017 at 8:02 PM, Sanne Grinovero <sanne at infinispan.org> wrote:
>>> -1 to batch removal
>>>
>>> Frankly I've always been extremely negative about the fact that
>>> batches are built on top of transactions. It's easy to find several
>>> iterations of rants of mine on this mailing list, especially fierce
>>> debates with Mircea. So I'd welcome a separation of these features.
>>>
>>> Yet, removing batching seems very wrong. I disagree that people should
>>> use Transactions instead; for many use cases it's overkill, and for
>>> others - and this is the main pain point I always had with the current
>>> design - it might make sense to have a batch (or more than one) within
>>> a transaction.
>>> I have had practical problems with needing to "flush" a single batch
>>> within a transaction as the size of the combined elements was getting
>>> too large. That doesn't imply at all that I'm ready to commit.
>>>
>> WDYM by "flush" here? I have a feeling this is nothing like our
>> batching ever was...
> I'm just listing points about why incorporating the batch concept with
> transactions is not practical.
>
> I mean that when one has to write very large amounts of data, you need to
> break it up in smaller batches; *essentially* to flush the first batch before
> starting the second one.
> So that is unrelated with atomicity and consistency, as in practice one has
> to split one business operation into multiple batches.

You haven't explained what "flush" means. Since you separate that from 
atomicity/consistency, I assume that batches on non-tx cache are just 
ordered putOrRemoveAll operations, immediately visible on flush without 
any atomicity. If you want batches in transactions, you probably mean 
that the changes are not visible until tx commits. So you are worried 
that all data modified within this transaction won't fit into single 
node memory? Or is there more to that? What's a "nested batch", then?
I could see some benefits in removing repeatable-reads cached values.

>>> @Pedro: the fact that some code is broken today is not relevant, when
>>> there's need for such features. Like you suggest, it had bad premises
>>> (build it on TX) so we should address that, but not throw it all out.
>>>
>> Infinispan never created nested batches:
> I know. I'm not describing the current implementation, I'm describing use
> cases which are not addressed, or in which Infinispan is clumsy to use,
> to suggest improvements.
> And I'm not asking to have nested batches. Again, just pointing
> out practical problems with the current batch design.
>
> It should be possible to run an efficient batch of operations within a
> transaction.
> Or a sequence of batches, all within the same transaction.
> N.B. you could see that as a "nested batch" in the current twisted idea that
> a batch is a transaction - which is what I'm arguing against.
>
>
>> calling startBatch() when a
>> batch was already associated with the current thread just incremented
>> a refcount, and only the final endBatch() did any work. OTOH running a
>> batch within a transaction always worked very much like suspending the
>> current transaction, starting a new one, and committing it on
>> endBatch(). So the only real difference between batching and using
>> DummyTransactionManager is that batching is limited to one cache's
>> operations, while DummyTransactionManager supports multiple resources.
>>
>>
>>> @Bela is making spot-on objections based on use cases, which need to
>>> be addressed in some way. The "atomical" operations still don't work
>>> right[1] in Infinispan and that's a big usability problem.
>>>
>> Batching never was about sending updates asynchronously. We have
>> putAllAsync() for that, which doesn't need transactions, and it's even
>> slightly more efficient without transactions.
> If you think this way, it sounds like you give no value to the application
> performance: people need more than a couple put operations.

Removes?

I am really asking out of curiosity, I hope these questions don't sound 
ironically.

R.


>
>> And atomical operations have bugs, yes, but I'm not sure how
>> implementing a new kind of batching that isn't based on transactions
>> would help with that.
>>
>>
>>> +1 to remove async TX
>>>
>>> I actually like the concept but the API should be different.. might as
>>> well remove this for now.
>>>
>>> +1 to remove the Tree module
>>>
>>> I personally never used it as you all advised against, yet it's been
>>> often requested by users; sometimes explicitly and others not so
>>> explicit, yet people often have need which would be solved by a good
>>> Tree module.
>>> I understand the reasons you all want to remove it: it's buggy. But I
>>> believe the real reason is that it should have been built on top of a
>>> proper batch API, and using real MVCC. In that case it wouldn't have
>>> been buggy, nor too hard to maintain, and might have attracted way
>>> more interest as well.
>> I think the fact that we haven't been able to build a "proper" batch
>> API using real MVCC yet is a proof to the contrary...
>>
>>
>>> I'd remove it as a temporary measure: delete the bad stuff, but
>>> hopefully it could be reintroduced built on better principles in some
>>> future?
>>>
>>> Thanks,
>>> Sanne
>>>
>>> [1] - "right" as in user expectations and actual practical use, which
>>> is currently different than in the twisted definition of "right" that
>>> the team has been using as an excuse ;-) I'll also proof that it
>>> doesn't work right according to your own twisted specs, when I find
>>> the time to finish some tests..
>>>
>>>
>>> On 20 February 2017 at 16:48, Pedro Ruivo <pedro at infinispan.org> wrote:
>>>>
>>>> On 20-02-2017 16:12, Bela Ban wrote:
>>>>>
>>>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>>>> Hi guys, we discussed about this a little bit in the past and this
>>>>>> morning on IRC. Here are some proposed removals:
>>>>>>
>>>>>> - Remove the async transactional modes, as they are quite pointless
>>>>>> - Remove batching: users should use transactions
>>>>> How do you make a bunch of modifications and send them asynchronously if
>>>>> both batching *and* async TXs are getting removed?
>>>> We are not removing features, we are removing broken code.
>>>>
>>>> Batching is using transactions and async transactions doesn't make sense
>>>> since infinispan has to report to TransactionManager.
>>>>
>>>> Our current asyn-tx is broken in a way that is starts to commit and
>>>> reports OK to the transaction manager. if you have a topology change or
>>>> a conflict, you will end with inconsistent data.
>>>>
>>>> So, why do we keeping this code around?
>>>>
>>>> you can still move a transaction commit to another thread if you don't
>>>> wanna wait for its commit:
>>>>
>>>> tm.begin()
>>>> doWork()
>>>> tx = tm.suspend()
>>>> new_thread {
>>>>     tm.resume(tx)
>>>>     tm.commit()
>>>> }
>>>>
>>>> The best thing I can think of is to keep the batching API and
>>>> re-implement it to provide an endBatchAsync() that will do the above.
>>>>
>>>>> So if someone wants to apply a unit of work *atomically* (either all
>>>>> modifications within that unit of work are applied, or none), what
>>>>> alternatives exist?
>>>>>
>>>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>>>
>>>>>> Please cast your votes
>>>>>>
>>>>>> Tristan
>>>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev at lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


-- 
Radim Vansa <rvansa at redhat.com>
JBoss Performance Team



More information about the infinispan-dev mailing list