[hibernate-dev] [infinispan-dev] Infinispan tx, config and multithreading

Manik Surtani manik at jboss.org
Fri Aug 14 05:30:21 EDT 2009


On 14 Aug 2009, at 10:17, Łukasz Moreń wrote:

> Yes, but e.g. FSDirectory flushes changes whenever a file descriptor
> is created or updated, which can happen many times in one
> IndexWriter's life.
> In the Infinispan implementation, I want to commit changes only when
> the IndexWriter is closing, batching all modifications.
> If I switch to a transaction per descriptor modification, similar to
> how it's done in FSDirectory, it works well, but it is not efficient.

So what's expensive here?  Writing to Infinispan, or the indexing  
itself?  Correct me if I am wrong, I assume that the IndexWriter  
creates multiple threads, and each thread does: {
	// some indexing work
	// write these indexes to Infinispan
}

Is that correct?

>
> 2009/8/14 Sanne Grinovero <sanne.grinovero at gmail.com>
> I am not an expert on this part of Lucene, but it looks to me like
> the IndexWriter is the "driver/coordinator", and its decisions
> are affected by a pluggable MergeScheduler; they do stuff on the
> internal buffers of the IndexWriter (dequeue the pending segments to
> be written to the index), but it shouldn't matter what exactly they do
> as the internal state of these classes is unaffected by our
> transactions.
> They take some decisions about writing segments to the Directory and
> committing changes ("sync()"): as you implement this Directory you
> should only have to take care of this class. I don't think the
> MergeScheduler(s) are relevant: it just happens that the thread going
> to apply changes to the index might be a different one than the one
> pushing changes to the IndexWriter.
>
> In the Directory implementation you should use transactions to push
> state changes to the "underlying storage": as FSDirectory is playing
> with file descriptors and flushes, you do the same with Infinispan
> transactions.
>
> 2009/8/14 Łukasz Moreń <lukasz.moren at gmail.com>:
> > Yes, right, MergeSchedulers.
> >
> > 2009/8/14 Sanne Grinovero <sanne.grinovero at gmail.com>
> >>
> >> what are these "other" threads? Are you speaking about the
> >> MergeSchedulers?
> >>
> >> 2009/8/13 Łukasz Moreń <lukasz.moren at gmail.com>:
> >> > IndexWriter processes the index update, delegates some work to
> >> > other threads and waits until they finish. These "other" threads
> >> > work on data modified in the IndexWriter transaction. So I think
> >> > if I use a transaction per thread, the "others" would not see
> >> > data modified by IndexWriter until commit.
> >> >
> >> > 2009/8/13, Emmanuel Bernard <emmanuel at hibernate.org>:
> >> >> Ah, I thought it was using multiple threads because of your mass
> >> >> indexing. I did not know some threads were spawned specifically
> >> >> for the Infinispan directory.
> >> >>
> >> >> On 13 août 09, at 17:34, Sanne Grinovero wrote:
> >> >>
> >> >>> Hi Łukasz,
> >> >>> what is your usage of these threads? did you consider using one
> >> >>> transaction per thread?
> >> >>>
> >> >>> Sanne
> >> >>>
> >> >>> 2009/8/13 Łukasz Moreń <lukasz.moren at gmail.com>:
> >> >>>> Newly created threads were not associated with any transaction,
> >> >>>> so I suppose that was the problem. Sharing a transaction between
> >> >>>> threads seems to be a good solution.
> >> >>>> Thanks for the help!
> >> >>>>
> >> >>>> 2009/8/13, Jason T. Greene <jason.greene at redhat.com>:
> >> >>>>> Correct. Also, there could be read races as well, so if you
> >> >>>>> are going to share a tx between threads, I would use some
> >> >>>>> shared lock to guarantee that only one thread can use it at a
> >> >>>>> time. BTW, this means you have to properly suspend/resume the
> >> >>>>> TX via the TM API as well.
> >> >>>>>
> >> >>>>> Emmanuel Bernard wrote:
> >> >>>>>> Modifying a transaction means applying mutations (like SQL
> >> >>>>>> INSERT / UPDATE / DELETE) to the transactional resource?
> >> >>>>>>
> >> >>>>>> On 13 août 09, at 15:07, Jason T. Greene wrote:
> >> >>>>>>
> >> >>>>>>> When using transactions, the context is bound to the
> >> >>>>>>> transaction, and you can move a transaction between threads.
> >> >>>>>>> However, you should only be modifying a transaction with one
> >> >>>>>>> thread at a time.
> >> >>>>>>>
> >> >>>>>>> Emmanuel Bernard wrote:
> >> >>>>>>>> Could it be that you are not using the same transaction
> >> >>>>>>>> between different threads (i.e. you physically start
> >> >>>>>>>> different ones, or different "Infinispan contexts")?
> >> >>>>>>>> Infini guys, do you support transactional operations
> >> >>>>>>>> spanning several concurrent threads?
> >> >>>>>>>> On 13 août 09, at 14:04, Łukasz Moreń wrote:
> >> >>>>>>>>> I've tried with the JBoss AS transaction manager and
> >> >>>>>>>>> JBossStandaloneTM.
> >> >>>>>>>>> The result is the same in all cases: an error during merge.
> >> >>>>>>>>>
> >> >>>>>>>>> 2009/8/12, Emmanuel Bernard <emmanuel at hibernate.org>:
> >> >>>>>>>>>> OK, I understand better now.
> >> >>>>>>>>>> Do your tests in JBoss AS with its decent transaction
> >> >>>>>>>>>> manager (Infinispan should have a config for it).
> >> >>>>>>>>>> For unit testing, force the indexing process in Hibernate
> >> >>>>>>>>>> to use a single thread (I think it's possible; ask Sanne
> >> >>>>>>>>>> if you don't know how).
> >> >>>>>>>>>>
> >> >>>>>>>>>> Exposing some configuration to Infinispan makes sense. Can
> >> >>>>>>>>>> you start a thread explaining what is configurable and
> >> >>>>>>>>>> which options you think we should expose to hsearch users?
> >> >>>>>>>>>> Ideally I would like to offer one or two default config
> >> >>>>>>>>>> scenarios and allow falling back to a custom config.
> >> >>>>>>>>>>
> >> >>>>>>>>>> Emmanuel
> >> >>>>>>>>>>
> >> >>>>>>>>>> On 12 août 2009, at 11:58, Łukasz Moreń
> >> >>>>>>>>>> <lukasz.moren at gmail.com>
> >> >>>>>>>>>> wrote:
> >> >>>>>>>>>>
> >> >>>>>>>>>>> Sorry, but my wifi does not work well today. I will try
> >> >>>>>>>>>>> to explain it more clearly.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> I'm using the DummyTransactionManager available for
> >> >>>>>>>>>>> Infinispan.
> >> >>>>>>>>>>> It associates the transaction with the calling thread.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Steps to update the index:
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> 1. The index writer acquires the lock: the transaction
> >> >>>>>>>>>>> begins.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> 2. If necessary, the index writer delegates merge work to
> >> >>>>>>>>>>> new threads.
> >> >>>>>>>>>>> Those merge threads do not see the changes made since the
> >> >>>>>>>>>>> beginning of the transaction, and are looking for
> >> >>>>>>>>>>> segments which are not yet in the index.
> >> >>>>>>>>>>> The changes will be visible when step 3 is completed.
> >> >>>>>>>>>>> As a test, I tried committing the transaction when the
> >> >>>>>>>>>>> merge starts, and then everything worked well. But then I
> >> >>>>>>>>>>> need to start it again.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> 3. The index writer releases the lock: the transaction is
> >> >>>>>>>>>>> committed, and all changes made in this transaction are
> >> >>>>>>>>>>> visible to other threads.
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Maybe using some other transaction manager could help?
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> What about the Infinispan cache configuration? Should some
> >> >>>>>>>>>>> configuration mechanism be exposed to the user, or is
> >> >>>>>>>>>>> hardcoding one in InfinispanDirectoryProvider enough?
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> 2009/8/12 Emmanuel Bernard <emmanuel at hibernate.org>
> >> >>>>>>>>>>> why?
> >> >>>>>>>>>>> Emmanuel Bernard
> >> >>>>>>>>>>> Pending
> >> >>>>>>>>>>> you there?
> >> >>>>>>>>>>> Emmanuel Bernard
> >> >>>>>>>>>>> Pending
> >> >>>>>>>>>>> OK, please describe in detail what is going on. From what
> >> >>>>>>>>>>> you are describing, the tx cannot see all segments, which
> >> >>>>>>>>>>> looks like an Infinispan bug to me.
> >> >>>>>>>>>>> Pending
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> As a backup you can try without a transaction and see if
> >> >>>>>>>>>>> that works
> >> >>>>>>>>>>> Emmanuel Bernard
> >> >>>>>>>>>>> Pending
> >> >>>>>>>>>>> technically the lucene index should cope with that
> >> >>>>>>>>>>> Emmanuel Bernard
> >> >>>>>>>>>>> 11:16
> >> >>>>>>>>>>> but I like this approach less
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>>
> >> >>>>>>>>>>> Let's try and chat by email if I'm not online; I need to
> >> >>>>>>>>>>> run some errands today.
> >> >>>>>>>>>>>
> >> >>>>>>>> _______________________________________________
> >> >>>>>>>> infinispan-dev mailing list
> >> >>>>>>>> infinispan-dev at lists.jboss.org
> >> >>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>> --
> >> >>>>>>> Jason T. Greene
> >> >>>>>>> JBoss, a division of Red Hat
> >> >>>>>>
> >> >>>>>
> >> >>>>>
> >> >>>>> --
> >> >>>>> Jason T. Greene
> >> >>>>> JBoss, a division of Red Hat
> >> >>>>>
> >> >>>>
> >> >>>
> >> >>> _______________________________________________
> >> >>> hibernate-dev mailing list
> >> >>> hibernate-dev at lists.jboss.org
> >> >>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
> >> >>
> >> >>
> >> >
> >
> >
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev at lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
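
For reference, the shared-transaction approach Jason describes in the quoted
thread above (one transaction moved between threads, guarded by a shared lock,
with suspend/resume on each hand-over) might look roughly like the sketch
below. This is only an illustration under assumptions: Tx and TxManager are
hypothetical stand-ins written for this example, not Infinispan or JTA
classes. They only mimic the thread association of a JTA TransactionManager,
whose real suspend() and resume(Transaction) calls would take their place.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Sketch only: Tx and TxManager are hypothetical stand-ins, not real
// Infinispan or JTA types. Writes are buffered per transaction and only
// become globally visible on commit.
final class SharedTxSketch {

    static final class Tx {
        final Map<String, String> pending = new HashMap<>();
    }

    static final class TxManager {
        private final ThreadLocal<Tx> current = new ThreadLocal<>();
        private final Map<String, String> committed = new ConcurrentHashMap<>();

        void begin() { attach(new Tx()); }

        // Detaches the transaction from the calling thread and returns it.
        Tx suspend() { Tx tx = current.get(); current.remove(); return tx; }

        // Attaches an existing transaction to the calling thread.
        void resume(Tx tx) { attach(tx); }

        void put(String key, String value) { requireTx().pending.put(key, value); }

        // Reads see the caller's own uncommitted writes, then committed state.
        String get(String key) {
            Tx tx = current.get();
            if (tx != null && tx.pending.containsKey(key)) return tx.pending.get(key);
            return committed.get(key);
        }

        void commit() { committed.putAll(requireTx().pending); current.remove(); }

        private void attach(Tx tx) {
            if (current.get() != null) throw new IllegalStateException("tx already active");
            current.set(tx);
        }

        private Tx requireTx() {
            Tx tx = current.get();
            if (tx == null) throw new IllegalStateException("no tx on this thread");
            return tx;
        }
    }

    // Runs work with the shared tx attached to the calling thread. The lock
    // ensures only one thread uses the transaction at a time.
    static void withSharedTx(TxManager tm, Tx tx, ReentrantLock lock, Runnable work) {
        lock.lock();
        try {
            tm.resume(tx);
            try { work.run(); } finally { tm.suspend(); }
        } finally {
            lock.unlock();
        }
    }

    static String demo() throws InterruptedException {
        TxManager tm = new TxManager();
        ReentrantLock lock = new ReentrantLock();

        tm.begin();                                   // "IndexWriter" thread
        tm.put("segments_1", "written by IndexWriter thread");
        Tx tx = tm.suspend();                         // hand the tx over

        String[] seen = new String[2];
        Thread merge = new Thread(() -> {
            seen[0] = tm.get("segments_1");           // no tx attached yet
            withSharedTx(tm, tx, lock, () -> {
                seen[1] = tm.get("segments_1");       // same tx: write is visible
                tm.put("segments_2", "written by merge thread");
            });
        });
        merge.start();
        merge.join();

        lock.lock();
        try { tm.resume(tx); tm.commit(); } finally { lock.unlock(); }
        return seen[0] + "|" + seen[1] + "|" + tm.get("segments_2");
    }
}
```

In this sketch the merge thread sees nothing before the transaction is resumed
on it, which matches the behaviour Łukasz reported with a thread-bound
transaction manager, and sees the IndexWriter's uncommitted write once the
same transaction is attached.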

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org



