[infinispan-issues] [JBoss JIRA] (ISPN-2156) Benchmark and blog about a fast method of loading data into Infinispan

Manik Surtani (JIRA) jira-events at lists.jboss.org
Thu Jul 19 06:06:07 EDT 2012


    [ https://issues.jboss.org/browse/ISPN-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12706723#comment-12706723 ] 

Manik Surtani commented on ISPN-2156:
-------------------------------------

This should be on our docs as well.
                
> Benchmark and blog about a fast method of loading data into Infinispan  
> ------------------------------------------------------------------------
>
>                 Key: ISPN-2156
>                 URL: https://issues.jboss.org/browse/ISPN-2156
>             Project: Infinispan
>          Issue Type: Task
>            Reporter: Mircea Markus
>            Assignee: Vladimir Blagojevic
>              Labels: docs
>             Fix For: 5.2.0.FINAL
>
>
> To summarise:
> When batch-loading a set of data into a cluster that uses distributed caches, inserting batches of keys that map to the same node should significantly increase performance.
> Why?
> During the prepare phase, each node receives the 
> complete list of modifications in that transaction, not only the 
> modifications pertaining to it.
> E.g. say we have the following key->node mapping:
> {code}
> k1 -> A
> k2 -> B
> k3 -> C
> {code}
> Where k1, k2 and k3 are keys; A, B and C are nodes.
> If Tx1 writes (k1, k2, k3), then during the prepare phase A, B and C will each receive 
> the same package containing all the modifications, namely (k1, 
> k2, k3). There are several reasons for this (apparently) 
> unoptimized approach: the prepare is serialized only once, and recovery 
> information is handled better.
> Now, if you group transactions/batches based on key distribution, the amount of redundant traffic is significantly reduced, and that translates into better performance, especially when the dataset 
> you're inserting is large.
> This JIRA is basically about benchmarking and blogging about this approach.
> An entry in the FAQ would be helpful as well.
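
The grouping idea in the description can be illustrated with a minimal, plain-Java sketch. It does not use the Infinispan API: the class name `KeyGroupingSketch`, its methods, and the `ownerOf` function are hypothetical stand-ins, and in a real cluster the key-to-node mapping would come from the cache's own distribution information. The sketch also assumes a single owner per key and counts only modifications delivered during prepare, so it is a rough model of the redundant traffic, not a benchmark.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class KeyGroupingSketch {

    // Groups keys by owning node. ownerOf is a hypothetical stand-in for
    // the cache's key->node mapping (assumption: one owner per key).
    static <K> Map<String, List<K>> groupByOwner(List<K> keys, Function<K, String> ownerOf) {
        Map<String, List<K>> batches = new LinkedHashMap<>();
        for (K key : keys) {
            batches.computeIfAbsent(ownerOf.apply(key), node -> new ArrayList<>()).add(key);
        }
        return batches;
    }

    // One transaction for all keys: every involved node receives the
    // complete modification list, so deliveries = nodes * keys.
    static <K> int deliveredInOneTx(List<K> keys, Function<K, String> ownerOf) {
        int involvedNodes = groupByOwner(keys, ownerOf).size();
        return involvedNodes * keys.size();
    }

    // One transaction per owner: each batch involves a single node,
    // which receives only the modifications for its own keys.
    static <K> int deliveredGrouped(List<K> keys, Function<K, String> ownerOf) {
        int total = 0;
        for (List<K> batch : groupByOwner(keys, ownerOf).values()) {
            total += batch.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // The key->node mapping from the issue description.
        Map<String, String> mapping = Map.of("k1", "A", "k2", "B", "k3", "C");
        List<String> keys = List.of("k1", "k2", "k3");

        System.out.println(deliveredInOneTx(keys, mapping::get)); // prints 9
        System.out.println(deliveredGrouped(keys, mapping::get)); // prints 3
    }
}
```

Under these assumptions, the k1/k2/k3 example delivers 9 modifications when written in one transaction and 3 when split into per-node batches; the gap widens as more keys and nodes take part in each transaction.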
