[infinispan-dev] Infinispan S3 Cluster + Store Success! (+ serious performance problems)

Manik Surtani manik at jboss.org
Thu Dec 3 13:02:56 EST 2009


On 3 Dec 2009, at 15:40, Adrian Cole wrote:

> Philippe,
> 
> Very good catch on rehash.  This seems to imply a fault in the
> bucket-based system, or my binding to it.
> 
> Guys,
> 
> Am I missing something, or are all bucket-based cache-stores
> susceptible to loading all data on rehash?

I need to have a look at this... 

> 
> Cheers,
> -Adrian
> 
> On Thu, Dec 3, 2009 at 12:20 AM, philippe van dyck <pvdyck at gmail.com> wrote:
>> Adrian, Elias,
>> I think I read somewhere on the list that the upgrade to the blobstore
>> version of jclouds has been postponed to the next version of Infinispan
>> (4.1?). Do you see any added value, for the S3 cache store, in upgrading
>> ASAP?
>> I'll catch you later on #jclouds to send you the 'asynchronous'
>> modifications (but we are in different TZs).
>> Elias knows about a previous (JBoss Cache?) async version of the cache
>> store interface, but I don't see any trace of it in the Infinispan
>> release (was JBoss Cache already using Futures?).
>> And you are right, Elias: HTTP performance problems are, in part, solved
>> with concurrency, connection pools and the like, usually managed by the
>> HTTP client itself (like httpnio).
>> Regarding the S3 'bucket' full of cache store 'buckets', there is indeed
>> a naming problem.
>> But I am quite sure this one returns all the content of the cache:
>> 
>> public Set<Bucket> values() throws S3ConnectionException {
>>    Set<Bucket> buckets = new HashSet<Bucket>();
>>    for (Map.Entry<String, InputStream> entry : map.entrySet()) {
>>       buckets.add(bucketFromStream(entry.getKey(), entry.getValue()));
>>    }
>>    return buckets;
>> }
>> And values() is called by loadAllLockSafe(), itself called by
>> performRehash(), itself called when a cache leaves a distributed
>> cluster. But when you take a closer look at the way the results are
>> used... there are a lot of optimizations to be done! One solution, in
>> performRehash, is to iterate only over the keys and fetch each value
>> later...
>> Anyway, I hope this kind of fine tuning will appear in the 4.1 version ;-)
>> Cheers,
>> philippe
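The keys-only optimization Philippe suggests could be sketched roughly as follows. This is only an illustration: `keysToMigrate` and `belongsElsewhere` are hypothetical names, and the modulo check merely stands in for Infinispan's consistent hash; none of this is the real cache store API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: instead of materialising every value via values(),
// a rehash could walk the key set and fetch only the entries that must
// move to another node.
public class LazyRehashSketch {

   static Set<String> keysToMigrate(Set<String> allKeys, int nodeId, int numNodes) {
      Set<String> toMove = new HashSet<String>();
      for (String key : allKeys) {
         if (belongsElsewhere(key, nodeId, numNodes)) {
            toMove.add(key);   // remember the key; fetch the value lazily later
         }
      }
      return toMove;
   }

   // Stand-in for the consistent-hash ownership check (not the real one).
   static boolean belongsElsewhere(String key, int nodeId, int numNodes) {
      return Math.abs(key.hashCode()) % numNodes != nodeId;
   }

   public static void main(String[] args) {
      Map<String, byte[]> store = new HashMap<String, byte[]>();
      store.put("a", new byte[]{1});
      store.put("b", new byte[]{2});
      // Only the keys are consulted up front; values would stay on S3 until needed.
      Set<String> moving = keysToMigrate(store.keySet(), 0, 2);
      System.out.println("keys to migrate: " + moving);
   }
}
```

The point is that a rehash then costs one S3 LIST plus one GET per migrating entry, instead of one GET per stored entry.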
>> 
>> On 2 Dec 2009, at 19:35, Adrian Cole wrote:
>> 
>> Hi, Philippe.
>> 
>> Firstly, thanks for taking the effort on this. If you have some time, I
>> would love for you to integrate the trunk version of jclouds, as we
>> have significantly better logging. I'd also love to see your patch on
>> the current effort. Seems you know what you are doing :)
>> 
>> Second, there is an unfortunate overloading of the term bucket. Bucket
>> exists as an Infinispan concept, meaning all objects that share the
>> same hashcode. That said, I believe it is the loadOnStartup option that
>> uses values(), not general-purpose join/leave.
>> 
>> Third, this was indeed done with the synchronous cache store API. I'm
>> not sure if an async one is present, but you can see we could easily
>> integrate with one.
>> 
>> I'd be glad to chat more on #jclouds on freenode, if you have some time.
>> 
>> Cheers and excellent work.
>> -Adrian
>> 
>> On Wed, Dec 2, 2009 at 10:00 AM, philippe van dyck <pvdyck at gmail.com> wrote:
>> 
>> Hi Adrian and thanks for your answer.
>> 
>> I dug into the source of the Infinispan S3 cache store and, no offense,
>> but it looks more like a proof of concept than something I could use in
>> production.
>> 
>> First of all, in order to achieve a minimum of efficiency, we need to
>> use concurrency in this specific cache store. It is using jclouds' Map
>> interface and not the asynchronous Future<> one... well, you know that,
>> you wrote a big part of jclouds ;-)
>> 
>> The CacheStore interface does not offer an asynchronous solution, but a
>> workaround is available. I just modified the S3 cache store: every
>> write operation is now asynchronous, and the resulting Future is stored
>> in a ThreadLocal queue. After each transaction
>> (S3CacheStore.applyModifications) I empty the queue and wait for each
>> Future to finish, in order to catch errors (and allow rollbacks... or
>> else the whole transaction mechanism is useless).
>> 
>> The drawback is obvious: if you don't use a transaction manager to
>> update the cache, exceptions will die silently (but come on, nobody
>> does that ;-).
>> 
>> The solution works, and I updated 1000 entries in... 20 seconds (for me
>> that means 'mission accomplished').
>> 
>> Secondly, there are still a couple of very strange things happening in
>> the S3 cache store, but the most intriguing one is definitely
>> JCloudsBucket's public Set<Bucket> values().
>> 
>> Is it really serious? Must we be able to load *ALL* of our data in
>> order to rehash on some cluster join/leave operation? I plan to store a
>> couple of tens of GB on S3, so... well, you see the problem. It seems
>> especially problematic since I was planning to use Amazon EC2's
>> autoscaling feature to add Infinispan instances to my 'previously'
>> working cluster.
>> 
>> I am quite sure I misunderstood something, or maybe all the rest.
>> 
>> Any help most welcome.
>> 
>> Philippe
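The ThreadLocal-queue workaround Philippe describes might be sketched like this. All names here are illustrative (this is not the actual S3CacheStore patch), and a ConcurrentHashMap stands in for S3:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the workaround: each write is submitted asynchronously, its
// Future is parked in a ThreadLocal queue, and the queue is drained after
// applyModifications() so a failed write can still abort the transaction.
public class AsyncWriteQueueSketch {

   private static final ThreadLocal<Deque<Future<?>>> PENDING =
         ThreadLocal.withInitial(ArrayDeque::new);

   private final ExecutorService executor = Executors.newFixedThreadPool(20);

   // Stand-in for S3: a concurrent map keyed by object name.
   final Map<String, byte[]> remote = new ConcurrentHashMap<>();

   // Fire the write and remember its Future for the current thread.
   void asyncStore(final String key, final byte[] value) {
      PENDING.get().add(executor.submit(() -> remote.put(key, value)));
   }

   // Called at the end of the transaction: wait for every pending write
   // so an exception can still trigger a rollback.
   void drainPending() throws Exception {
      Deque<Future<?>> queue = PENDING.get();
      for (Future<?> f; (f = queue.poll()) != null; ) {
         f.get();   // rethrows any write failure
      }
   }

   void shutdown() { executor.shutdown(); }

   public static void main(String[] args) throws Exception {
      AsyncWriteQueueSketch store = new AsyncWriteQueueSketch();
      for (int i = 0; i < 100; i++) store.asyncStore("key-" + i, new byte[]{(byte) i});
      store.drainPending();   // block until all 100 writes land (or fail)
      System.out.println("stored " + store.remote.size() + " entries");
      store.shutdown();
   }
}
```

As Philippe notes, the trade-off is that without a transaction manager (and hence without a drain point), a failed write surfaces nowhere.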
>> 
>> On 2 Dec 2009, at 18:19, Adrian Cole wrote:
>> 
>> Hi, Philippe.
>> 
>> Apologies that the differences in trunk are not in Infinispan yet.
>> Obviously that version would help narrow down what's going on.
>> 
>> When you mentioned this: "it is abysmal (when using httpclient directly
>> I had a min of 100/sec using 20 connections)" - are you saying you are
>> comparing to code that writes to S3 via httpclient APIs?
>> 
>> S3 tends to have an 80 ms minimum overhead on PUT commands, as measured
>> by hostedftp.com [1]. Our normal perf tests from trunk do about 100
>> concurrent puts in 2-3 seconds, untuned, to S3, without the use of nio.
>> 
>> Cheers,
>> -Adrian
>> 
>> [1]
>> http://hostedftp.wordpress.com/2009/06/30/hostedftp-amazon-aws-s3-performance-report-how-fast-is-the-cloud/
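Adrian's numbers are consistent with simple queueing arithmetic: 100 sequential PUTs at ~80 ms each need at least 8 s, while 20 concurrent connections bring the floor down to roughly (100 / 20) * 80 ms = 400 ms, so 2-3 s untuned is plausible. A minimal simulation of that effect (the sleep is only a stand-in for one S3 round trip; no real network calls are made):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// With a fixed per-PUT latency, total time is roughly
// (requests / concurrency) * latency; this models why concurrent puts
// beat sequential ones by an order of magnitude.
public class ConcurrentPutSketch {
   public static void main(String[] args) throws Exception {
      final int requests = 100, concurrency = 20, latencyMs = 80;
      ExecutorService pool = Executors.newFixedThreadPool(concurrency);
      List<Future<?>> futures = new ArrayList<>();
      long start = System.nanoTime();
      for (int i = 0; i < requests; i++) {
         futures.add(pool.submit(() -> {
            Thread.sleep(latencyMs);   // stand-in for one S3 PUT round trip
            return null;
         }));
      }
      for (Future<?> f : futures) f.get();   // wait for all puts
      long elapsedMs = (System.nanoTime() - start) / 1_000_000;
      // expect roughly (100 / 20) * 80 ms, versus ~8000 ms sequentially
      System.out.println("elapsed: " + elapsedMs + " ms");
      pool.shutdown();
   }
}
```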
>> 
>> On Wed, Dec 2, 2009 at 6:06 AM, Bela Ban <bban at redhat.com> wrote:
>> 
>> OK, then someone from the Infinispan team needs to help you.
>> 
>> If the option to use write-behind (instead of write-through) for a
>> cache loader still exists, that might be a perf boost.
>> 
>> The basic issue with the S3 cache loader is that it needs to send slow
>> and bulky SOAP messages to S3, and that's always slow. I don't know the
>> current S3 cache loader impl, but I suggest taking a look to see what
>> properties they support. E.g. they might have an option to batch
>> updates and write them to S3 in collected form.
>> 
>> 
>> philippe van dyck wrote:
>> 
>> Thanks for your help, Bela.
>> 
>> Indeed, when I replace the S3 cache store with the disk one, the
>> performance problem disappears (it takes less than a second to store my
>> 100 'put' updates when I commit the transaction).
>> 
>> Here is my config:
>> 
>> <?xml version="1.0" encoding="UTF-8"?>
>> <infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>       xmlns="urn:infinispan:config:4.0">
>>    <global>
>>       <transport
>>          transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport">
>>          <properties>
>>             <property name="configurationFile" value="jgroups.xml" />
>>          </properties>
>>       </transport>
>>    </global>
>>    <default>
>>       <transaction
>>          transactionManagerLookupClass="org.infinispan.transaction.lookup.DummyTransactionManagerLookup" />
>>       <clustering mode="distribution">
>>          <l1 enabled="true" lifespan="100000" />
>>          <hash numOwners="2" rehashRpcTimeout="120000" />
>>       </clustering>
>>       <loaders passivation="false" shared="true" preload="false">
>>          <loader class="org.infinispan.loaders.s3.S3CacheStore"
>>                fetchPersistentState="false" ignoreModifications="false"
>>                purgeOnStartup="false">
>>             <properties>
>>                <property name="awsAccessKey" value="xxx" />
>>                <property name="awsSecretKey" value="xxx" />
>>                <property name="bucketPrefix" value="store" />
>>             </properties>
>>             <async enabled="true"/>
>>          </loader>
>>       </loaders>
>>       <unsafe unreliableReturnValues="true" />
>>    </default>
>> </infinispan>
>> 
>> And the log:
>> 
>> INFO  (14:54:37): JGroupsTransport           - Starting JGroups Channel
>> INFO  (14:54:38): JChannel                   - JGroups version: 2.8.0.CR5
>> -------------------------------------------------------------------
>> GMS: address=sakapuss.local-16157, cluster=Infinispan-Cluster, physical
>> address=192.168.1.136:7800
>> -------------------------------------------------------------------
>> INFO  (14:54:49): JGroupsTransport           - Received new cluster view:
>> [sakapuss.local-16157|0] [sakapuss.local-16157]
>> INFO  (14:54:49): JGroupsTransport           - Cache local address is
>> sakapuss.local-16157, physical address is 192.168.1.136:7800
>> INFO  (14:54:49): GlobalComponentRegistry    - Infinispan version:
>> Infinispan 'Starobrno' 4.0.0.CR2
>> INFO  (14:54:49): AsyncStore                 - Async cache loader starting
>> org.infinispan.loaders.decorators.AsyncStore at 7254d7ac
>> WARN  (14:54:51): utureCommandConnectionPool -
>> org.jclouds.http.httpnio.pool.HttpNioFutureCommandConnectionPool at fcd4eca1 -
>> saturated connection pool
>> INFO  (14:54:52): ComponentRegistry          - Infinispan version:
>> Infinispan 'Starobrno' 4.0.0.CR2
>> 
>> Please note the "HttpNioFutureCommandConnectionPool at fcd4eca1 -
>> saturated connection pool" (??)
>> 
>> Philippe
>> 
>> 
>> On 2 Dec 2009, at 14:28, Bela Ban wrote:
>> 
>> 
>> Just to narrow down the issue: when you disable the S3 cache store, I
>> assume the performance problem goes away, correct?
>> 
>> Just trying to pin the blame on the S3 cache loader, then I don't even
>> need to see whether it is a JGroups problem... :-)
>> 
>> 
>> 
>> philippe van dyck wrote:
>> 
>> Hi Infinispan mailing list,
>> 
>> A couple of days ago I succeeded in writing an entity store for qi4j
>> (http://www.qi4j.org/) using Infinispan, the S3 store and the S3_PING
>> JGroups clustering configuration. It works like a charm: it discovers
>> new EC2 instances, synchronizes and processes transactions perfectly...
>> you did an amazing job.
>> 
>> But I have a serious performance problem. When I write an update (<1k)
>> to the cache, it takes around 500 ms to be stored on S3. The best
>> result I achieved was around 10 cache writes per second... it is
>> abysmal (when using httpclient directly I had a min of 100/sec using 20
>> connections). When I commit a JTA transaction made of 100 cache writes,
>> it takes around 30 seconds (cpu < 5%), and the first write lands on S3
>> after at least 5 seconds of 'idle' time (SSL negotiation??).
>> 
>> I disabled the store asynchronism and worked without JTA transactions:
>> no effect on performance. I also modified the jclouds configuration,
>> multiplying all worker threads, connections and the rest by 10... no
>> improvement! When I (load) test my web app (wicket based + qi4j + ...
>> Infinispan), the cpu stays idle (<5%), JTA transactions fail
>> (timeouts), and I cannot acquire locks before the 10-second timeout.
>> 
>> Is there something fishy in the jclouds configuration? In the httpnio
>> use of jclouds? In the version of jclouds (the trunk one with the blob
>> store seems to be so different)? Am I missing something?
>> 
>> Any pointer to any doc/help/experience is welcome ;-)
>> 
>> Philippe
>> 
>> 
>> _______________________________________________
>> infinispan-dev mailing list
>> infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> 
>> 
>> --
>> Bela Ban
>> Lead JGroups / Clustering Team
>> JBoss
>> 
> 

--
Manik Surtani
manik at jboss.org
Lead, Infinispan
Lead, JBoss Cache
http://www.infinispan.org
http://www.jbosscache.org