Hi Adrian and thanks for your answer.
I dug into the source of the Infinispan S3 cache store and, no offense, it looks more like
a proof of concept than something I could use in production.
First of all, to reach even a minimum of efficiency, this cache store needs concurrency.
It currently uses the jclouds Map interface rather than the asynchronous Future<>
one... well, you know that, you wrote a big part of jclouds ;-)
The CacheStore interface does not offer an asynchronous solution, but a workaround is
available.
I just modified the S3 cache store: every write operation is now asynchronous, and the
resulting Future is stored in a ThreadLocal queue.
After each transaction (S3CacheStore.applyModifications) I empty the queue and wait for
each Future to finish, in order to catch errors (and allow rollbacks; otherwise the whole
transaction mechanism is useless).
The drawback is obvious: if you don't use a transaction manager to update the cache,
exceptions will die silently (but come on, nobody does that ;-).
The solution works: I updated 1000 entries in ... 20 seconds (for me that means
'mission accomplished').
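To make the idea concrete, here is a minimal sketch of the "queue the Futures per thread, then drain them at commit time" pattern. It is not the actual patch: the class name PendingWrites, the pool size of 20 and the use of a plain ExecutorService are assumptions for illustration, and the real store would submit jclouds calls and wait for them from S3CacheStore.applyModifications.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Infinispan or jclouds.
class PendingWrites {
    // Assumed pool size; tune to the number of connections you allow.
    private final ExecutorService pool = Executors.newFixedThreadPool(20);

    // One queue per calling thread, so a commit only waits for its own writes.
    private final ThreadLocal<Deque<Future<?>>> pending =
            ThreadLocal.withInitial(ArrayDeque::new);

    /** Starts the write in the background and remembers its Future. */
    public void submitWrite(Callable<?> write) {
        pending.get().add(pool.submit(write));
    }

    /**
     * Drains the current thread's queue, waiting for every write.
     * Returns the number of successful writes, or throws if any write failed,
     * so the caller can roll the transaction back.
     */
    public int awaitAll() {
        Deque<Future<?>> queue = pending.get();
        int completed = 0;
        Throwable firstFailure = null;
        Future<?> f;
        while ((f = queue.poll()) != null) {
            try {
                f.get();
                completed++;
            } catch (ExecutionException e) {
                if (firstFailure == null) firstFailure = e.getCause();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt flag set
                if (firstFailure == null) firstFailure = e;
            }
        }
        if (firstFailure != null) {
            throw new IllegalStateException("at least one write failed", firstFailure);
        }
        return completed;
    }

    public void shutdown() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        PendingWrites writes = new PendingWrites();
        for (int i = 0; i < 3; i++) {
            final int id = i;
            writes.submitWrite(() -> "stored entry " + id); // stand-in for an S3 PUT
        }
        System.out.println("completed writes: " + writes.awaitAll());
        writes.shutdown();
    }
}
```

Because the queue is thread-local, this only works if the commit runs on the same thread that issued the writes, which is the same limitation as the drawback mentioned above.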
Secondly, there are still a couple of very strange things happening in the S3 cache store,
but the most intriguing one is definitely JCloudsBucket's public Set<Bucket> values().
Is this serious? Must we really be able to load *ALL* of our data in order to rehash on
a cluster join/leave operation?
I plan to store a few tens of GB on S3, so... well, you see the problem.
It seems especially problematic since I was planning to use Amazon's EC2 autoscale
feature to add Infinispan instances to my 'previously' working cluster.
I am quite sure I misunderstood something, or maybe everything.
Any help is most welcome.
Philippe
On 2 Dec 2009, at 18:19, Adrian Cole wrote:
Hi, Philippe.
Apologies that the differences in trunk are not in Infinispan yet.
Obviously that version would help narrow down what's going on.
When you mentioned this: "it is abysmal (when using httpclient
directly I had a min of 100/sec using 20 connections)."
Are you saying you are comparing against code that writes to S3 via
the httpclient APIs?
S3 tends to have an 80 ms minimum overhead on PUT commands, as measured
by hostedftp.com [1].
Our normal perf tests from trunk do about 100 concurrent puts in 2-3
seconds, untuned, to S3 without the use of NIO.
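As a rough back-of-the-envelope check, the sketch below turns the 80 ms per-PUT overhead into a lower bound on total time. It assumes each connection handles one PUT at a time and ignores everything except that fixed overhead; the class and method names are made up for illustration.

```java
// Hypothetical helper for the arithmetic only; no S3 calls are made.
class PutLatencyEstimate {
    /** Lower bound in milliseconds: PUTs are spread over the connections in rounds. */
    public static long minMillis(int puts, int connections, long overheadMsPerPut) {
        long rounds = (puts + connections - 1) / connections; // ceiling division
        return rounds * overheadMsPerPut;
    }

    public static void main(String[] args) {
        System.out.println(minMillis(100, 1, 80));  // 8000: one PUT at a time
        System.out.println(minMillis(100, 20, 80)); // 400: spread over 20 connections
    }
}
```

Under these assumptions, 100 PUTs need at least 8 seconds when issued one after another and at least 0.4 seconds over 20 connections.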
Cheers,
-Adrian
[1] http://hostedftp.wordpress.com/2009/06/30/hostedftp-amazon-aws-s3-perform...
On Wed, Dec 2, 2009 at 6:06 AM, Bela Ban <bban(a)redhat.com> wrote:
> OK, then someone from the Infinispan team needs to help you.
>
> If the option to use write-behind (instead of write-through) for a cache
> loader still exists, that might be a perf boost.
>
> The basic issue with the S3 cache loader is that it needs to send
> bulky SOAP messages to S3, and that's always slow. I don't know the
> current S3 cache loader impl, but I suggest taking a look to see which
> properties it supports. E.g. it might have an option to batch updates
> and write them to S3 in collected form.
>
>
> philippe van dyck wrote:
>> Thanks for your help Bela.
>>
>> Indeed, when I replace the S3 cache store with the disk one, the performance
>> problem disappears (it takes less than a second to store my 100 'put' updates when I
>> commit the transaction).
>>
>> Here is my config :
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>>
>> <infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>     xmlns="urn:infinispan:config:4.0">
>>   <global>
>>     <transport
>>         transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport">
>>       <properties>
>>         <property name="configurationFile" value="jgroups.xml" />
>>       </properties>
>>     </transport>
>>   </global>
>>
>>   <default>
>>     <transaction
>>         transactionManagerLookupClass="org.infinispan.transaction.lookup.DummyTransactionManagerLookup" />
>>     <clustering mode="distribution">
>>       <l1 enabled="true" lifespan="100000" />
>>       <hash numOwners="2" rehashRpcTimeout="120000" />
>>     </clustering>
>>
>>     <loaders passivation="false" shared="true" preload="false">
>>       <loader class="org.infinispan.loaders.s3.S3CacheStore"
>>           fetchPersistentState="false" ignoreModifications="false"
>>           purgeOnStartup="false">
>>         <properties>
>>           <property name="awsAccessKey" value="xxx" />
>>           <property name="awsSecretKey" value="xxx" />
>>           <property name="bucketPrefix" value="store" />
>>         </properties>
>>         <async enabled="true"/>
>>       </loader>
>>     </loaders>
>>
>>     <unsafe unreliableReturnValues="true" />
>>   </default>
>> </infinispan>
>>
>> And the log :
>>
>> INFO (14:54:37): JGroupsTransport - Starting JGroups Channel
>> INFO (14:54:38): JChannel - JGroups version: 2.8.0.CR5
>>
>> -------------------------------------------------------------------
>> GMS: address=sakapuss.local-16157, cluster=Infinispan-Cluster, physical address=192.168.1.136:7800
>> -------------------------------------------------------------------
>> INFO (14:54:49): JGroupsTransport - Received new cluster view: [sakapuss.local-16157|0] [sakapuss.local-16157]
>> INFO (14:54:49): JGroupsTransport - Cache local address is sakapuss.local-16157, physical address is 192.168.1.136:7800
>> INFO (14:54:49): GlobalComponentRegistry - Infinispan version: Infinispan 'Starobrno' 4.0.0.CR2
>> INFO (14:54:49): AsyncStore - Async cache loader starting org.infinispan.loaders.decorators.AsyncStore@7254d7ac
>> WARN (14:54:51): utureCommandConnectionPool - org.jclouds.http.httpnio.pool.HttpNioFutureCommandConnectionPool@fcd4eca1 - saturated connection pool
>> INFO (14:54:52): ComponentRegistry - Infinispan version: Infinispan 'Starobrno' 4.0.0.CR2
>>
>> Please note the "HttpNioFutureCommandConnectionPool@fcd4eca1 - saturated
>> connection pool" warning (??)
>>
>>
>> Philippe
>>
>>
>> On 2 Dec 2009, at 14:28, Bela Ban wrote:
>>
>>
>>> Just to narrow down the issue: when you disable the S3 cache store, I
>>> assume the performance problem goes away, correct ?
>>>
>>> Just trying to pin the blame on the S3 cache loader, then I don't even
>>> need to see whether it is a JGroups problem... :-)
>>>
>>>
>>>
>>> philippe van dyck wrote:
>>>
>>>> Hi Infinispan mailing list,
>>>>
>>>> a couple of days ago, I succeeded in writing an entity store for qi4j
>>>> (http://www.qi4j.org/) using Infinispan, the S3 store and the S3_PING JGroups
>>>> clustering configuration.
>>>>
>>>> It works like a charm, discovers new EC2 instances, synchronizes and
>>>> processes transactions perfectly... you did an amazing job.
>>>>
>>>> But I have a serious performance problem.
>>>>
>>>> When I write an update (<1k) to the cache, it takes around 500 ms to
>>>> be stored on S3.
>>>>
>>>> The best result I achieved was around 10 cache writes per second... it is
>>>> abysmal (when using httpclient directly I had a min of 100/sec using 20 connections).
>>>>
>>>> When I commit a JTA transaction made of 100 cache writes, it takes around
>>>> 30 seconds (cpu<5%) and the first write reaches S3 after at least 5 seconds of
>>>> 'idle' time (SSL negotiation??).
>>>>
>>>> I disabled the store's asynchronous mode and worked without JTA transactions: no
>>>> effect on performance.
>>>>
>>>> I also modified the jclouds configuration, multiplying all worker
>>>> threads, connections and the rest by 10... no improvement!
>>>>
>>>> When I (load) test my web app (wicket based+qi4j+...infinispan) the cpu
>>>> stays idle (<5%) and ... JTA transactions fail (timeouts) and I cannot acquire locks
>>>> before the 10-second timeout.
>>>>
>>>> Is there something fishy in the jclouds configuration? In the httpnio
>>>> use of jclouds? In the version of jclouds (the trunk one with the blob store seems to be
>>>> so different)?
>>>>
>>>> Am I missing something?
>>>>
>>>> Any pointer to any doc/help/experience is welcome ;-)
>>>>
>>>> Philippe
>>>>
>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev(a)lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>>>
>>> --
>>> Bela Ban
>>> Lead JGroups / Clustering Team
>>> JBoss
>>
>>
>
> --
> Bela Ban
> Lead JGroups / Clustering Team
> JBoss