On 3 Dec 2009, at 19:39, Manik Surtani wrote:
On 3 Dec 2009, at 18:28, philippe van dyck wrote:
> Thanks for your answer!
>
> My comments are in the txt ;-)
>
> phil
>
> On 3 Dec 2009, at 19:07, Manik Surtani wrote:
>
>>
>> On 3 Dec 2009, at 09:21, philippe van dyck wrote:
>>
>>>
>>> On 3 Dec 2009, at 09:49, Sanne Grinovero wrote:
>>>
>>>> Hello,
>>>> From this blog post:
>>>> http://infinispan.blogspot.com/2009/08/coalesced-asynchronous-cache-store...
>>>> I had understood that any store can be configured async? (I never
>>>> tested it, I might be wrong.)
>>>
>>> This asynchronism is happening on another layer, and it is not helping.
>>
>> How does this not help? Your client threads should not block on writing to S3.
>
> They are not blocking, they are executed synchronously... so it takes years ;-)
Not blocking == executed synchronously?? Are we losing something in translation/terminology here? :)
Oops, excuse my French.
Not blocking -> does not wait (I meant not 'blocked').
Asynchronous implies non-blocking, but it does not mean that the payloads will not be executed one after the other (i.e. in a queue).
So what I meant is that, when I was searching for the source of the abysmal performance, I discovered that the S3 CacheStore was processing each 'write' one after the other.
It is not the fault of the CacheStore; its interface does not offer any asynchronism (but we know that now).
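To make the distinction concrete, here is a minimal sketch (plain java.util.concurrent, not Infinispan or JClouds code; the class and method names are made up for illustration). Submitting to an executor never blocks the caller, yet with a single worker the writes still run one after the other, so the total time is the sum of the latencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; nothing here is Infinispan or JClouds API.
class QueuedWrites {

    // Stand-in for one remote write with a fixed latency.
    static void slowWrite(long latencyMillis) {
        try {
            Thread.sleep(latencyMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Submits `writes` tasks to a pool of `workers` threads and returns the
    // elapsed milliseconds until every task has completed.
    static long timeWrites(int workers, int writes, long latencyMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            long start = System.nanoTime();
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < writes; i++) {
                // submit() returns at once: the caller is never blocked here.
                futures.add(pool.submit(() -> slowWrite(latencyMillis)));
            }
            for (Future<?> future : futures) {
                future.get(); // wait only to measure the total duration
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With one worker, `QueuedWrites.timeWrites(1, 10, 50)` needs at least ten times the single-write latency; with ten workers the same ten writes overlap and finish in roughly the time of one.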
> JClouds provides an httpnio module with very efficient multithreading (and non-blocking sockets).
> But the current implementation of the S3 CacheStore writes synchronously (ask Adrian)!
Yes, I agree this needs to be re-done - not just to use httpnio, but also to support other cloud storage providers beyond just S3.
I provided a workaround patch to Adrian; it will allow much better throughput for 4.0, if you apply it... (Adrian was thinking about an option to specify it, maybe in the unsafe settings?)
>>> Right now, the cache access is synchronous (org.infinispan.Cache extends ConcurrentMap)
>>>
>>> If you enable it, the cache updates are delegated to an async store decorator, queuing updates for periodic flushing.
>>> When this decorator flushes updates to the actual cache store, they are currently flushed in a batch (transaction like).
>>> This batch is then sent to the cache store... and emptied synchronously.
>>
>> Yes, and this does not affect your client threads since a separate executor is used to process the flushing.
>
> IMHO, it is up to JClouds to manage multithreading, not to a separate executor.
> The actual multithreading efficiency is managed by the last layer, the one in contact with the source of the latency.
> If you directly use the JClouds Future<> interface, your executor is useless because the work will be rescheduled by jclouds (re-queued and re-dequeued by another worker thread, probably in httpnio).
Yes, in which case you won't use the cache store async flag. But this should be something that only makes sense once the "CloudCacheStore" is written to use JClouds' BlobStore.
The current JClouds interface (not the next one, the blobstore) already offers a Future<> interface.
IMHO, "async" may be a bad name for the cache store flag when you also provide an async interface.
What you will actually be doing is pausing the stream of updates until the async worker thread decides to open the door to the real CacheStore.
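As a rough illustration of that "pause, then open the door" behaviour, here is a minimal sketch of a batching buffer. It is not Infinispan's AsyncStore; the class name, the String key/value types and the Consumer standing in for the real cache store are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch, not Infinispan's AsyncStore.
class BatchingBuffer {
    // Updates held back until the next flush; a later put on the same key
    // overwrites the earlier one, so only the last value reaches the store.
    private Map<String, String> pending = new LinkedHashMap<>();
    // Stand-in for the real cache store: receives one whole batch per flush.
    private final Consumer<Map<String, String>> store;

    BatchingBuffer(Consumer<Map<String, String>> store) {
        this.store = store;
    }

    // Called by client threads: only touches the in-memory buffer.
    synchronized void put(String key, String value) {
        pending.put(key, value);
    }

    // Called by the flushing thread: swaps the buffer out under the lock,
    // then hands the batch to the store outside the lock.
    void flush() {
        Map<String, String> batch;
        synchronized (this) {
            if (pending.isEmpty()) {
                return;
            }
            batch = pending;
            pending = new LinkedHashMap<>();
        }
        store.accept(batch);
    }
}
```

Nothing reaches the store between two flushes, which is why "batch" describes the behaviour better than "async": how the batch is then written (serially or concurrently) is entirely up to the store behind it.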
So, what about calling it the "batch" (or latch?) option instead of the "async" one?
>>> If you want to implement an asynchronous interface, you have to implement it through all the layers or it will be "resynchronized".
>>>
>>> The cache store interface does not offer an asynchronous (Future<>) interface...
>>>
>>> IMHO, if you want to allow asynchronous access, you have to begin down there, and push it up to the AdvancedCache interface.
>>
>> We have this already - see Cache.putAsync() for example - but this only offloads network calls to an executor queue when you have already configured the cache to use synchronous clustering. As for cache store events, this still happens in the manner configured (sync or async/batched), although there could be scope for using this API to offload cache store operations as well.
>
> ... What is the need for an async 'batching' executor delegate store, when you already have an async interface?
>
> Do you think it will be possible to connect the Futures? The ones produced at the S3 cache store level (JClouds), to the ones received from Cache.putAsync?
Yes.
http://fisheye.jboss.org/browse/Infinispan/trunk/core/src/main/java/org/i...
How can I access that in the CacheStore interface? Could you please take a look, with Adrian, at the patch I provided? It could really use this!
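For what "connecting the Futures" could look like, here is a minimal sketch using java.util.concurrent.CompletableFuture (a later JDK class, so not something available to the 4.0 code base). The class, its two maps and the storeAsync method are hypothetical stand-ins; the only point is that the future handed to the caller completes when the store-level write completes, and fails when it fails.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not the Infinispan Cache or CacheStore API.
class ChainedPut {
    private final Map<String, String> memory = new ConcurrentHashMap<>();
    private final Map<String, String> remote = new ConcurrentHashMap<>();

    // Stand-in for an asynchronous store write (e.g. a future-returning
    // remote client); here it simply runs on the common pool.
    CompletableFuture<Void> storeAsync(String key, String value) {
        return CompletableFuture.runAsync(() -> remote.put(key, value));
    }

    // The returned future is derived from the store's future, so it only
    // completes once the store write has completed, and carries its failure.
    CompletableFuture<String> putAsync(String key, String value) {
        String previous = memory.put(key, value);
        return storeAsync(key, value).thenApply(ignored -> previous);
    }

    // Test helper: what the stand-in store currently holds for a key.
    String remoteValue(String key) {
        return remote.get(key);
    }
}
```

No extra executor sits between the two layers: the caller's future is simply a view over the one produced by the lowest layer.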
> If I understand the current situation, asynchronicity is everywhere ... but unconnected? Heeeeelp, I need help ;-)
Hehh! :) Yeah, I suppose I now see the need for an AsyncCacheStore interface - but let's put this on the shelf for a moment, as we need to think about how this affects other cache store impls that are not implicitly asynchronous.
No problem here, let's postpone to 4.0+! Amazing job done already...
Let's start first with the CloudCacheStore [1]. If you know JClouds well, do you feel like implementing this with the latest blob store APIs?
I would like to, but I really have no resources for that right now, sorry. And I have a qi4j EntityStore to finish first ;-)
Thanks again for the help. I will do some EC2 load tests this week and if I can incriminate Infinispan, you (all) will be the first to know ;-)
cheers,
phil
Cheers
Manik
[1] https://jira.jboss.org/jira/browse/ISPN-298
> Cheers,
>
> phil
>
>>
>>> The real added value of a completely asynchronous system is to apply it on the system level, like the HTTP NIO libraries or a multithreaded layer (OS level like Grand Central / OpenCL).
>>>
>>> Cheers,
>>>
>>> Philippe
>>>
>>> P.S.: We are only talking about updates to the cache; a completely asynchronous system will answer a read, or get(), request on the cache with a Future<>.
>>>
>>>
>>>>
>>>> If you have to develop this, wouldn't it make sense to decouple the
>>>> async properties from the low level implementation? This should be
>>>> part of core, not the specific impl, so that it can be reused.
>>>>
>>>> comments from core developers?
>>>>
>>>> Regards,
>>>> Sanne
>>>>
>>>> 2009/12/3 philippe van dyck <pvdyck(a)gmail.com>:
>>>>> Adrian, Elias,
>>>>> I think I read somewhere on the list that the upgrade to the blobstore version of jclouds has been postponed to the next version of Infinispan (4.1?).
>>>>> Do you see any added value, to the S3 cache store, in upgrading ASAP?
>>>>> I'll catch you later on #jclouds to send you the 'asynchronous' modifications (but we are in different time zones).
>>>>> Elias knows about a previous (jboss cache?) async version of the cache store interface but I don't see any trace of it in the Infinispan release (was the jboss cache already using Futures?).
>>>>> And you are right Elias, HTTP performance problems are, in part, solved with concurrency, connection pools and the like, usually managed by the http client itself (like httpnio).
>>>>> Regarding the S3 'bucket' full of cache store 'buckets', there is indeed a naming problem.
>>>>> But I am quite sure this one returns all the content of the cache:
>>>>> public Set<Bucket> values() throws S3ConnectionException {
>>>>>    Set<Bucket> buckets = new HashSet<Bucket>();
>>>>>    for (Map.Entry<String, InputStream> entry : map.entrySet()) {
>>>>>       buckets.add(bucketFromStream(entry.getKey(), entry.getValue()));
>>>>>    }
>>>>>    return buckets;
>>>>> }
>>>>> And values() is called by loadAllLockSafe(), itself called by performRehash(), itself called when a cache leaves a distributed cluster.
>>>>> But when you take a closer look at the way the results are used... there are a lot of optimizations to be done!
>>>>> One solution, in performRehash, is to iterate only on the keys and fetch the values later...
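A minimal sketch of the "keys first, values later" idea, under stated assumptions: this is not Infinispan's performRehash, and the `mustMove` predicate (deciding whether a key changes owner) and the `fetch` function (one remote read per key) are hypothetical parameters introduced for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical helper; not part of Infinispan.
class LazyRehash {

    // Walks the keys only and fetches a value solely for the keys that the
    // `mustMove` predicate selects, instead of loading every value up front.
    static <K, V> Map<K, V> loadOnlyMoved(Iterable<K> keys,
                                          Predicate<K> mustMove,
                                          Function<K, V> fetch) {
        Map<K, V> moved = new LinkedHashMap<>();
        for (K key : keys) {
            if (mustMove.test(key)) {
                moved.put(key, fetch.apply(key)); // one remote read per moved key
            }
        }
        return moved;
    }
}
```

With tens of GB in the store, the number of remote reads then depends on how many keys actually move rather than on the size of the whole data set, provided the store can list its keys cheaply.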
>>>>> Anyway, I hope this kind of fine tuning will appear in the 4.1 version ;-)
>>>>> Cheers,
>>>>> philippe
>>>>>
>>>>> On 2 Dec 2009, at 19:35, Adrian Cole wrote:
>>>>>
>>>>> Hi, Philippe.
>>>>>
>>>>> Firstly, thanks for taking effort on this. If you have some time, I would love for you to integrate the trunk version of jclouds as we have significantly better logging. I'd also love to see your patch on current effort. Seems you know what you are doing :)
>>>>>
>>>>> Second, there is an unfortunate overloading of the term bucket. Bucket exists as an infinispan concept, which means all objects that share the same hashcode. That said, I believe it is the loadOnStartup option that uses the values() thing, and not general purpose join/leave.
>>>>>
>>>>> Third, this was indeed done with the sync cache store api. I'm not sure if an async one is present, but you can see we can easily integrate with such.
>>>>>
>>>>> I'd be glad to chat more on #jclouds on freenode, if you have some time.
>>>>>
>>>>> Cheers and excellent work.
>>>>> -Adrian
>>>>>
>>>>> On Wed, Dec 2, 2009 at 10:00 AM, philippe van dyck <pvdyck(a)gmail.com> wrote:
>>>>>
>>>>> Hi Adrian and thanks for your answer.
>>>>>
>>>>> I dug into the source of the Infinispan S3 cache store and, no offense, but it looks more like a proof of concept than something I could use in production.
>>>>>
>>>>> First of all, in order to achieve a minimum of efficiency, we need to use concurrency in this specific cache store.
>>>>> Since it is using the JClouds Map interface and not the Future<> asynchronous one... well, you know that, you wrote a big part of JClouds ;-)
>>>>>
>>>>> The CacheStore interface does not offer an asynchronous solution, but a workaround is available.
>>>>> I just modified the S3 cache store, and every write operation is now asynchronous and the resulting future is stored in a ThreadLocal queue.
>>>>> After each transaction (S3CacheStore.applyModifications) I empty the queue and wait for each Future to finish, in order to catch errors (and allow rollbacks... or else the whole transaction mechanism is useless).
>>>>> The drawback is obvious: if you don't use a transaction manager to update the cache, exceptions will die silently (but come on, nobody does that ;-).
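The workaround described above can be sketched roughly as follows. This is not the actual patch: the class name, the in-memory map standing in for S3, the pool size and the method names are all assumptions made for the example. Only the shape is taken from the description: asynchronous writes, futures kept in a ThreadLocal queue, and a drain at commit time that surfaces failures.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; an in-memory map stands in for the S3 bucket.
class PendingWriteStore {
    private final ExecutorService writers = Executors.newFixedThreadPool(8);
    private final Map<String, String> remote = new ConcurrentHashMap<>();
    // Futures of the writes issued by the current thread and not yet confirmed.
    private final ThreadLocal<Queue<Future<?>>> pending =
            ThreadLocal.withInitial(ArrayDeque::new);

    // Starts the write and returns immediately; the Future is kept for commit().
    void store(String key, String value) {
        pending.get().add(writers.submit(() -> {
            if (value == null) {
                throw new IllegalArgumentException("null value for " + key);
            }
            remote.put(key, value);
        }));
    }

    // Waits for every write issued by this thread. All futures are drained;
    // the first failure is rethrown so that the caller can roll back.
    void commit() {
        Queue<Future<?>> queue = pending.get();
        RuntimeException failure = null;
        Future<?> future;
        while ((future = queue.poll()) != null) {
            try {
                future.get();
            } catch (ExecutionException e) {
                if (failure == null) {
                    failure = new IllegalStateException("write failed", e.getCause());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                if (failure == null) {
                    failure = new IllegalStateException("interrupted", e);
                }
            }
        }
        if (failure != null) {
            throw failure;
        }
    }

    int size() {
        return remote.size();
    }

    void shutdown() {
        writers.shutdown();
    }
}
```

The drawback mentioned above is visible here: if nobody ever calls commit(), a failed write is never observed by the caller.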
>>>>>
>>>>> The solution is working and I updated 1000 entries in ... 20 seconds (for me it means 'mission accomplished').
>>>>>
>>>>> Secondly, there are still a couple of very strange things happening in the S3 cache store, but the most intriguing one is definitely JCloudsBucket public Set<Bucket> values().
>>>>> Is it really serious? Must we be able to load *ALL* of our data in order to rehash on some cluster join/leave operation?
>>>>> I plan to store a couple of tens of GB on S3 so... well, you see the problem.
>>>>> It seems especially problematic since I was planning to use Amazon's EC2 autoscale feature to add Infinispan instances to my 'previously' working cluster.
>>>>>
>>>>> I am quite sure I misunderstood something, or maybe all the rest.
>>>>>
>>>>> Any help most welcome.
>>>>>
>>>>> Philippe
>>>>>
>>>>> On 2 Dec 2009, at 18:19, Adrian Cole wrote:
>>>>>
>>>>> Hi, Philippe.
>>>>>
>>>>> Apologies about the differences in trunk not being in infinispan, yet. Obviously that version would help narrow down what's going on.
>>>>>
>>>>> When you mentioned this: "it is abysmal (when using httpclient directly I had a min of 100/sec using 20 connections)."
>>>>> Are you saying you are comparing to code that is writing to S3 via httpclient apis?
>>>>>
>>>>> S3 tends to have an 80ms minimum overhead on PUT commands as measured by hostedftp.com [1]
>>>>>
>>>>> Our normal perf tests from trunk do about 100 concurrent puts in 2-3 seconds untuned to s3 without the use of nio.
>>>>>
>>>>> Cheers,
>>>>> -Adrian
>>>>>
>>>>> [1] http://hostedftp.wordpress.com/2009/06/30/hostedftp-amazon-aws-s3-perform...
>>>>>
>>>>> On Wed, Dec 2, 2009 at 6:06 AM, Bela Ban <bban(a)redhat.com> wrote:
>>>>>
>>>>> OK, then someone from the Infinispan team needs to help you.
>>>>>
>>>>> If the option to use write-behind (instead of write-through) for a cache loader still exists, that might be a perf boost.
>>>>>
>>>>> The basic issue with the S3 cache loader is that it needs to send slow and bulky SOAP messages to S3, and that's always slow. I don't know the current S3 cache loader impl, but I suggest you take a look and see what properties they support. E.g. they might have an option to batch updates and write them to S3 in collected form.
>>>>>
>>>>>
>>>>> philippe van dyck wrote:
>>>>>
>>>>> Thanks for your help Bela.
>>>>>
>>>>> Indeed, when I replace the S3 cache store with the disk one, the performance problem disappears (it takes less than a second to store my 100 'put' updates when I commit the transaction).
>>>>>
>>>>> Here is my config:
>>>>>
>>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>>> <infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>>>>>             xmlns="urn:infinispan:config:4.0">
>>>>>   <global>
>>>>>     <transport transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport">
>>>>>       <properties>
>>>>>         <property name="configurationFile" value="jgroups.xml" />
>>>>>       </properties>
>>>>>     </transport>
>>>>>   </global>
>>>>>   <default>
>>>>>     <transaction transactionManagerLookupClass="org.infinispan.transaction.lookup.DummyTransactionManagerLookup" />
>>>>>     <clustering mode="distribution">
>>>>>       <l1 enabled="true" lifespan="100000" />
>>>>>       <hash numOwners="2" rehashRpcTimeout="120000" />
>>>>>     </clustering>
>>>>>     <loaders passivation="false" shared="true" preload="false">
>>>>>       <loader class="org.infinispan.loaders.s3.S3CacheStore"
>>>>>               fetchPersistentState="false"
>>>>>               ignoreModifications="false"
>>>>>               purgeOnStartup="false">
>>>>>         <properties>
>>>>>           <property name="awsAccessKey" value="xxx" />
>>>>>           <property name="awsSecretKey" value="xxx" />
>>>>>           <property name="bucketPrefix" value="store" />
>>>>>         </properties>
>>>>>         <async enabled="true"/>
>>>>>       </loader>
>>>>>     </loaders>
>>>>>     <unsafe unreliableReturnValues="true" />
>>>>>   </default>
>>>>> </infinispan>
>>>>>
>>>>> And the log:
>>>>>
>>>>> INFO (14:54:37): JGroupsTransport - Starting JGroups Channel
>>>>> INFO (14:54:38): JChannel - JGroups version: 2.8.0.CR5
>>>>> -------------------------------------------------------------------
>>>>> GMS: address=sakapuss.local-16157, cluster=Infinispan-Cluster, physical address=192.168.1.136:7800
>>>>> -------------------------------------------------------------------
>>>>> INFO (14:54:49): JGroupsTransport - Received new cluster view: [sakapuss.local-16157|0] [sakapuss.local-16157]
>>>>> INFO (14:54:49): JGroupsTransport - Cache local address is sakapuss.local-16157, physical address is 192.168.1.136:7800
>>>>> INFO (14:54:49): GlobalComponentRegistry - Infinispan version: Infinispan 'Starobrno' 4.0.0.CR2
>>>>> INFO (14:54:49): AsyncStore - Async cache loader starting org.infinispan.loaders.decorators.AsyncStore@7254d7ac
>>>>> WARN (14:54:51): utureCommandConnectionPool - org.jclouds.http.httpnio.pool.HttpNioFutureCommandConnectionPool@fcd4eca1 - saturated connection pool
>>>>> INFO (14:54:52): ComponentRegistry - Infinispan version: Infinispan 'Starobrno' 4.0.0.CR2
>>>>>
>>>>> Please note the HttpNioFutureCommandConnectionPool@fcd4eca1 - saturated connection pool (??)
>>>>>
>>>>>
>>>>> Philippe
>>>>>
>>>>>
>>>>> On 2 Dec 2009, at 14:28, Bela Ban wrote:
>>>>>
>>>>> Just to narrow down the issue: when you disable the S3 cache store, I assume the performance problem goes away, correct?
>>>>>
>>>>> Just trying to pin the blame on the S3 cache loader, then I don't even need to see whether it is a JGroups problem... :-)
>>>>>
>>>>>
>>>>>
>>>>> philippe van dyck wrote:
>>>>>
>>>>> Hi Infinispan mailing list,
>>>>>
>>>>> A couple of days ago, I succeeded in writing an entity store for qi4j (http://www.qi4j.org/) using Infinispan, the S3 store and the S3_PING JGroups clustering configuration.
>>>>> It works like a charm, discovers new EC2 instances, synchronizes and processes transactions perfectly... you did an amazing job.
>>>>>
>>>>> But I have a serious performance problem.
>>>>> When I write an update (<1k) to the cache, it takes around 500 ms to be stored on S3.
>>>>> The best result I achieved was around 10 cache writes per second... it is abysmal (when using httpclient directly I had a min of 100/sec using 20 connections).
>>>>> When I commit a JTA transaction made of 100 cache writes, it takes around 30 seconds (cpu<5%) and the first write ends up on S3 after at least 5 seconds of 'idle' time (SSL negotiation??).
>>>>>
>>>>> I disabled the store asynchronism and worked without JTA transactions: no effect on performance.
>>>>> I also modified the jClouds configuration, multiplying all worker threads, connections and the rest by 10... no improvement!
>>>>>
>>>>> When I (load) test my web app (wicket based+qi4j+...infinispan) the cpu stays idle (<5%) and ... JTA transactions fail (timeouts) and I cannot acquire locks before the 10-second timeout.
>>>>>
>>>>> Is there something fishy in the jclouds configuration? In the httpnio use of jclouds? In the version of jclouds (the trunk one with the blob store seems to be so different)?
>>>>>
>>>>> Am I missing something?
>>>>>
>>>>> Any pointer to any doc/help/experience is welcome ;-)
>>>>>
>>>>> Philippe
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>>
>>>>> infinispan-dev mailing list
>>>>>
>>>>> infinispan-dev(a)lists.jboss.org
>>>>>
>>>>>
https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Bela Ban
>>>>>
>>>>> Lead JGroups / Clustering Team
>>>>>
>>>>> JBoss
>>>>>
>>
>> --
>> Manik Surtani
>> manik(a)jboss.org
>> Lead, Infinispan
>> Lead, JBoss Cache
>>
http://www.infinispan.org
>>
http://www.jbosscache.org