Event handling behaviour
by Pedro Ruivo
Hi guys.
Re: https://issues.jboss.org/browse/ISPN-2090
Summary: in tx caches, when @Listener(sync = true), the callback is invoked
by the same thread as the transaction. Although most of the events do
not cause any problem, some of them are triggered when the transaction is
in a state that cannot attach more commands.
So, I want to discuss here what would be the correct behaviour/logic for
the callback (note this only affects tx caches) assuming that the
callback invokes some operation (get/put/etc.) over the cache.
IMO, all the synchronous events should suspend the transaction. This
way, we ensure that the operation is not attached to the transaction,
the callback has the same behaviour for sync=true and sync=false, and also
for events triggered in remote nodes (you will never read
non-committed data from a remote transaction, nor attach the operation to it).
Mircea's suggestion is the following:
I think the correct logic should be:
* if isPre() then the value should not be seen when reading the cache
* if !isPre() then the value should be seen. Not totally sure, but the
transaction as returned by TransactionManager.getTransaction should not be
visible (i.e. suspend it before notifying the listener), as this would
result in different behavior on local/remote nodes that get notified.
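To make the "suspend it before notifying the listener" idea concrete, here is a minimal sketch. It is not Infinispan code: ToyTransactionManager and SyncNotifier are hypothetical stand-ins, and a transaction is modelled as a plain String bound to the current thread. A real implementation would presumably rely on the JTA TransactionManager.suspend()/resume() pair instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a transaction manager: it only tracks which
// transaction (modelled as a String id) is bound to the calling thread.
final class ToyTransactionManager {
    private final ThreadLocal<String> current = new ThreadLocal<>();

    void begin(String txId) { current.set(txId); }

    String getTransaction() { return current.get(); }

    // Detach the transaction from the calling thread and hand it back.
    String suspend() {
        String tx = current.get();
        current.remove();
        return tx;
    }

    // Re-attach a previously suspended transaction (no-op if there was none).
    void resume(String tx) {
        if (tx != null) current.set(tx);
    }
}

// Hypothetical notifier: runs synchronous listeners on the caller's thread,
// but with the caller's transaction suspended for the duration of the callback.
final class SyncNotifier {
    private final ToyTransactionManager tm;
    private final List<Consumer<String>> listeners = new ArrayList<>();

    SyncNotifier(ToyTransactionManager tm) { this.tm = tm; }

    void addListener(Consumer<String> listener) { listeners.add(listener); }

    void notifyListeners(String event) {
        String suspended = tm.suspend();
        try {
            for (Consumer<String> listener : listeners) {
                listener.accept(event); // sees no transaction on this thread
            }
        } finally {
            tm.resume(suspended); // always restore the caller's transaction
        }
    }
}
```

With this shape, anything the callback does against the cache cannot be attached to the notifying transaction, which is the same situation a listener on a remote node is in.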
Any other suggestion?
Cheers,
Pedro
11 years, 2 months
Re: [infinispan-dev] [infinispan-internal] Continuous Queries
by Mircea Markus
let's keep this on -dev.
On Oct 17, 2013, at 6:24 PM, Sanne Grinovero <sanne(a)redhat.com> wrote:
> ----- Original Message -----
>>
>> On Oct 17, 2013, at 2:28 PM, Sanne Grinovero <sanne(a)redhat.com> wrote:
>>
>>>
>>>
>>> ----- Original Message -----
>>>> On Oct 17, 2013, at 1:31 PM, Sanne Grinovero <sanne(a)redhat.com> wrote:
>>>>
>>>>> With some custom coding it's certainly possible to define an event
>>>>> listener
>>>>> which triggers when an entry is inserted/removed which matches a certain
>>>>> Query.
>>>>
>>>> where would you hold the query result? a cache perhaps?
>>>
>>> Why do you need to hold on to the query result?
>>> I was thinking to just send an event "newly stored X matches query Q1".
>>
>> You don't have a single process receive all the notifications then, but
>> multiple processes in the cluster. It's up to the user to aggregate these
>> results (that's why I mentioned a cache) but without aggregation this
>> feature is pretty limiting.
>
> I have no idea if it's limiting. For the use case I understood, that's pretty decent.
Here's my understanding of CQ[1]: a user queries a cache 10000000 (you add the rest of the 0s) times per second.
Instead of executing the query every time (very resource consuming), the system caches the query result, updates it when the underlying data gets modified, and returns it to the user on every invocation. Optionally you can register a listener on the query result, but that's just API sugar.
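A single-node sketch of that idea, to pin down what "cache the query result and update it on modification" means. Everything here is an assumption made for illustration: ContinuousQuerySketch is a hypothetical class, the "query" is a plain Predicate over values, and nothing is clustered.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical single-node model: the query result is built incrementally on
// every write instead of being recomputed on every read.
final class ContinuousQuerySketch<K, V> {
    private final Map<K, V> data = new ConcurrentHashMap<>();
    private final Map<K, V> result = new ConcurrentHashMap<>();
    private final Predicate<V> query;

    ContinuousQuerySketch(Predicate<V> query) { this.query = query; }

    // Insert or update: the entry joins or leaves the cached result depending
    // on whether the new value matches. The two maps are not updated
    // atomically, so this is only a sketch of the bookkeeping.
    void put(K key, V value) {
        data.put(key, value);
        if (query.test(value)) {
            result.put(key, value);
        } else {
            result.remove(key);
        }
    }

    // Delete: the entry leaves both the data and the cached result.
    void remove(K key) {
        data.remove(key);
        result.remove(key);
    }

    // Reading the result is a cheap snapshot copy; the query is not re-run.
    Map<K, V> currentResult() {
        return Collections.unmodifiableMap(new HashMap<>(result));
    }
}
```

In a cluster the writes arrive on different nodes, so keeping `result` in one place is exactly the aggregation problem discussed in this thread.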
>
>
>>> You could register multiple such listeners, getting the effect of "newly
>>> stored entry X matches Query set {Q1, Q3, Q7}"
>>
>> The listeners would not be collocated.
>
> I'm not going to implement distributed listeners, I indeed expect you to register such a listener on each node.
If I run a query, continuous or not, I'd expect to be able to get the whole result set of that query on the process on which I invoke it. Call me old-fashioned :-)
>
> I can show how to make Continuous Queries on the Query API to accomplish this.
I wouldn't call the problem your solution solves Continuous Query :-)
> Anything else is out of scope for me :-) Technically I think it's out of scope for Infinispan too, it should delegate to a message bus.
-1, for the reasons mentioned above.
[1] http://coherence.oracle.com/display/COH31UG/Continuous+Query
11 years, 2 months
Is anyone else experiencing JGRP-1675
by Radim Vansa
Hi,
since Infinispan moved to JGroups 3.4, we're experiencing occasional
deadlocks in some tests - most of the threads that send anything over
JGroups are waiting in JGroups' FlowControl.decrementCredits. The
problem sometimes goes away after several seconds, but it produces some
ugly spikes in our throughput/response time charts. Originally this
affected just some RadarGun tests, but it is now appearing in some
client-server tests as well (we've recently investigated an issue where
this appeared in a regular soak test).
I was looking into that [1] for some time but haven't really figured out
the cause. The workaround is to set up MFC and UFC credits high enough
(I use 10M) and stuff works then. I was trying to reproduce that on pure
JGroups, but unsuccessfully.
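For readers not familiar with credit-based flow control, the following toy model shows why senders can end up parked in a decrement-credits call and why a very large credit budget hides the symptom. It is a simplified sketch, not the JGroups implementation: CreditPool and its methods are hypothetical names.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical model of a sender-side credit pool: each send consumes credits
// and blocks once they run out, until the receiver replenishes them.
final class CreditPool {
    private final long maxCredits;
    private long credits;

    CreditPool(long maxCredits) {
        this.maxCredits = maxCredits;
        this.credits = maxCredits;
    }

    // Try to take 'bytes' credits, waiting up to timeoutMillis for a
    // replenishment. Returns false on timeout or interruption.
    synchronized boolean decrement(long bytes, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (credits < bytes) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false; // sender is stuck: no credits arrived in time
            }
            try {
                TimeUnit.NANOSECONDS.timedWait(this, remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        credits -= bytes;
        return true;
    }

    // Called when the receiver grants credits back; wakes up blocked senders.
    synchronized void replenish() {
        credits = maxCredits;
        notifyAll();
    }
}
```

If the replenishment is delayed or lost, every sender waits in decrement(); with a budget as large as 10M the pool is rarely exhausted, so the delay is rarely observed.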
I am not asking anyone to dig into that, but I wanted to know whether QA
is alone in experiencing that or if there are more of us.
Radim
[1] https://issues.jboss.org/browse/JGRP-1675
--
Radim Vansa <rvansa(a)redhat.com>
JBoss DataGrid QA
11 years, 2 months
The windup of 6.0.0
by Mircea Markus
Hi guys,
- 6.0.0.CR2 was added for 16 Oct (Adrian) and 6.0.0.Final was moved to 23 Oct (Dan)
- we have some 20% performance regressions we need to look at before going final
- I've updated JIRA:
- added tasks for creating documentation and quickstarts
- some JIRAs were moved here
- please follow the JIRA or let me know if there's anything missing: http://goo.gl/y4Ky7t
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
11 years, 2 months
Re: [infinispan-dev] [infinispan-internal] Continuous Queries
by Mircea Markus
This is a general question so moving it to infinispan-dev.
On Oct 17, 2013, at 1:13 PM, Divya Mehra <dmehra(a)redhat.com> wrote:
> Continuous Queries is a question we often get from JDG prospects.
>
> With Querying expected to be fully supported in Library mode and Events/Listeners already available in Library mode, is there a way an application developer would achieve a continuous query in Library mode by writing a snippet of code leveraging Querying and Listeners?
No. The mechanism behind continuous query is "clustered listeners": instead of running the same query 1000 times/sec, you build the query result once and update it during every cache insert/delete/update.
>
> That is, a feasible path to achieve this functionality in JDG 6.2 via some custom coding, even though it is not the most efficient path (because Continuous Queries are not available out of the box).
You can always repeatedly query the cache, but that's not exactly continuous query.
>
> Thanks,
> Divya
Cheers,
--
Mircea Markus
Infinispan lead (www.infinispan.org)
11 years, 2 months
ISPN-3557: interactions between a clear() operation and a Transaction
by Sanne Grinovero
I'd love to brainstorm about the clear() operation and what it means
on Infinispan.
I'm not sure to what extent, but it seems that clear() is designed to
work in a TX, or even create an implicit transaction if needed, but
I'm not understanding how that can work.
Obviously a clear() operation isn't listing all keys explicitly. Which
implies that it's undefined on which keys it's going to operate when
it's fired... that seems terribly wrong in a distributed key/value
store as we can't possibly freeze the global state and somehow define
a set of keys which are going to be affected, while an explicit
enumeration is needed to acquire the needed locks.
It might give a nice safe feeling that, when invoking a clear()
operation in a transaction, I can still abort the transaction to make
it cancel the operation; that's the only good part I can think of: we
can cancel it.
I don't think it has anything to do with consistency though? To make
sure you're effectively involving all replicas of all entries in a
consistent way, a lock would need to be acquired on each affected key,
which again implies a need to enumerate all keys, including the
unknown keys which might be hiding in a CacheStore: it's not enough to
broadcast the clear() operation to all nodes and have them simply wipe
their local state as that's never going to deal correctly
(consistently) with in-flight transactions working on different nodes
at different times (I guess enabling Total Order could help but you'd
need to make it mandatory).
So let's step back a second and consider what is the use case for
clear() ? I suspect it's primarily a method needed during testing, or
maybe cleanup before a backup is restored (operations), maybe a
manually activated JMX operation to clear the cache in exceptional
cases.
I don't think there would ever be a need for a clear() operation to
interact with other transactions, so I'd rather make it illegal to
invoke a clear() inside a transaction, or simply ignore the
transactional scope and have an immediate and distributed effect.
I'm likely missing something. What terrible consequences would this have?
Cheers,
Sanne
11 years, 2 months
Re: [infinispan-dev] [Cloudtm-discussion] Transactional Distributed B+Tree over ISPN
by Mark Little
FYI I presented on the current state of cloud-TM at HPTS a week or so ago and there was much interest. I pointed people at our website.
Sent from my iPad
On 3 Oct 2013, at 18:16, Paolo Romano <romano(a)inesc-id.pt> wrote:
> Hi all,
>
> even though the Cloud-TM project is officially over, we thought we'd share with you one of our last efforts, which unfortunately was a tad too late to make it into the submitted version of the platform (and deliverables etc).
>
> This is a scalable, distributed transactional index (B+tree) over ISPN, which combines a number of optimizations (in areas like data locality, concurrency, load balancing/elastic scaling) and builds over previous work (in particular, GMU [1] and Bumper [2]) that made it possible to achieve linear scalability up to 100 VMs even in update intensive workloads.
>
> Hot features:
> - at most 1 remote data access per each index operation thanks to :
> i) transaction migration,
> ii) combined use of full and partial replication (transparent and self-tuning depending on cluster size),
> iii) optimized data placement via custom hash functions
> - almost total avoidance of data contention thanks to the exploitation of commutativity operations on the index (via dirty reads and delayed actions)
> - it's built directly on top of ISPN (it does not depend on Fenix, unlike the collections' implementation that were used, e.g., by GeoGraph ).
>
> Details in the attached paper!
>
> We believe that this index implementation could be something generally useful for the ISPN community, especially given all the recent efforts in the areas of query. On the other hand, we should point out that the current implementation [3]:
> i) depends on transactional features (transaction migration, dirty reads, delayed actions) that have not been integrated in the official version of ISPN;
> ii) has been for the moment implemented as a Radargun extension, i.e. no effort was spent to modularize it/polish its API.
>
> ...so it would take some effort to have it fully integrated in the master version of ISPN.... but you know the saying: no pain no gain ;-)
>
> We'd love to hear your feedback of course!
>
> Nuno & Paolo
>
> [1] Sebastiano Peluso, Pedro Ruivo, Paolo Romano, Francesco Quaglia, and Luis Rodrigues, When Scalability Meets Consistency: Genuine Multiversion Update Serializable Partial Data Replication, 32nd International Conference on Distributed Computing Systems (ICDCS 2012)
>
> [2] Nuno Diegues and Paolo Romano, Bumper: Sheltering Transactions from Conflicts, The 32nd IEEE Symposium on Reliable Distributed Systems (SRDS 2013), Braga, Portugal, Oct. 2013
>
> [3] https://github.com/cloudtm/sti-bt
> <STI-BT-report.pdf>
11 years, 2 months
DistributedExecutorSVC.submitEverywhere(callableTask) -- dispatching task > 1x (??)
by ben.cotton
We have an ISPN 5.1.6 data grid (that executes on top of JGroups) that
includes the following topology:
2 x Linux host(s)
Each with 30 x Java VM Nodes
TOTAL = 60 Nodes
On this 60 Node grid we use the
org.infinispan.distexec.DistributedExecutorService and
org.infinispan.util.concurrent.NotifyingFuture
APIs to manage the dispatch of a MapReduce TASK that originates from a
dedicated TASK SENDER Node and targets the full set of 60 TASK RECEIVER
Nodes to complete the computation.
The exact API invocation (from the Task SENDER) looks like:
    // build the DistributedExecutorService and Callable instance references
    List<Future<T>> futureList = distExecSvc.submitEverywhere(ourCallableTask);
Now, as expected, 99+% of the time we are able to realize exactly 1 Task
being distributed to all 60 RECEIVER Nodes and we see exactly 1 Future List
entry being returned per Node submitted.
However, under very rare circumstances … (and *only* when a certain subset
of RECEIVER Nodes are enduring a major GC event) we are able to see
undeniable evidence that the callableTask is being submitted /multiple
times/ to a certain subset of the RECEIVER Nodes.
Is there any ISPN/JGroups API or configuration mechanism by which we can be
assured of being able to prevent the callableTask being submitted multiple
times to a certain subset of the RECEIVER Nodes?
Thanks for any insights,
Ben
Ben D. Cotton III
J.P.Morgan
Liquidity Risk Technology
277 Park Ave Desk 08-GG64
New York, NY 10172-0003
212.622.5010
ben.cotton(a)jpmorgan.com
11 years, 2 months
Running stress tests on CI ?
by Sanne Grinovero
Hi all,
the following change introduced a critical issue in the Lucene Directory:
final Set<String> filesList = fileOps.getFileList();
- String[] array = filesList.toArray(new String[0]);
- return array;
+ return filesList.toArray(new String[filesList.size()]);
I'll leave it as a puzzler to figure out why the change is able to cause trouble ;-)
This generates an NPE in just a single second of running one of the
stress tests or performance tests, but I'm guilty of not being able to
make a normal unit test for this case.
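One plausible reading of the puzzler, offered as an assumption since the message does not spell it out: if the file list is a concurrent set that can shrink between the size() call and the copy, the pre-sized array keeps trailing null slots, and a caller iterating over it hits the NPE. The sketch below reproduces that effect deterministically; ToArrayRace and its method names are hypothetical, and the concurrent removal is simulated by an explicit remove().

```java
import java.util.Set;

final class ToArrayRace {
    // New form: the array is sized first, then the set shrinks (simulating a
    // concurrent delete), then toArray copies fewer elements than the array
    // holds, leaving null in the unused tail.
    static String[] presizedAfterShrink(Set<String> files, String removedMeanwhile) {
        String[] target = new String[files.size()];
        files.remove(removedMeanwhile); // stands in for another thread's removal
        return files.toArray(target);
    }

    // Old form: a zero-length array forces toArray to allocate an array that
    // matches what it actually copied, so no null padding appears.
    static String[] zeroLength(Set<String> files) {
        return files.toArray(new String[0]);
    }
}
```

This would also explain why only a stress test catches it: the window between size() and toArray() is tiny, so a single-threaded unit test never sees the set change in between.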
That module contains such limited code, that in the very rare
occasions in which I apply some changes I re-run the included
benchmarks; I realize I can't expect that from all of you, so..
Should we enable some stress tests on CI?
As a side warning: as a consequence of this, the Lucene Directory in release
6.0.0.CR1 is very unreliable [ISPN-3592].
--Sanne
11 years, 2 months