[infinispan-dev] Self-tuning data placement in Infinispan
Paolo Romano
romano at inesc-id.pt
Mon Nov 19 07:21:46 EST 2012
Hi all,
attached is our latest work, a paper in which we
describe the design and implementation of AutoPlacer, a system for
self-tuning the placement of data in Infinispan (distribution mode). I
wanted to share this paper with you, as I had already mentioned our
results in this area during previous Cloud-TM project meetings.
In a nutshell, AutoPlacer combines two main innovative contributions:
- a lightweight round-based distributed optimizer. In each round, each
node uses a space-efficient top-k algorithm to analyze its stream of
data accesses and determine, in an approximate way, its own "hot
spots" (i.e., the data items that generate the most remote accesses)
and their corresponding access frequencies. The access statistics on the
top-k most accessed data items in the system are then scattered among
the nodes in the system, which solve a linear programming optimization
problem to determine the optimal placement for these data items.
- a novel data structure that we named Probabilistic Associative Array
(PAA). PAAs expose an API similar to classic associative arrays, but can
return erroneous results, when queried, with a user-tunable probability.
Internally, PAAs use probabilistic techniques (namely, Bloom filters +
decision-tree classifiers) to minimize the amount of memory required for
their encoding. AutoPlacer uses PAAs to minimize the overhead associated
with maintaining and updating the information concerning the placement
of data items across the nodes constituting an Infinispan cluster.
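
To make the two ideas above a bit more concrete, here is a minimal
Python sketch. It is only an illustration under stated assumptions, not
the AutoPlacer code: Space-Saving is used as one well-known
space-efficient top-k algorithm (this message does not say which
algorithm is used), the decision-tree classifier is replaced by a
hypothetical stand-in function (classify_by_prefix), and hash_owner is a
hypothetical placeholder for the default hash-based placement. All class
and function names are made up for the example, and the linear
programming step is left out.

```python
import hashlib
import zlib


class SpaceSaving:
    """Approximate top-k over a stream using at most `capacity` counters."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}

    def offer(self, key):
        if key in self.counts:
            self.counts[key] += 1
        elif len(self.counts) < self.capacity:
            self.counts[key] = 1
        else:
            # Replace the least-counted key; the newcomer inherits its count,
            # so frequencies may be over-estimated but never under-estimated.
            victim = min(self.counts, key=self.counts.get)
            self.counts[key] = self.counts.pop(victim) + 1

    def top(self, k):
        return sorted(self.counts.items(), key=lambda kv: kv[1], reverse=True)[:k]


class BloomFilter:
    """Plain Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class ProbabilisticAssociativeArray:
    """Assumed PAA layout: the filter says whether a key was relocated,
    the classifier (a stand-in here) says where relocated keys live."""

    def __init__(self, moved_keys, classify, default_owner,
                 num_bits=1024, num_hashes=3):
        self.filter = BloomFilter(num_bits, num_hashes)
        for key in moved_keys:
            self.filter.add(key)
        self.classify = classify
        self.default_owner = default_owner

    def get(self, key):
        # A false positive in the filter sends a non-relocated key to the
        # classifier, which is one source of erroneous answers.
        if key in self.filter:
            return self.classify(key)
        return self.default_owner(key)


def classify_by_prefix(key):
    # Hypothetical stand-in for a trained decision-tree classifier.
    return "node-2" if key.startswith("order:") else "node-0"


def hash_owner(key, num_nodes=3):
    # Hypothetical stand-in for the default hash-based placement.
    return f"node-{zlib.crc32(key.encode()) % num_nodes}"


if __name__ == "__main__":
    hot = SpaceSaving(capacity=3)
    for accessed in "aaaaabbbcd":
        hot.offer(accessed)
    print(hot.top(2))

    paa = ProbabilisticAssociativeArray(
        [f"order:{i}" for i in range(5)], classify_by_prefix, hash_owner)
    print(paa.get("order:1"), paa.get("user:9"))
```

In this sketch the memory cost of the lookup structure depends on the
filter size and on the classifier, not on the number of relocated keys,
which is the trade-off the PAA makes in exchange for occasional wrong
answers.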
The paper has been submitted to an international conference and is
currently under review.
Comments are more than welcome, of course!
Cheers,
Paolo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: autoplacer.pdf
Type: application/x-acroread
Size: 324050 bytes
Desc: not available
Url : http://lists.jboss.org/pipermail/infinispan-dev/attachments/20121119/505abce7/attachment-0001.bin