Our process was pretty straightforward. We (OK, I) picked Memcache for
three reasons:
1) Automatic expiration of records means I don't have to do my own
garbage collection.
2) AWS offered ElastiCache, which was built on Memcache and could
allow us to move to Couchbase if needed (it's not).
3) Our Ops folk suggested Memcache because they've had the fewest
problems dealing with it.
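
To make reason 1 concrete, here is a minimal sketch of what record expiration buys you. It is a toy in-memory stand-in, not a Memcache client; the class name ExpiringStore, its methods, and the "uaid.chid" key format are all hypothetical and only illustrate the idea that a stale record disappears on its own, with no separate cleanup job.

```python
import time


class ExpiringStore:
    """Toy key/value store with a per-record TTL (hypothetical, not a Memcache client)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._data = {}      # key -> (value, absolute expiry time)

    def set(self, key, value, ttl):
        """Store value under key; it stops being readable after ttl seconds."""
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        """Return the value, or None if the key is missing or has expired."""
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Drop the stale record on access; no background sweep is needed.
            del self._data[key]
            return None
        return value
```

The caller only ever picks a TTL at write time; nothing in the application has to track or delete abandoned records.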
Since I started working on things, a few other things have happened:
1) AWS now offers a managed Redis service
2) Redis became a bit more reliable in a subsequent release. (I haven't done
a lot of rigorous checks there, since most of the problems come about
because Redis REALLY doesn't like it when memory gets full.)
3) Some of the storage requirements changed.
Ah, the joys of initial implementations of a service. This is why
abstraction is a dear, sweet friend of mine.
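
As a rough sketch of the kind of abstraction meant here (not the actual server code), the calling code can be written against a small storage interface so that the backend can be swapped later. The names Storage, DictStorage, and latest_version, and the idea of storing a "version" per UAID/CHID pair, are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    """Hypothetical storage interface; a Memcache- or Redis-backed class would implement it."""

    @abstractmethod
    def put(self, uaid, chid, version):
        """Record the latest version for one channel of one user agent."""

    @abstractmethod
    def get(self, uaid, chid):
        """Return the stored version, or None if there is no record."""


class DictStorage(Storage):
    """In-memory backend, standing in for a real data store."""

    def __init__(self):
        self._records = {}

    def put(self, uaid, chid, version):
        self._records[(uaid, chid)] = version

    def get(self, uaid, chid):
        return self._records.get((uaid, chid))


def latest_version(store: Storage, uaid, chid):
    """Caller code depends only on the interface, never on a concrete backend."""
    return store.get(uaid, chid)
```

Changing data stores then means writing one new subclass rather than touching every caller.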
Honestly, the thing I'm having the hardest time trying to work out is
the optimal way that records should be stored and accessed in the
various data stores. As anyone with a link to
http://www.eecs.berkeley.edu/~rcs/research/interactive_latency.html can
tell you, pulling stuff off a wire is one of the worst things you can
do. So, the question is "Do you pull lots of data in one block, or lots
of blocks of little data?" Either way, there's going to be a factor of
inefficiency. (E.g., someone who has a lot of CHIDs but only one or
two very active ones would have a pretty good sized block of extra data
pumped back and forth in the case of a single record organized by UAID,
vs. someone with a lot of very active CHIDs having to do lots of little
updates in the case of one record per CHID.)
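
A back-of-the-envelope sketch of that trade-off, assuming records are serialized as JSON and each channel holds a single integer; the function names, the "chid-N" keys, and the 100-channel example are made up for illustration and are not use metrics.

```python
import json


def bytes_per_update_single_record(channels):
    """Layout A: one record per UAID, so every update rewrites the whole block."""
    return len(json.dumps(channels).encode("utf-8"))


def bytes_per_update_per_channel(channels, chid):
    """Layout B: one record per CHID, so an update writes only that channel."""
    return len(json.dumps({chid: channels[chid]}).encode("utf-8"))


# Hypothetical user agent with 100 channels, of which only one is active.
channels = {f"chid-{i}": 1 for i in range(100)}
whole = bytes_per_update_single_record(channels)
one = bytes_per_update_per_channel(channels, "chid-0")
```

Here `one` is far smaller than `whole`, which is the extra data pumped back and forth under layout A. The cost flips when everything is needed at once: layout A reads all channels in one request, while layout B needs one request per channel.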
I'm not really going to try to address it right now, since it's
premature optimization without actual use metrics, and there be both
dragons and hairy yaks there, but I tend to like having options on hand
for when things start to go wonky.
I'll be interested to see what y'all pick, since your constraints and
benefits aren't going to be the same.
On 2013/9/25 5:16 AM, Daniel Bevenius wrote:
To get a feel for what would be involved in using a key/value datastore,
we've done some experimenting with Redis[1]. There might be other
non-relational databases more suited, or perhaps Redis is a good choice
for this; I don't know. But I think we should decide if this is worth
pursuing, and in that case what database to use, before spending more
time on this.
Let us know what you think.
[1] https://gist.github.com/danbev/6606289#using-redis-as-a-data-store
On 19 September 2013 18:08, JR Conlin <jrconlin(a)gmail.com> wrote:
On 2013/9/19 5:18 AM, Lucas Holmquist wrote:
>
> On Sep 19, 2013, at 12:34 AM, Daniel Bevenius
> <daniel.bevenius(a)gmail.com> wrote:
>
>> >I wonder what kind of numbers would we get by ditching JPA
>> completely and using a non-relational DB like Redis
>> Yeah, I think we will most likely need to if we want to come
>> close to the other implementations performance-wise. Others use
>> Memcache and I've seen MongoDB in use as well.
>>
>> Perhaps I should just add performance tests for the rest of the
>> SimplePush operations so that we have them covered and then look
>> into using a non-relational DB. Once that is done we can revisit
>> this performance task.
>> What do people think about that?
>
> +1, relational DBs are dinosaurs
Hardly. It's just a question of what the right tool for a given
job is. (I'll note that Google is spending quite a bit of time and
effort improving MariaDB because they use a LOT of relational DBs
for very large data.)
In this case, however, it's pretty easy to reduce things to simple
key/value. I picked Memcache partly because it auto-expires
records, which means that I don't have to do
garbage collection on uncollected records. If you switched to an
alternate schema (such as keeping a single record per UAID that
contained all the CHID data as well as stuff like the proprietary
info or other crap), you could even use simple flat files and skip
the DB requirement altogether.
We were kicking around the idea of storing only undeliverable data
in the DB, and instead letting each websocket connector deal with
managing its own data. For our implementation, I've already
prioritized delivery over storage for connected clients and seen a
fair bit of improvement on delivery. (Remember, SimplePush is not
a 100% guaranteed delivery system, so please avoid using it for
nuclear reactor management or pacemakers.)
We'll probably hold off on doing further memory refinement until
we get some actual use data, but I like having options available.
>
>> On 19 September 2013 06:03, Bruno Oliveira <bruno(a)abstractj.org> wrote:
>>
>> Hmmm tempting idea :)
>>
>> > On Sep 19, 2013, at 12:23 AM, Douglas Campos <qmx(a)qmx.me> wrote:
>> >
>> > That's a nice report!
>> >
>> > I wonder what kind of numbers would we get by ditching JPA
>> > completely and using a non-relational DB like Redis...
>> >
>> > --
>> > qmx
>> > _______________________________________________
>> > aerogear-dev mailing list
>> > aerogear-dev(a)lists.jboss.org
>> <mailto:aerogear-dev@lists.jboss.org>
>> >
https://lists.jboss.org/mailman/listinfo/aerogear-dev