> In a cluster, you can't just have an atomic long starting at 0 that you
> increase every time a node is modified, because you could easily find
> yourself in a situation where two nodes modifying the same key might
> generate the same cas id.
Hmmm, in a cluster no matter who writes the key/value, they have to wind up
modifying the same data, no? Otherwise the cluster becomes inconsistent.
That is the point where, on modification, you can increment the counter.
Eviction, however, raises more challenges. Clearly, when a value is
evicted, you can't simply restart the counter when the data is rewritten.
This would confuse the clients. However, I also get uncomfortable putting
something time consuming into the main write path. I guess I am a bit
jaded from when System.currentTimeMillis() took a long time.
But then again:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6440250
Also note from this that sometimes nanoTime just returns currentTimeMillis*10^6.
Perhaps setting an initial value in a time consuming way, then a simple
increment thereafter. The goal is to get consistent, usable behaviour,
while maintaining performance.
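
A minimal sketch of that idea, assuming one counter per cache entry. The class
name SeededCasCounter and its methods are made up for illustration and are not
Infinispan API. The clock is read once when the entry is created, and every
later write is a plain atomic increment. Whether a fresh seed after eviction is
always different from the ids handed out earlier is not guaranteed by this
sketch; it only makes a repeat unlikely.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-entry CAS id source: seeded once, then incremented. */
final class SeededCasCounter {
    private final AtomicLong value;

    /** Pays for the (possibly slow) clock call once, when the entry is created. */
    SeededCasCounter() {
        this(System.nanoTime());
    }

    /** Explicit seed, mainly so the behaviour can be checked in isolation. */
    SeededCasCounter(long seed) {
        this.value = new AtomicLong(seed);
    }

    /** Current CAS id, as a gets-style read would return it. */
    long current() {
        return value.get();
    }

    /** Called on each modification: an atomic increment, no clock call. */
    long nextOnWrite() {
        return value.incrementAndGet();
    }

    public static void main(String[] args) {
        SeededCasCounter counter = new SeededCasCounter();
        long before = counter.current();
        long after = counter.nextOnWrite();
        System.out.println(after - before); // prints 1
    }
}
```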
We definitely shouldn't use completely independent CAS values for
different protocols.
Got to run,
Alex
--- On Fri, 12/11/09, Galder Zamarreno <galder(a)redhat.com> wrote:
From: Galder Zamarreno <galder(a)redhat.com>
Subject: Re: [infinispan-dev] HotRod?? [ISPN-29 and a "custom protocol"]
To: infinispan-dev(a)lists.jboss.org
Date: Friday, December 11, 2009, 2:30 AM
On 12/10/2009 08:19 PM, Alex Kluge wrote:
>>>> Yeah, u can get the cas id via gets command.
>>>>
>>>>>
>>>>>> In the txt protocol, I implemented the cas id using
>>>>>> System.nanoTime().
>>>>>
>>>>> Is this cas id meant to be unique? If so, System.nanoTime() may
>>>>> not be ...
>>>>
>>>> Not over time. You want to be unique from the minute you retrieved it
>>>> until you use it and for operations on that particular key.
>>>>
>>>> The cas Id is simply used for comparison, to check whether someone else
>>>> has changed the entry since you retrieved it. System.nanoTime() gives
>>>> you precisely that, anyone that modifies that entry in that machine will
>>>> definitely have a different nano time. Also, since nano time is based
>>>> off an arbitrary time, the chances that a different machine will produce
>>>> the same nano time when modifying that very same key are almost very
>>>> very small.
>>>
>>> Improbable but certainly not impossible, even in 1 machine.
>>>
>>> Why not just use what we use for 'versioning' internally - object
>>> references? System.identityHashCode(), for example?
>>
>> Remember that you have to deal with eviction/expiration too. You could
>> get a case like this:
>> - get a cas based on identity hash code
>> - the entry is expired
>> - a new entry is put, with diff values, which happens to have the same
>>   identity hash code.
>> - doing a cas will succeed when it shouldn't.
>>
>> I don't think this can happen with System.nanoTime().
>>
>> When looking into this, I checked java.util.UUID but it didn't work, at
>> least for the memcached txt protocol, where a 64bit value is required
>> and UUID is 128bits. Maybe we could try to compress it somehow?
>
>> Actually, according to the memcached-txt protocol definition, this is a
>> 64-bit integer, so no real chance of compacting that. You need to
>> provide a long of some sort:
>>
>> http://github.com/memcached/memcached/blob/master/doc/protocol.txt
>
> Is there any expected use of this value where it will be used outside of
> the context of a specific key? If not, then it does not have to be
> globally unique, and indeed, can simply be a modification count.
> Easily computed, and easily fit within 64 bits even over a long lifetime
> of a server.
To do that, might as well use System.nanoTime(). It's much better than a
modification count because each VM will start off at a different offset.
In a cluster, you can't just have an atomic long starting at 0 that you
increase every time a node is modified, because you could easily find
yourself in a situation where two nodes modifying the same key might
generate the same cas id. If you don't start at 0, that's pretty much
what you get with System.nanoTime() and I think it's better to call that
all the time, than taking the given long and adding one at a time.
Someone might think that you could have a counter per cache entry but
this definitely cannot start at 0, since after eviction, when the entry
is gone, it'll go back to 0 and you can have issues like the one I
explained above.
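
The eviction problem can be reproduced with a toy, single-node cache. This is
only a sketch: ZeroBasedCasCache and all of its methods are hypothetical names
invented here, not Infinispan or memcached code. The per-entry counter starts
at 0, so after an evict and a fresh put the new entry carries the same cas id
the client read earlier, and the stale cas goes through.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy cache (hypothetical) whose per-entry cas counter restarts at 0. */
final class ZeroBasedCasCache {
    private static final class Entry {
        final String value;
        final long cas;
        Entry(String value, long cas) { this.value = value; this.cas = cas; }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** Unconditional write: a new entry starts at 0, an existing one is bumped. */
    void put(String key, String value) {
        Entry old = entries.get(key);
        entries.put(key, new Entry(value, old == null ? 0L : old.cas + 1));
    }

    /** The cas id a client would see from a gets-style read of an existing key. */
    long casIdOf(String key) { return entries.get(key).cas; }

    String get(String key) {
        Entry e = entries.get(key);
        return e == null ? null : e.value;
    }

    void evict(String key) { entries.remove(key); }

    /** Writes only if the entry still carries the cas id the client read. */
    boolean cas(String key, String value, long expectedCas) {
        Entry old = entries.get(key);
        if (old == null || old.cas != expectedCas) return false;
        entries.put(key, new Entry(value, old.cas + 1));
        return true;
    }

    public static void main(String[] args) {
        ZeroBasedCasCache cache = new ZeroBasedCasCache();
        cache.put("k", "v1");
        long seen = cache.casIdOf("k");   // client reads cas id 0
        cache.evict("k");                 // entry is evicted
        cache.put("k", "v2");             // someone else rewrites it, cas id is 0 again
        System.out.println(cache.cas("k", "v3", seen)); // prints true: stale cas wrongly succeeds
    }
}
```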
Given the constraints of at least the memcached txt protocol, I think
System.nanoTime() is the best option in that case. For hotrod, a 128-bit
UUID would avoid issues mentioned here. We'd only need to make sure to
find a performant solution.
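
To make the size constraint concrete, here is a small sketch of why a UUID does
not fit the memcached txt field but could travel in a wider one. UuidHalves is
a hypothetical helper written for this example; how hotrod would actually
encode the value is not decided here. java.util.UUID exposes its 128 bits as
two longs, so it needs two 64-bit slots on the wire.

```java
import java.util.UUID;

/** Hypothetical helper: a UUID is two 64-bit halves, not one. */
final class UuidHalves {
    private UuidHalves() {}

    /** Splits a UUID into {most significant 64 bits, least significant 64 bits}. */
    static long[] split(UUID id) {
        return new long[] { id.getMostSignificantBits(), id.getLeastSignificantBits() };
    }

    /** Rebuilds the UUID from the two halves read back from the wire. */
    static UUID join(long mostSigBits, long leastSigBits) {
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        long[] halves = split(id);
        // Both halves are needed to get the same UUID back.
        System.out.println(id.equals(join(halves[0], halves[1]))); // prints true
    }
}
```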
>
> For now though the real question is can we fit a quantity that will
> fulfil the requirements for the field into the space allotted. I think
> we can.
>
> Alex
>
> _______________________________________________
> infinispan-dev mailing list
> infinispan-dev(a)lists.jboss.org
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache