[infinispan-dev] HotRod?? [ISPN-29 and a "custom protocol"]
Galder Zamarreno
galder at redhat.com
Mon Dec 14 11:55:42 EST 2009
On 12/11/2009 06:42 PM, Alex Kluge wrote:
>> In a cluster, you can't just have an atomic long starting at 0 that you
>> increase every time a node is modified, because you could easily find
>> yourself in a situation where two nodes modifying the same key might
>> generate the same cas id.
>
> Hmmm, in a cluster no matter who writes the key/value, they have to wind up
> modifying the same data, no? Otherwise the cluster becomes inconsistent.
> That is the point, where, on modification, you can increment the counter.
True. Assuming that you use sync mode, it will guarantee that even if
you use the same cas, the internal locking will avoid the inconsistency.
>
> Eviction, however, raises more challenges. Clearly, when a value is
> evicted, you can't simply restart the counter when the data is rewritten.
> This would confuse the clients. However, I also get uncomfortable putting
> something time consuming into the main write path. I guess I am a bit
> jaded from when System.currentTimeMillis() took a long time.
>
> But then again:
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6440250
>
> Also note from this that sometimes nanoTime just returns currentTimeMillis*10^6.
Interesting stuff. I understand your concerns wrt the potential speed of
such calls.
>
> Perhaps setting an initial value in a time consuming way, then a simple
> increment thereafter. The goal is to get consistent, usable behaviour,
> while maintaining performance.
Let's go for this then. I did think of this solution but had thought
that currentTimeMillis/nanoTime, being native calls, would be faster
than keeping such a counter.
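
A minimal sketch of the "seed once, then increment" idea agreed on above. The class name CasIdGenerator and its methods are hypothetical, not existing Infinispan code, and seeding from System.nanoTime() is only one possible choice of initial value:

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch: cas ids come from a counter that is seeded once
 * and then simply incremented on every write.
 */
public class CasIdGenerator {

    // Holds the last cas id handed out.
    private final AtomicLong counter;

    /** Pays for the (possibly slow) time call once, at construction. */
    public CasIdGenerator() {
        this(System.nanoTime());
    }

    /** Explicit seed, mainly so the behaviour can be checked deterministically. */
    public CasIdGenerator(long seed) {
        this.counter = new AtomicLong(seed);
    }

    /** Cheap operation on the write path: an atomic increment, no time call. */
    public long next() {
        return counter.incrementAndGet();
    }

    public static void main(String[] args) {
        CasIdGenerator generator = new CasIdGenerator();
        long first = generator.next();
        long second = generator.next();
        // Consecutive ids from one generator always differ.
        System.out.println(first != second);
    }
}
```

Because the counter does not start at 0, an entry that is evicted and later rewritten gets a cas id taken from the running counter instead of a restarted one. The sketch does not by itself stop two nodes from producing the same value.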
>
> We definitely shouldn't use completely independent CAS values for
> different protocols.
>
> Got to run,
> Alex
>
> --- On Fri, 12/11/09, Galder Zamarreno<galder at redhat.com> wrote:
>
>> From: Galder Zamarreno<galder at redhat.com>
>> Subject: Re: [infinispan-dev] HotRod?? [ISPN-29 and a "custom protocol"]
>> To: infinispan-dev at lists.jboss.org
>> Date: Friday, December 11, 2009, 2:30 AM
>>
>>
>> On 12/10/2009 08:19 PM, Alex Kluge wrote:
>>>>>> Yeah, you can get the cas id via the gets command.
>>>>>>
>>>>>>>
>>>>>>>> In the txt protocol, I implemented the cas id using
>>>>>>>> System.nanoTime().
>>>>>>>
>>>>>>> Is this cas id meant to be unique? If so, System.nanoTime() may
>>>>>>> not be ...
>>>>>>
>>>>>> Not over time. You want it to be unique from the minute you
>>>>>> retrieved it until you use it, and for operations on that
>>>>>> particular key.
>>>>>>
>>>>>> The cas id is simply used for comparison, to check whether someone
>>>>>> else has changed the entry since you retrieved it.
>>>>>> System.nanoTime() gives you precisely that: anyone that modifies
>>>>>> that entry in that machine will definitely have a different nano
>>>>>> time. Also, since nano time is based off an arbitrary time, the
>>>>>> chances that a different machine will produce the same nano time
>>>>>> when modifying that very same key are very, very small.
>>>>>
>>>>> Improbable but certainly not impossible, even in 1 machine.
>>>>>
>>>>> Why not just use what we use for 'versioning' internally - object
>>>>> references? System.identityHashCode(), for example?
>>>>
>>>> Remember that you have to deal with eviction/expiration too. You
>>>> could get a case like this:
>>>> - get a cas based on identity hash code
>>>> - the entry is expired
>>>> - a new entry is put, with diff values, which happens to have the
>>>>   same identity hash code.
>>>> - doing a cas will succeed when it shouldn't.
>>>>
>>>> I don't think this can happen with System.nanoTime().
>>>>
>>>> When looking into this, I checked java.util.UUID but it didn't
>>>> work, at least for the memcached txt protocol, where a 64-bit value
>>>> is required and UUID is 128 bits. Maybe we could try to compress it
>>>> somehow?
>>>
>>>> Actually, according to the memcached-txt protocol definition, this
>>>> is a 64-bit integer, so no real chance of compacting that. You need
>>>> to provide a long of some sort:
>>>>
>>>> http://github.com/memcached/memcached/blob/master/doc/protocol.txt
>>>
>>> Is there any expected use of this value where it will be used
>>> outside of the context of a specific key? If not, then it does not
>>> have to be globally unique, and indeed, can simply be a modification
>>> count. Easily computed, and easily fit within 64 bits even over a
>>> long lifetime of a server.
>>
>> To do that, might as well use System.nanoTime(). It's much better
>> than a modification count because each VM will start off at a
>> different offset.
>>
>> In a cluster, you can't just have an atomic long starting at 0 that
>> you increase every time a node is modified, because you could easily
>> find yourself in a situation where two nodes modifying the same key
>> might generate the same cas id. If you don't start at 0, that's pretty
>> much what you get with System.nanoTime() and I think it's better to
>> call that all the time, than taking the given long and adding one at
>> a time.
>>
>> Someone might think that you could have a counter per cache entry but
>> this definitely cannot start at 0, since after eviction, when the
>> entry is gone, it'll go back to 0 and you can have issues like the one
>> I explained above.
>>
>> Given the constraints of at least the memcached txt protocol, I think
>> System.nanoTime() is the best option in that case. For hotrod, a
>> 128-bit UUID would avoid issues mentioned here. We'd only need to make
>> sure to find a performant solution.
>>
>>>
>>> For now though the real question is can we fit a quantity that will
>>> fulfil the requirements for the field into the space allotted. I
>>> think we can.
>>>
>>>
>>>
>>
>> Alex
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>
>>> infinispan-dev mailing list
>>> infinispan-dev at lists.jboss.org
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>> --
>> Galder Zamarreño
>> Sr. Software Engineer
>> Infinispan, JBoss Cache
>
--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache