[infinispan-dev] JGRP-1877

Bela Ban bban at redhat.com
Wed Sep 10 09:08:30 EDT 2014



On 10/09/14 13:58, Alan Field wrote:
> Hey Bela,
>
>> Just a quick heads up. I'm currently working on
>> https://issues.jboss.org/browse/JGRP-1877, which is critical as it
>> may:
>> - cause RPCs to return prematurely (possibly with a
>>   TimeoutException), or
>> - cause RPCs to block for a long time (pick which one is worse :-))
>
> How frequently can these errors occur? Is this something that is not
> very likely to happen or something that requires an external action
> to trigger it? (i.e. changing the time via NTP) Just trying to
> determine the priority of this issue.


Changing the system time will definitely screw up code that relies on 
System.currentTimeMillis(). Once I replace this with nanoTime(), this 
problem should be eliminated.
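
To illustrate, a timeout loop can be written so that only the difference between two nanoTime() readings is ever used. This is a minimal sketch, not the actual JGroups code: the class ResultWaiter and its methods setResult() and awaitResult() are hypothetical names made up for this example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical helper (not JGroups code): waits for a result with a nanoTime()-based deadline. */
final class ResultWaiter {
    private final Lock lock = new ReentrantLock();
    private final Condition cond = lock.newCondition();
    private boolean hasResult; // guarded by lock

    /** Marks the result as available and wakes up any waiter. */
    void setResult() {
        lock.lock();
        try {
            hasResult = true;
            cond.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /** Blocks until setResult() has been called or timeoutMs has elapsed. */
    void awaitResult(long timeoutMs) throws InterruptedException, TimeoutException {
        // The deadline may be negative or may have wrapped around; that is fine,
        // because it is only ever used in a subtraction.
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        lock.lock();
        try {
            while (!hasResult) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0)
                    throw new TimeoutException("no result after " + timeoutMs + " ms");
                cond.awaitNanos(remaining); // may wake up spuriously, hence the loop
            }
        } finally {
            lock.unlock();
        }
    }
}
```

Since nanoTime() is not tied to the wall clock, an NTP adjustment should not shorten or lengthen the wait in a loop like this.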

The nanoTime() problem is that an 'origin' chosen by the JVM can be in 
the future, so all calls to nanoTime() return negative values. Or - if 
positive - due to numeric overflow, the long can wrap around and become 
negative.

Once this happens, all RPCs (for example) will return immediately, 
without any response, or throw TimeoutExceptions. This will last for 292 
years... :-)
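
The wrap-around case can be reproduced without waiting 292 years by simulating the two readings. The sketch below uses a hypothetical helper class NanoCompare (a name invented for this example) and assumes the two timestamps are less than about 292 years apart.

```java
/** Hypothetical helper illustrating overflow-safe comparison of nanoTime() values. */
final class NanoCompare {
    private NanoCompare() {}

    /** Returns true if t1 was taken after t0; compares the difference, never the raw values. */
    static boolean isAfter(long t1, long t0) {
        return t1 - t0 > 0;
    }

    public static void main(String[] args) {
        long t0 = Long.MAX_VALUE - 1; // simulated reading just before the wrap-around
        long t1 = t0 + 3;             // three ticks later: wraps to a negative value
        System.out.println(t1 > t0);         // prints "false": direct comparison is fooled
        System.out.println(isAfter(t1, t0)); // prints "true": the difference is 3
    }
}
```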


> Thanks, Alan
>
> ----- Original Message -----
>> From: "Bela Ban" <bban at redhat.com>
>> To: infinispan-dev at lists.jboss.org
>> Sent: Wednesday, September 10, 2014 12:05:11 PM
>> Subject: [infinispan-dev] JGRP-1877
>>
>> Just a quick heads up. I'm currently working on
>> https://issues.jboss.org/browse/JGRP-1877, which is critical as it
>> may:
>> - cause RPCs to return prematurely (possibly with a
>>   TimeoutException), or
>> - cause RPCs to block for a long time (pick which one is worse :-))
>>
>> This is due to my misunderstanding of the semantics of
>> System.nanoTime(). I frequently have code like this, which computes
>> a future deadline for a timeout:
>>
>> long wait_time=TimeUnit.NANOSECONDS.convert(timeout, TimeUnit.MILLISECONDS);
>> final long target_time=System.nanoTime() + wait_time;
>> while(wait_time > 0 && !hasResult) { /* Wait for responses: */
>>     wait_time=target_time - System.nanoTime();
>>     if(wait_time > 0) {
>>         try {cond.await(wait_time, TimeUnit.NANOSECONDS);}
>>         catch(Exception e) {}
>>     }
>> }
>> if(!hasResult && wait_time <= 0)
>>     throw new TimeoutException();
>>
>> Variable target_time can possibly become *negative* if nanoTime()
>> returns a negative value. If so, hasResult is false and wait_time
>> negative, and therefore a TimeoutException would be thrown !
>>
>> While I'm at it, I'll also fix my uses of
>> System.currentTimeMillis(), and replace them with nanoTime(). Our
>> good friend Erik has run into issues with RPCs (using
>> currentTimeMillis()) hanging forever when their NTP-based servers
>> adjusted the time ... backwards!
>>
>> Please be aware of nanoTime() in your own code, e.g.
>>
>> long t0=nanoTime();
>> ...
>> long t1=nanoTime();
>>
>> It is *not* guaranteed that t1 > t0 because of numeric overflow
>> (t0 might be Long.MAX_VALUE-1 and t1 Long.MAX_VALUE+2, which wraps
>> around to a negative value!). The only way to compare them is by
>> their difference: t1 - t0 > 0 (t1 is more recent) or t1 - t0 < 0
>> (t0 is more recent).
>>
>> Just thought I'd pass this on, in case somebody made the
>> same stupid mistake...
>>
>> Thanks to David Lloyd for pointing this out !
>>
>> -- Bela Ban, JGroups lead (http://www.jgroups.org)
>> _______________________________________________ infinispan-dev
>> mailing list infinispan-dev at lists.jboss.org
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)

