[
https://issues.jboss.org/browse/ISPN-761?page=com.atlassian.jira.plugin.s...
]
Sanne Grinovero commented on ISPN-761:
--------------------------------------
We discussed this on IRC:
{quote}
wburns: mmarkus, anistor, dberindei, galderz, pruivo, ttarant: sannegrinovero: was looking
for input on ISPN-761
wburns: mmarkus brought up a point about non passivated caches as well - and this led me
to find an issue with how size would work currently
wburns: basically you have a non passivated cache that has a user use the SKIP_CACHE_STORE
flag etc. could cause inconsistencies with size method as the cache store is no longer a
superset of the memory store
sannegrinovero: wburns: IMHO #size() should be deleted.
wburns: which would require reading all keys in memory and in the cache store to correctly
determine size
sannegrinovero: wburns: that might not happen in the near future, but just saying that I
wouldn't spend too much time on it.
wburns: sannegrinovero: 1 thought we had was not to worry about this edge case since it is
rarely used Flag outside of internal infinispan
sannegrinovero: wburns: no, size() alreasy is documented that it only returns the size in
the (local) data container, so I don't think it should load from the CacheStore at
all.
wburns: sannegrinovero: the Jira 761 is to change that and as galderz pointed out I
updated the javadoc on Cache
sannegrinovero: wburns: SKIP_CACHE_STORE? no it's super useful for client apps as
well. Quite critical actually.
wburns: for write operations for a new key that is
sannegrinovero: wburns: what's the reason to change that? -1 from me.
wburns: sannegrinovero: it is what people want I guess :/
sannegrinovero: wburns: we can't listen to all, it's primarily important that the
API stays consistent and clear.
mmarkus: I think size should also include the cache store
sannegrinovero: wburns: did you read the comments on the issue? It was moved to 6.0 so
that it could be implemented because, as Manik states there, it requires changes in the
CacheStore SPI to be implemented afficiently.
sannegrinovero: efficiently.
sannegrinovero: mmarkus: ^
mmarkus: if people want to ignore it they can always skip it for persistence
mmarkus: sannegrinovero: I'm not aware of any use cases that skip store *writes* with
flags, any example?
sannegrinovero: mmarkus: these methods are unreliable anyway as they return just the view
of the single node, people shouldn't use them at all or just in statistics (knowing
the limitations)
sannegrinovero: mmarkus: there are dozens of examples of that in the Lucene Directory
module.
sannegrinovero: mmarkus: I made the flag ;)
wburns: sannegrinovero: so you are the culprit :P jk
mmarkus: sannegrinovero: well directory module is kindof internal though :-)
sannegrinovero: mmarkus: true, but it also is an example of an application dealing with
the Infinispan API.
wburns: sannegrinovero: but yes in regards to efficiency I would agree - I have added
comments to the design doc to address those deficiencies as well
mmarkus: sannegrinovero: wburns: and agreed that involving cache stores makes more sense
for local caches
sannegrinovero:wburns: mmarkus: my vote would go to throw an
"UnsupportedOperationException" if we can't remove the ConcurrentMap API
;-)
mmarkus: sannegrinovero: on size() & co?
sannegrinovero: wburns: mmarkus: yes. unless you prefer to "throw
OutOfMemoryException() " ... :D
mmarkus: sannegrinovero: −1. I think they make sense for local caches in particular
sannegrinovero: mmarkus: makes sense for local caches without CacheStore. IFF I need a
CacheStore, and I am not loading it all in memory anyway, it likely is because there is
not enough memory.
mmarkus: sannegrinovero: you might want to get all the keys in the system for certain
scenarios and currently there's no way to do that
mmarkus: sannegrinovero: the way the currently are these methods are broken as they
won't return that. Users have been complaining about it for quite a while.
sannegrinovero: mmarkus: I know, and we discussed this many times on the mailing list. As
I said to wburns, this issue was moved to 6.0 so that we would fix the CacheStore SPI to
allow for better implementation of size() or iterate()
mmarkus: sannegrinovero: also if they won't the old functionality they can use the
SKIP_STORE flag ;)
sannegrinovero: mmarkus: also Map/Reduce is WORTHLESS until you fix that SPI method.
sannegrinovero: mmarkus: as it has the same problem: needs to iterate (map) all entries,
but they just don't fit in memory.
wburns: sannegrinovero: I wasn't around when that discussion took place... is that for
a streaming SPI or something?
sannegrinovero: mmarkus: I'm not against implementing it, no doubt that it could be
useful. It's the implementation which is simply not correct.
sannegrinovero: wburns: you are familiar with the Map/Reduce pattern?
mmarkus: sannegrinovero: +1 to implement them more efficiently, but and the required
optimisations were tracked in ISPN-3290
wburns: sannegrinovero: yeah
sannegrinovero: mmarkus: fixed the dependency for you on JIRA.
mmarkus: sannegrinovero: you have a point but it's not a hard dependecy
sannegrinovero: mmarkus: it's not a hard dependency if you document those methods with
"throws OOException most of the times"
sannegrinovero: mmarkus: BTW why would you want to fix the issue twice? this is doubling
the work if wburns has to fix it now and then fix it again to use the new SPI
mmarkus sannegrinovero: the effort duplication should be small though
sannegrinovero: mmarkus: your call. I just hope nobody will use that method.
wburns: sannegrinovero: what was the part about Map/Reduce? is it basically that cache
store entries aren't used - and we want to add a streaming API to go through them by
chunks instead of all the contents?
sannegrinovero: wburns: CacheStores are actually used, but as it is today to run a
Map/Reduce job it is loading all entries in memory to run the first Map part. Which is
pretty much defeating the purpose of the pattern.
wburns: sannegrinovero: guessing we have optimizations for shared at least :) - but
basically is just to stream through the cache store contents
sannegrinovero: wburns: to take advantage of the pattern, it should pass the mapping
definition to the store engine to have it narrow down execution (ideally) or (more
practically) to run it on an iterator from the CacheStore itself to not run in OOM at
least.
sannegrinovero: wburns: no optimisations whatsoever :(
mmarkus: sannegrinovero: I imagine it's the same way you're thinking to have
keySet() and entries() implemented as well
sannegrinovero: wburns: and that's another point for the SPI needing improvements. At
least the option to load only the entries owned by the current node should be added.
mmarkus: sannegrinovero: use keySet().iterator() to iterate over the set of keys
mmarkus: sannegrinovero: ^^?
wburns: sannegrinovero: ouch
sannegrinovero: mmarkus: exactly. But think about size() to keep it simple.. just add a
size() method on the cacheStore first, then the JDBC one (for example) could run a simple
COUNT instead of attempting to load 100GB in memory
mmarkus: sannegrinovero: yep, that's noted
{quote}
Cache.keySet(),entrySet(),values(),size() ignore contents of cache
loader
-------------------------------------------------------------------------
Key: ISPN-761
URL:
https://issues.jboss.org/browse/ISPN-761
Project: Infinispan
Issue Type: Bug
Components: Loaders and Stores
Affects Versions: 4.2.0.BETA1
Reporter: Paul Ferraro
Assignee: William Burns
Priority: Critical
Labels: onboard, potential_performance_hit
Fix For: 6.0.0.Alpha1, 6.0.0.Final
Passivated cache entries are not represented in values returned by the keySet(),
entrySet(), values(), and size() Cache methods. This results in inconsistent behavior,
since it is possible that a given cache key may not be contained in keySet() even though
Cache.get(...) would return a non-null value if the entry was previously passivated.
I think CacheLoaderInterceptor needs to implement the following methods:
Object visitSizeCommand(InvocationContext ctx, SizeCommand command) throws Throwable
Object visitValuesCommand(InvocationContext ctx, ValuesCommand command) throws
Throwable
Object visitEntrySetCommand(InvocationContext ctx, EntrySetCommand command) throws
Throwable
Object visitKeySetCommand(InvocationContext ctx, KeySetCommand command) throws
Throwable
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
http://www.atlassian.com/software/jira