[Design of JBossCache] - Re: Custom data versions
by bstansberry@jboss.com
Jason and I had a good IM discussion of versioning issues; posting it here for the record:
anonymous wrote :
| (03:18:30 PM) Jason Greene: regarding this versioning discussion
| (03:18:38 PM) Jason Greene: i have this feeling we are overlooking something
| (03:18:53 PM) Jason Greene: i recall a very long discussion with gavin and steve about this last year
| (03:19:34 PM) besYIM: LOL. that's why I said go slow
| (03:19:53 PM) Jason Greene: which brings me to my question
| (03:20:30 PM) Jason Greene: does the hibernate integration require all writers to the db to use hibernate
| (03:20:49 PM) Jason Greene: in other words is it "option A" as it used to be called in CMP land
| (03:21:15 PM) besYIM: i was going to answer and then you threw in option A and confused me :)
| (03:21:17 PM) Jason Greene: because if it doesnt we need versions
| (03:22:20 PM) Jason Greene: well by option A i mean "hibernate has exclusive write access to db"
| (03:22:48 PM) besYIM: if data is cached and someone updates the db outside of hibernate, the cache will be incorrect until eviction flushes that data out
| (03:23:20 PM) Jason Greene: ok because the scenario i came up with
| (03:23:31 PM) besYIM: yeah, the PFER semantic of aborting if node exists does not include any ability to analyze the node
| (03:23:47 PM) besYIM: that's true even w/ OPTIMISTIC
| (03:23:48 PM) Jason Greene: is that 2 readers
| (03:23:52 PM) Jason Greene: see different values
| (03:24:08 PM) Jason Greene: since db writers are happening in a different app
| (03:24:29 PM) Jason Greene: then its a race to which value wins
| (03:24:39 PM) Jason Greene: and it wont necessarily be the most current
| (03:24:53 PM) Jason Greene: if however there was a version
| (03:25:04 PM) Jason Greene: this would not be a problem
| (03:26:24 PM) Jason Greene: also async replication
| (03:26:31 PM) Jason Greene: would make versions important
| (03:26:54 PM) Jason Greene: since two conflicting updates would arrive out of order
| (03:27:01 PM) Jason Greene: s/would/may
| (03:27:16 PM) besYIM: ok, let's define some terms so we're on the same page:
| (03:27:28 PM) besYIM: insert/update --> db write, cache write
| (03:27:36 PM) besYIM: put --> db read, cache write
| (03:28:15 PM) besYIM: an insert/update should not go async; if it does it's a misuse, or at least means you're not concerned about consistency
| (03:28:24 PM) Jason Greene: ok yeah
| (03:29:14 PM) besYIM: a put can go async (if replicated). but there its a PFER and aborts if the node already exists
| (03:29:49 PM) besYIM: and the node would only exist w/ out-of-date data if the db was updated externally
| (03:30:28 PM) Jason Greene: right, what about transaction commit order
| (03:30:36 PM) Jason Greene: db commits first
| (03:30:42 PM) Jason Greene: or cache commits first
| (03:30:45 PM) Jason Greene: which order is it?
| (03:30:49 PM) besYIM: db first
| (03:31:06 PM) besYIM: wait, let me try again
| (03:31:33 PM) besYIM: hibernate flushes to db
| (03:31:56 PM) besYIM: beforeCompletion (prepare) phase on cache
| (03:32:00 PM) besYIM: db commits
| (03:32:17 PM) besYIM: afterCompletion() phase in hibernate
| (03:32:24 PM) besYIM: afterCompletion() phase in JBC
| (03:33:43 PM) Jason Greene: ok so back to the put
| (03:34:41 PM) Jason Greene: put is async, so technically could happen after a insert, which is why manik was suggesting a write lock used in PFER right?
| (03:36:01 PM) besYIM: yes, it could happen after an insert. but a PFER should abort if the node is already present, it shouldn't try to lock the node
| (03:36:14 PM) besYIM: how MVCC handles that internally, I don't know
| (03:38:20 PM) Jason Greene: right so its a race because server 1 does a pfer, node does not exist, and an async replication update occurs
| (03:38:52 PM) Jason Greene: server 2 does a insert which does a sync write, this could come before the async update
| (03:39:22 PM) besYIM: there is no such thing as an async replication update. there is an async put replication
| (03:39:39 PM) Jason Greene: right sorry async put replication is queued
| (03:39:49 PM) besYIM: (sorry, being anal so i understand)
| (03:39:54 PM) Jason Greene: sync update happens first
| (03:40:18 PM) besYIM: yeah and when the async PFER comes in, it aborts because the node already exists
| (03:40:48 PM) Jason Greene: ah because the node it is applying to checks for existence
| (03:41:01 PM) Jason Greene: not just the origin
| (03:41:42 PM) besYIM: yeah.
| (03:42:04 PM) besYIM: that's what allows async to work
| (03:43:19 PM) Jason Greene: ok so as long as hibernate has exclusive access I cant think of a problem
| (03:43:58 PM) besYIM: versioning could help the external-to-hibernate write scenario, but IMO only worth it if we don't introduce locking problems into PFER (during the version check)
| (03:44:13 PM) besYIM: which, TBH would be a nice improvement
| (03:44:22 PM) Jason Greene: well the versioning check could be a substitution for pfer
| (03:44:30 PM) Jason Greene: essentially you are adding a precondition
| (03:44:35 PM) Jason Greene: to every update
| (03:44:43 PM) Jason Greene: like update blah where v = 1
| (03:44:58 PM) Jason Greene: order is irrelevant
| (03:45:02 PM) Jason Greene: when you do that
| (03:45:23 PM) besYIM: but that precondition requires a read of the node, and hence a lock
| (03:46:01 PM) besYIM: the existing PFER says "i'm willing to give up some puts that may be ok to avoid any possibility of locking issues"
| (03:46:33 PM) Jason Greene: no locks required, because you can safely read any old version and the update would be ignored
| (03:46:41 PM) Jason Greene: it has to be an atomic operation though
| (03:47:46 PM) Jason Greene: a weird pseudocode expression to explain what i mean
| (03:47:46 PM) Jason Greene: :
| (03:48:15 PM) Jason Greene: put("/blah/1", "key", "value", "version == 3");
| (03:48:32 PM) Jason Greene: when thats applied a WL is obtained
| (03:48:36 PM) Jason Greene: the version is checked
| (03:48:46 PM) Jason Greene: and only if it is valid it is applied
| (03:50:44 PM) besYIM: if the lock acquisition timeout is 0 ms that's probably OK
| (03:50:55 PM) Jason Greene: well this would all be async
| (03:51:03 PM) Jason Greene: for puts and for insert/update
| (03:51:16 PM) besYIM: not on the origin node.
| (03:52:26 PM) besYIM: ok there's another benefit to versions -- a slightly better semantic for async insert/update
| (03:52:28 PM) Jason Greene: ah right, yes, a lock is acquired but its essentially atomic since everything would be async
| (03:53:06 PM) Jason Greene: oh and
| (03:53:09 PM) Jason Greene: its only acquired
| (03:53:19 PM) Jason Greene: if it doesnt exist locally and the version is different
| (03:53:26 PM) Jason Greene: so for example
| (03:54:04 PM) Jason Greene: int cacheVersion = getVersion("/a/b");
| (03:54:16 PM) Jason Greene: if (cacheVersion < dbVersion) put
| (03:54:54 PM) Jason Greene: of course the goal is not to look at the db
| (03:55:16 PM) Jason Greene: so you wouldnt do this all the time
| (03:55:26 PM) Jason Greene: external apps would still be a problem
| (03:56:18 PM) besYIM: yep. this versioning discussion only really helps if the origin node doesn't have it cached AND REPL_xxx is used
| (03:56:38 PM) besYIM: if the origin node has it cached, hibernate doesn't know about the db change
| (03:56:56 PM) Jason Greene: right so a versioning solution really just improves updates as you say
| (03:57:12 PM) besYIM: if INVALIDATION is used the PFER doesn't generate a cluster-wide msg
| (03:57:20 PM) Jason Greene: it doesnt solve the non-external problem
| (03:57:24 PM) Jason Greene: erm
| (03:57:28 PM) Jason Greene: non-exclusive hibernate
| (03:57:58 PM) Jason Greene: youd have to have some kind of eviction policy
| (03:58:06 PM) Jason Greene: which is what we do today right?
| (03:59:02 PM) besYIM: yes, that's the way to handle it. The Hibernate API also lets you flush the cache, i.e. if you had code that somehow was aware of the DB update it could tell hibernate to flush the cache
| (04:00:39 PM) Jason Greene: it seems that you still really want sync writes
| (04:00:50 PM) Jason Greene: because otherwise someone might see an old value
| (04:00:54 PM) besYIM: yep
| (04:00:55 PM) Jason Greene: although briefly
| (04:01:20 PM) Jason Greene: alright im convinced we dont need versions
| (04:01:23 PM) Jason Greene: :)
| (04:01:44 PM) besYIM: the "slightly better semantic for async insert/update" can be achieved with OPTIMISTIC, which has versions
| (04:02:04 PM) besYIM: although i guess that's going away
| (04:02:15 PM) Jason Greene: yeah but it still runs the risk
| (04:02:18 PM) Jason Greene: of a stale read
| (04:02:48 PM) besYIM: yes, just avoids the problem of an older update overwriting a newer
| (04:03:29 PM) Jason Greene: MVCC solves that with locks
| (04:03:31 PM) Jason Greene: however
| (04:03:43 PM) Jason Greene: there is a deadlock potential
| (04:03:48 PM) Jason Greene: like i mentioned in the thread
| (04:04:10 PM) Jason Greene: it would require simultaneous update/inserts
| (04:04:22 PM) Jason Greene: which should be unlikely
| (04:04:30 PM) Jason Greene: since the db holds a write lock
| (04:04:36 PM) Jason Greene: as well
| (04:05:12 PM) besYIM: yes. it would only be a problem w/ async since the db lock can be released. And async == no deadlock in JBC
| (04:06:44 PM) besYIM: you mind if i cut and paste the relevant part of this into the forum thread? this was a good discussion
| (04:06:51 PM) Jason Greene: sure go for it
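To make the two put semantics discussed above concrete, here is a minimal Java sketch. It is not JBoss Cache code: VersionedCacheSketch, Entry, putIfNewer and the ConcurrentHashMap backing store are all assumptions made for illustration, and putForExternalRead here only imitates the "abort if the node already exists" behaviour described in the chat. The version-checked put follows the "newer version wins" rule from the `cacheVersion < dbVersion` example rather than an exact `version == 3` match.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical stand-in for a cache node store; not the JBoss Cache API. */
final class VersionedCacheSketch {

    /** Immutable value plus the version it was written with. */
    static final class Entry {
        final String value;
        final long version;

        Entry(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final ConcurrentMap<String, Entry> nodes = new ConcurrentHashMap<>();

    /** PFER-like put: succeeds only if the node is absent and never inspects an existing node. */
    boolean putForExternalRead(String fqn, String value, long version) {
        return nodes.putIfAbsent(fqn, new Entry(value, version)) == null;
    }

    /** Version-checked put: applied atomically, and only if the incoming version is newer. */
    boolean putIfNewer(String fqn, String value, long version) {
        Entry incoming = new Entry(value, version);
        Entry stored = nodes.merge(fqn, incoming,
                (current, candidate) -> candidate.version > current.version ? candidate : current);
        // merge returns the entry now stored; identity tells us whether ours won.
        return stored == incoming;
    }

    Entry get(String fqn) {
        return nodes.get(fqn);
    }
}
```

With putIfNewer, two conflicting updates can arrive in either order and the node still ends up holding the higher version, which is the property that would matter for async replication. It does nothing for writes made outside Hibernate, since those never reach the cache.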
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4164848#4164848
[Design of Messaging on JBoss (Messaging/JBoss)] - Re: Sweet life on the journal...
by clebert.suconic@jboss.com
anonymous wrote : Question: on the record sequence number - how can you be sure an old sequence number from a reclaimed file doesn't by chance match the correct sequence number? In that case you wouldn't be able to detect the failure, right?
Also, at the end of every record I'm writing the used size, to validate the "health" of the record.
For a record to be considered healthy, the record type must be between 10 and 19, the currentFileId at the second position must match the currentFileId being used (to validate that it is not left over from a previous usage of the file), and the checkSize at the end must match the exact number of bytes written. This is also an extra check for Append and Update records.
If those conditions are not met, the record is considered broken, and I keep moving byte by byte through the journal file until I find another byte between 10 and 19 where all of the above conditions are met.
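The health check and the byte-by-byte resynchronisation described above could look roughly like the following Java sketch. The exact record layout is an assumption (one type byte, a 4-byte file id, a 4-byte body length, the body, then a 4-byte checkSize equal to the total record size), and JournalScanSketch and its methods are hypothetical names, not the actual journal code.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the record health check; the layout is assumed. */
final class JournalScanSketch {
    static final int HEADER = 1 + 4 + 4; // type, fileId, bodyLength
    static final int TRAILER = 4;        // checkSize

    /** Encodes one record in the assumed layout. */
    static byte[] encode(byte type, int fileId, byte[] body) {
        int total = HEADER + body.length + TRAILER;
        return ByteBuffer.allocate(total)
                .put(type).putInt(fileId).putInt(body.length)
                .put(body).putInt(total).array();
    }

    /** Returns the record size if a healthy record starts at pos, otherwise -1. */
    static int healthyRecordSize(ByteBuffer file, int pos, int currentFileId) {
        int remaining = file.limit() - pos;
        if (remaining < HEADER + TRAILER) return -1;
        byte type = file.get(pos);
        if (type < 10 || type > 19) return -1;                  // record type range
        if (file.getInt(pos + 1) != currentFileId) return -1;   // left over from an earlier use
        int bodyLength = file.getInt(pos + 5);
        if (bodyLength < 0 || bodyLength > remaining - HEADER - TRAILER) return -1;
        int total = HEADER + bodyLength + TRAILER;
        return file.getInt(pos + total - TRAILER) == total ? total : -1; // checkSize
    }

    /** Returns the offsets of all healthy records, skipping broken bytes one at a time. */
    static List<Integer> scan(ByteBuffer file, int currentFileId) {
        List<Integer> offsets = new ArrayList<>();
        int pos = 0;
        while (pos < file.limit()) {
            int size = healthyRecordSize(file, pos, currentFileId);
            if (size > 0) {
                offsets.add(pos);
                pos += size;   // jump over the healthy record
            } else {
                pos++;         // broken: advance a single byte and try again
            }
        }
        return offsets;
    }
}
```

The scan only uses absolute reads, so the buffer's position is left untouched; a caller would flip() a freshly filled buffer (or wrap a byte array) before passing it in.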
One possibility: we could "maybe" replace the checkSize with a hash calculated over the byte array content. That would all but eliminate the small possibility of coincidences, such as messages being magically created by crashed files.
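As a rough illustration of the hash idea, the sketch below computes a CRC-32 over the record bytes using java.util.zip.CRC32 from the JDK. Choosing CRC-32 is an assumption (the post does not name a hash), and RecordChecksumSketch and its methods are hypothetical names.

```java
import java.util.zip.CRC32;

/** Hypothetical sketch: a CRC-32 trailer in place of the checkSize. */
final class RecordChecksumSketch {

    /** Computes the CRC-32 of length bytes of record, starting at offset. */
    static long checksum(byte[] record, int offset, int length) {
        CRC32 crc = new CRC32();
        crc.update(record, offset, length);
        return crc.getValue();
    }

    /** True if the stored trailer value matches the bytes actually found on disk. */
    static boolean matches(byte[] record, int offset, int length, long stored) {
        return checksum(record, offset, length) == stored;
    }
}
```

Unlike a size field, the checksum depends on every byte of the record, so stale content of the right length would still be rejected except in the rare case of a checksum collision.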
And BTW, one thing I was wondering about, just to fix terminology: from now on I will say Reload for reading the journal files back into memory, and I will keep the word Recovery for transactions only.
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4164822#4164822
[Design of JBossCache] - Re: Service threads
by bstansberry@jboss.com
Somewhat. I'm happy to access the existing EvictionPolicy associated with the region (which is part of the public API of Region), get the EvictionAlgorithm from that (which is part of the public EvictionPolicy API), and invoke process() on that (which is part of the public EvictionAlgorithm API).
Looking at the eviction system in JBC, it seems nicely set up to work a la carte:
1) An interceptor that generates events and passes them to the region, which queues them. (This is somewhat coupled to #2 via the EvictionPolicy.canIgnoreEvent() call the interceptor makes.)
2) The EvictionPolicy/Algorithm, which can take and process the region's queue of events, determine what to evict, and then evict it.
3) The evict() API on the cache, which is used by EvictionPolicy/Algorithm but also allows self-managed eviction.
4) A thread-management system that kicks off #2.
So, the main thing is that I think JBC should support these combos from the a la carte menu:
#1 + #2 + #3 + #4 (of course)
#3
#1 + #2 + #3.
If configuring this last one requires doing something a bit hacky that's OK by me, although it's better to just be able to say in a config "no #4 please". Again, to me the main thing is to keep those API calls I described above available. [1]
I don't see any point in JBC trying to support #1 + #3 (i.e. add a way to bypass the EvictionInterceptor call to EvictionPolicy.canIgnoreEvent()). If someone wants that they can just implement a CacheListener to get events.
[1] Probably a good separate discussion is the API of EvictionPolicy/EvictionAlgorithm. Currently those are a bit of a mix of API and SPI, with an implementation detail (delegate to EvictionAlgorithm) mixed in.
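For illustration, the "#1 + #2 + #3 without #4" combination amounts to the caller's own thread driving the call chain described at the top of this post. The Java sketch below uses simplified stand-in types that only borrow the names Region, EvictionPolicy and EvictionAlgorithm; their signatures, the eventQueue and evicted fields, and SelfManagedEviction are assumptions, not the real JBC API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Simplified stand-ins named after the types in the post; signatures are assumed. */
interface EvictionAlgorithm {
    void process(Region region);
}

interface EvictionPolicy {
    EvictionAlgorithm getEvictionAlgorithm();
}

final class Region {
    /** #1: events queued for this region (here just node names). */
    final Queue<String> eventQueue = new ArrayDeque<>();
    /** Records what the algorithm chose to evict, standing in for #3. */
    final List<String> evicted = new ArrayList<>();

    private final EvictionPolicy policy;

    Region(EvictionPolicy policy) {
        this.policy = policy;
    }

    EvictionPolicy getEvictionPolicy() {
        return policy;
    }
}

final class SelfManagedEviction {
    /** #2 driven by the caller's thread instead of the cache's own timer (#4). */
    static void runOnce(Region region) {
        region.getEvictionPolicy().getEvictionAlgorithm().process(region);
    }
}
```

An application with its own thread management would call runOnce from whatever scheduler it already has, which is why keeping those three accessors public matters more than how "no #4" is configured.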
View the original post : http://www.jboss.com/index.html?module=bb&op=viewtopic&p=4164818#4164818