[hibernate-issues] [Hibernate-JIRA] Commented: (HHH-5490) dirty data be inserted into 2L cache

Strong Liu (JIRA) noreply at atlassian.com
Wed Sep 15 12:15:22 EDT 2010


    [ http://opensource.atlassian.com/projects/hibernate/browse/HHH-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=38410#action_38410 ] 

Strong Liu commented on HHH-5490:
---------------------------------

Aug 31 00:11:55 <stliu_>	sebersole, sorry, network dropped
Aug 31 00:12:04 <sebersole>	hardy_: i talked to christian
Aug 31 00:12:12 <stliu_>	last msg [23:53]  <stliu> yes, that's true, but can't we delay the cache happens after tx complete?
Aug 31 00:12:30 <sebersole>	he has not done any of the docbook integration work
Aug 31 00:12:44 <hardy_>	i see
Aug 31 00:13:08 <sebersole>	and would rather just sit there and tell me how i really ought to just switch to xhtml+lemma and write translation tools myself
Aug 31 00:13:22 <sebersole>	(if you know christian, you know what i mean) ;)
Aug 31 00:13:32 <hardy_>	i know exactly what you mean
Aug 31 00:13:36 <sebersole>	stliu_: you missed a lot
Aug 31 00:14:00 <sebersole>	[10:55] <sebersole> take the second case...  how would that work?
Aug 31 00:14:11 <sebersole>	[11:07] <sebersole> to answer your question specifically, i dont know that we dont wait already
Aug 31 00:14:15 <sebersole>	[11:08] <sebersole> the problem is that we see "authoritative reads" come "over top of it"
Aug 31 00:14:20 <sebersole>	[11:08] <sebersole> like i said, putFromRead makes certain assumptions
Aug 31 00:14:25 <sebersole>	[11:09] <sebersole> in the first case (refresh) there is a way to circumvent that, because we know the entity's EntityEntry
Aug 31 00:14:29 <sebersole>	[11:10] <sebersole> we can check it for the entry's current status and act accordingly
Aug 31 00:14:33 <sebersole>	[11:11] <sebersole> (essentially route the cache data refresh logic through the same process that occurs for generated property values, which is what those people should be doing anyway)
Aug 31 00:14:38 <sebersole>	[11:11] <sebersole> as i said, though, i am not sure what you base that decision on in this second case
Aug 31 00:15:17 <sebersole>	(and jira is back)
Aug 31 00:18:15 <stliu_>	looking
Aug 31 00:19:37 <sebersole>	anyone have thoughts on HHH-2224
Aug 31 00:19:38 <jbossbot>	jira [HHH-2224] executeUpdate causes coarse cache invalidation [Open, Major, Unassigned] http://opensource.atlassian.com/projects/hibernate/browse/HHH-2224
Aug 31 00:19:50 <sebersole>	i find that i am sometimes too  strict
Aug 31 00:20:22 <sebersole>	basically they are asking for a way to not have bulk hql operations invalidate the query cache
Aug 31 00:20:47 <sebersole>	grr, sorry
Aug 31 00:20:52 <sebersole>	not the query cache
Aug 31 00:20:58 <sebersole>	the second level cache
Aug 31 00:40:51 <sebersole>	stliu_: out of curiosity, the entity is not "served" from the L2 cache right (in the second case)?
Aug 31 00:41:29 <stliu_>	the second get?
Aug 31 00:41:34 <sebersole>	right
Aug 31 00:41:54 <sebersole>	no
Aug 31 00:41:58 <sebersole>	sorry, the first
Aug 31 00:42:04 <stliu_>	yes
Aug 31 00:42:42 <stliu_>	it hasn't been put into the 2L cache by the insert
Aug 31 00:44:53 <sebersole>	what cache?
Aug 31 00:45:15 <sebersole>	for a non-transactional cache it should have
Aug 31 00:45:31 <sebersole>	but it should have just been a "holder" + lock
Aug 31 00:45:42 <sebersole>	just trying to make sure that part is all working
Aug 31 00:46:32 <sebersole>	pretty sure the first case is not overly difficult to account for, just trying to fathom this second case..
Aug 31 00:48:32 <stliu_>	readwritecache
Aug 31 00:48:42 <stliu_>	sebersole, ^^
Aug 31 00:48:50 <sebersole>	stliu_: yep, ^^
Aug 31 00:52:09 <sebersole>	epbernard: tbh, i am looking back at these tutorials and I am not seeing all that much that really requires having the source code "there"
Aug 31 00:52:25 <sebersole>	*maybe* the first usage of Session
Aug 31 00:53:37 <sebersole>	*maybe* the hbm.xml for the date property
Aug 31 00:54:15 <sebersole>	i mean really looking at it
Aug 31 00:55:00 <epbernard>	sebersole: yes but you know hibernate very well and tutorials are for dummies (no offence, just stating a state)
Aug 31 00:55:18 <epbernard>	so seeing what your code will look like makes a big difference
Aug 31 00:56:11 <sebersole>	i think i am being pretty objective
Aug 31 00:56:23 <sebersole>	and i have actually asked some users :)
Aug 31 00:56:45 <sebersole>	they find the callout-ed code blocks we use over-the-top
Aug 31 00:56:55 <sebersole>	thats been pretty consistent
Aug 31 00:58:00 <sebersole>	so rather than a broad "we need code", I'm just suggesting we actually look at it and decide (judiciously) where
Aug 31 00:58:51 <sebersole>	like i cannot see any benefit to fragments from the hibernate.cfg.xml file
Aug 31 00:58:59 <sebersole>	how is that anything but fluff
Aug 31 00:59:34 <sebersole>	the entity?  eh, again, why.  its a bland boring java bean
Aug 31 11:13:44 <sebersole>	stliu_: you need any guidance on that case?
Aug 31 11:13:55 <sebersole>	about to drop off
Aug 31 11:14:03 <stliu_>	sebersole, yeah, please
Aug 31 11:15:23 <stliu_>	sebersole, i mean the guidance :D
Aug 31 11:17:20 <sebersole>	you have specific questions?
Aug 31 11:18:18 <stliu_>	i'm not sure where to apply the entityentry's state you mentioned
Aug 31 11:19:52 <sebersole>	in the refresh event listener
Aug 31 11:20:40 <sebersole>	it sort of depends on its status
Aug 31 11:20:48 <sebersole>	what is the status on the refresh?
Aug 31 11:21:05 <sebersole>	it was flushed, so MANAGED?
Aug 31 11:21:08 <stliu_>	loading
Aug 31 11:21:39 <sebersole>	before the refresh listener sets it to that i mean
Aug 31 11:21:55 <sebersole>	it'll either be SAVING or MANAGED
Aug 31 11:22:13 <stliu_>	yeah
Aug 31 11:22:22 <sebersole>	well which? ;)
Aug 31 11:22:30 <sebersole>	its a very important distinction
Aug 31 11:24:02 <stliu_>	it is managed
Aug 31 11:25:14 <sebersole>	ugh
Aug 31 11:28:17 <sebersole>	essentially we need a way to identify entities which were created in this current transaction, and we dont have that in either case here
Aug 31 11:29:33 <sebersole>	a quick hack would be to keep a journal of the EntityKeys created on the PC and clear it on transaction completion
Aug 31 11:30:31 <stliu_>	sebersole, i also looked org.hibernate.engine.TwoPhaseLoad.initializeEntity(Object, boolean, SessionImplementor, PreLoadEvent, PostLoadEvent)
Aug 31 11:31:03 <sebersole>	tbh, i wonder if we have the same issue with generated properties
Aug 31 11:31:11 <stliu_>	i'm wondering is it possible to use a AfterTransactionCompletionProcess instead of put to cache after load immediately 
Aug 31 11:31:42 <sebersole>	good lord no
Aug 31 11:32:09 <sebersole>	what does that buy you anyway?
Aug 31 11:32:17 <sebersole>	you'd still do the put
Aug 31 11:33:33 <stliu_>	but we can check if the tx success
Aug 31 11:34:16 <sebersole>	then how do you propose to stop other sessions from writing that data to cache?
Aug 31 11:35:37 <sebersole>	i think you are trying to "fix it up on the back end"
Aug 31 11:35:56 <sebersole>	the problem is not the put into the cache on refresh
Aug 31 11:36:22 <sebersole>	the problem is that we lose sight of the fact that this entity was just inserted and is not permanent yet
Aug 31 11:37:48 <stliu_>	is it only just inserted? what about updated
Aug 31 11:38:39 <sebersole>	in 99% of cases when we read data from a result set it is safe to assume it is persistent data (as in released from dbs transaction log)
Aug 31 11:39:13 <sebersole>	well in your first use case thats simply not a concern
Aug 31 11:41:34 <sebersole>	stliu_: did you try this first case with generated properties?
Aug 31 11:41:48 <stliu_>	yes, i did
Aug 31 11:41:53 <sebersole>	and?
Aug 31 11:42:13 <stliu_>	the current test case i'm using is with a generated property
Aug 31 11:42:14 <stliu_>	@Generated(GenerationTime.ALWAYS)
Aug 31 11:42:14 <stliu_>		@Column(columnDefinition="bigint default '1'")
Aug 31 11:42:14 <stliu_>		private Long price;
Aug 31 11:42:33 <stliu_>	same behavior
Aug 31 11:47:37 <sebersole>	so probably these 2 cases (refresh and generation) need to be handled specially in terms of putting into the cache
Aug 31 11:48:24 <sebersole>	though, tbh, i dont see how property-generation does the same incorrect thing
Aug 31 11:49:19 <sebersole>	it simply reads the generated state and rebuilds the "state to cache" prior to ever calling the cache
Aug 31 11:54:43 <stliu_>	yes, but when doing it with refresh, internally it uses twophaseload to load the entity, and within this class, it performs the caching
Aug 31 11:55:38 <sebersole>	[22:42] <stliu_> same behavior
Aug 31 11:55:52 <sebersole>	i asked if property generation worked or failed
Aug 31 11:55:59 <sebersole>	you said it had the same behavior
Aug 31 11:56:25 <stliu_>	the property generation worked but the test case fails
Aug 31 11:56:39 <sebersole>	um, huh
Aug 31 11:57:20 <sebersole>	does property generation suffer from this same thing?
Aug 31 11:58:00 <stliu_>	no, actually not much relates to property generation
Aug 31 11:58:39 <sebersole>	i have no idea what that means
Aug 31 11:59:44 <stliu_>	i know the refresh is for property generation
Aug 31 12:01:06 <stliu_>	this bug probably does not relate to the property generation code
Aug 31 12:01:34 <stliu_>	like you said, the data is not permanent yet, see case 2
Aug 31 12:01:43 <sebersole>	stliu_: look at it like this...
Aug 31 12:02:06 <sebersole>	your first use case is really a property generation use case
Aug 31 12:02:27 <sebersole>	and property generation works as designed and as it is supposed to
Aug 31 12:02:31 <sebersole>	so...
Aug 31 12:02:52 <sebersole>	doesnt it make sense to have this use case operate more like that?
Aug 31 12:03:14 <sebersole>	i mean thats just a high level swag
Aug 31 12:03:28 <sebersole>	but that seems pretty logical/reasonable right?
Aug 31 12:03:33 <stliu_>	yes
Aug 31 12:04:34 <sebersole>	the problem there is that we lose the "context" there on the entity entry
Aug 31 12:05:05 <sebersole>	such that when we come back into it for the refresh we have no idea that we just inserted or updated it in this txn
Aug 31 12:05:36 <sebersole>	to me, thats the issue
Aug 31 12:05:43 <stliu_>	yes, agree
Aug 31 12:06:10 <sebersole>	delaying putting reads from the database into the cache until commit is not a resolution
Aug 31 12:06:32 <sebersole>	anyway, all that is for the refresh case
Aug 31 12:06:42 <sebersole>	the clear case is different
Aug 31 12:07:12 <sebersole>	tbh, not sure what we would do there
Aug 31 12:07:19 <stliu_>	yes, i think you're right :)
Aug 31 12:07:37 <stliu_>	that will make lots of changes, right
Aug 31 12:07:51 <sebersole>	stliu_: depends on how it is done


Aug 31 12:09:43 <sebersole>	even if we fixed it, property generation is far more efficient
Aug 31 12:10:01 <sebersole>	i mean sounds like bitching just for the sake of bitching
Aug 31 12:10:54 <sebersole>	i can get to it late wed or into thursday
Aug 31 12:11:16 <sebersole>	but really see above
Aug 31 12:11:25 <stliu_>	okay, thanks
Aug 31 12:12:47 <sebersole>	stliu_: i mean as of now, perhaps the only *real* solution is to invalidate the corresponding second level cache entries on clear

Aug 31 12:17:16 <sebersole>	if you clear the session of entities that were inserted or updated during this transaction we force the cache entries to invalidate after transaction
Aug 31 12:17:41 <sebersole>	on fail, that is
Aug 31 12:17:50 <sebersole>	since we simply do not know
Aug 31 12:18:50 <sebersole>	eventually i can see a "gatekeeper" for the second level cache
Aug 31 12:19:17 <sebersole>	one per session i mean
Aug 31 12:20:16 <sebersole>	it could keep track of keys added for various operations
Aug 31 12:20:27 <sebersole>	and on rollback do some clean up
Aug 31 12:20:46 <stliu_>	will it cause other concurrent sessions see the dirty data before the *insert* session's tx complete/rollback?
Aug 31 12:21:08 <sebersole>	it wont do anything different than what it does right now
Aug 31 12:21:19 <sebersole>	in terms of putting stuff into the cache
Aug 31 12:21:52 <sebersole>	but for example, when we ask the gatekeeper to delegate an insert...
Aug 31 12:22:07 <sebersole>	org.hibernate.cache.access.EntityRegionAccessStrategy#insert
Aug 31 12:22:19 <sebersole>	it'll keep track of that key
Aug 31 12:22:33 <sebersole>	and the fact that it was inserted during the current transaction
Aug 31 12:23:54 <sebersole>	when it sees the "put from read" it can see that the same key was also added via insert previously
Aug 31 12:24:07 <sebersole>	really its just doing some journaling
Aug 31 12:24:11 <sebersole>	so that...
Aug 31 12:24:32 <sebersole>	if/when the txn rolls back we know that we need to invalidate said key
Aug 31 12:25:27 <sebersole>	another option is to make it smart in terms of how and when it releases the cache puts
Aug 31 12:25:55 <sebersole>	aka in the above case, we already have the cache entry in place from the insert
Aug 31 12:26:16 <sebersole>	we could delay the put from read until successful transaction completion
Aug 31 12:26:35 <sebersole>	stliu_: make sense?
Aug 31 12:26:40 <stliu_>	this is what i was looking for?
Aug 31 12:26:47 *	?// :Unknown command
Aug 31 12:26:55 <stliu_>	s/?//
Aug 31 12:26:55 <sebersole>	no you wanted to delay all puts
Aug 31 12:27:04 <sebersole>	oh ok
Aug 31 12:27:25 <sebersole>	but yes, that is a lot of work
Aug 31 12:27:32 <sebersole>	either way
Aug 31 12:28:47 <sebersole>	anyway, off to bed
Aug 31 12:28:51 <sebersole>	night all
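
The per-session "gatekeeper" journaling idea from the end of the log could look roughly like the sketch below. This is only an illustration under stated assumptions: `SimpleRegion` and `CacheGatekeeper` are hypothetical names invented here, the region is a plain in-memory map rather than a real cache provider, and none of this is Hibernate's actual `EntityRegionAccessStrategy` contract. Puts go through to the shared region exactly as before; the gatekeeper only remembers which keys were inserted in the current transaction so they can be invalidated on rollback.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a shared second-level cache region (not a Hibernate class).
class SimpleRegion {
    private final Map<Object, Object> entries = new ConcurrentHashMap<>();

    void put(Object key, Object value) { entries.put(key, value); }
    Object get(Object key) { return entries.get(key); }
    void evict(Object key) { entries.remove(key); }
    int size() { return entries.size(); }
}

// Hypothetical per-session gatekeeper: delegates to the region and keeps a
// journal of the keys inserted during the current transaction.
class CacheGatekeeper {
    private final SimpleRegion region;
    private final Set<Object> insertedInTxn = new HashSet<>();

    CacheGatekeeper(SimpleRegion region) { this.region = region; }

    // Called when the session inserts an entity: journal the key only.
    void insert(Object key) { insertedInTxn.add(key); }

    // Called when state read from the database is cached (e.g. on refresh).
    // The put itself is unchanged; the journal already knows whether this key
    // belongs to a row that is not yet committed.
    void putFromRead(Object key, Object value) { region.put(key, value); }

    boolean wasInsertedInTxn(Object key) { return insertedInTxn.contains(key); }

    // On rollback, invalidate every key inserted in this transaction;
    // in both cases the journal is cleared for the next transaction.
    void afterTransactionCompletion(boolean success) {
        if (!success) {
            for (Object key : insertedInTxn) {
                region.evict(key);
            }
        }
        insertedInTxn.clear();
    }
}

class GatekeeperDemo {
    public static void main(String[] args) {
        SimpleRegion region = new SimpleRegion();
        CacheGatekeeper gatekeeper = new CacheGatekeeper(region);

        gatekeeper.insert(1L);                 // save + flush
        gatekeeper.putFromRead(1L, "stliu");   // refresh caches the uncommitted row
        gatekeeper.afterTransactionCompletion(false); // rollback

        System.out.println("entries after rollback: " + region.size());
    }
}
```

Note that this sketch only cleans up after the fact: other sessions can still see the entry between the put and the rollback, which matches the remark in the log that the gatekeeper would not change when data is put into the cache.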


> dirty data be inserted into 2L cache 
> -------------------------------------
>
>                 Key: HHH-5490
>                 URL: http://opensource.atlassian.com/projects/hibernate/browse/HHH-5490
>             Project: Hibernate Core
>          Issue Type: Bug
>          Components: caching (L2)
>    Affects Versions: 3.5.5, 3.6.0.Beta3
>            Reporter: Strong Liu
>            Assignee: Steve Ebersole
>         Attachments: Item.java, ItemTest.java
>
>
> {code}
> 	public void testInsertWithRefresh() {
> 		getSessions().getCache().evictEntityRegions();
> 		getSessions().getStatistics().clear();
> 		
> 		Session s = openSession();
> 		s.beginTransaction();
> 		Item item = new Item();
> 		item.setName("stliu");
> 		s.save(item);
> 		s.flush();
> 		s.refresh(item);
> 		s.getTransaction().rollback();
> 		s.close();
> 		
> 		Map cacheMap = getSessions().getStatistics()
> 				.getSecondLevelCacheStatistics("item").getEntries();
> 		assertEquals(0, cacheMap.size());
> 		
> 		s = openSession();
> 		s.beginTransaction();
> 		item = (Item)s.get(Item.class, item.getId());
> 		s.getTransaction().commit();
> 		s.close();
> 		
> 		assertNull("it should be null", item);
> 	}
> {code}
> See the test case above: since the insertion is rolled back, the row does not exist in the DB, but the null assertion fails because dirty data was inserted into the 2L cache by the refresh operation.
