[JBoss JIRA] Created: (TEIID-1673) Co-ordinate Materialization Table loads across the cluster
by Ramesh Reddy (JIRA)
Co-ordinate Materialization Table loads across the cluster
----------------------------------------------------------
Key: TEIID-1673
URL: https://issues.jboss.org/browse/TEIID-1673
Project: Teiid
Issue Type: Enhancement
Components: Server
Affects Versions: 7.4
Reporter: Ramesh Reddy
Assignee: Steven Hawkins
This is the conversation captured between rareddy, shawkins, and sbrooks about the refreshMatView system procedure: how it currently behaves and the proposed modification.
(10:56:48 AM) rareddy: sbrooks: I read through the code; I understand how it works now;
(10:58:26 AM) rareddy: sbrooks: when invalidate == true; the distributed cache is set "invalidate", so all nodes have "invalidate" state. However, the node that received the "refreshMatView" continues on to load the new contents
(10:59:04 AM) rareddy: sbrooks: if the other nodes look for the refreshed data before the load finishes, they get blocked.
(10:59:29 AM) rareddy: sbrooks: if the refresh finishes, then they get the new data
(11:01:06 AM) rareddy: sbrooks: if invalidate == false, then the distributed cache does not get changed, and all nodes will keep serving the old data until the new data is refreshed; once the new data is available, it is timestamped with the load time.
(11:12:42 AM) rareddy: sbrooks: now, how the other nodes behave after the load, I cannot seem to determine correctly. I think they just refresh the data from the other node; maybe shawkins can confirm
(11:17:34 AM) shawkins: rareddy: with invalidate false when the load finishes, then all nodes should pick up the updated state based upon comparing the local timestamp to the remote (see registerQuery in TempTableDataManager)
(11:21:10 AM) shawkins: rareddy: with invalidate true, things are more complicated. the remote nodes are not coordinated with respect to the load. there's a comment to that effect "//TODO: coordinate a distributed load". so with invalidate true you may load the data multiple times from the source for each node that is queried during the load in an invalid state.
(11:24:28 AM) rareddy: shawkins: but in the invalidate = true case, I see the update to the distributed cache key, would that not everybody to start loading if they see the state as loading?
(11:25:13 AM) rareddy: shawkins: let me re-phrase
(11:27:05 AM) rareddy: shawkins: since the distributed cache is invalidated, the other nodes see that the cache is being loaded at another node; can they not hold off on doing their own load?
(11:29:00 AM) shawkins: rareddy: it's not a question of whether the load could be coordinated. it was implemented without coordination for simplicity
(11:30:04 AM) rareddy: shawkins: so, we should really recommend the "invalidate=false" until 5.2 to keep the data in sync then would you agree?
(11:30:13 AM) shawkins: rareddy: one possible path would be to expose jbosscache distributed node locking
(11:30:27 AM) shawkins: rareddy: no. 5.2 would not be any different
(11:32:18 AM) rareddy: shawkins: so what is the value prop for the invalidate=true? immediate invalidation?
(11:32:52 AM) shawkins: rareddy: correct you ensure that stale values are no longer used
(11:33:32 AM) rareddy: shawkins: but in the same case, it serves that same stale data until the load is finished
(11:33:49 AM) rareddy: or does it block?
(11:33:56 AM) shawkins: rareddy: ?
(11:34:45 AM) rareddy: shawkins: when I issue a refresh with invalidate = true and the load takes, let's say, 5 min, then issue a query during that interval, does the query block?
(11:35:11 AM) shawkins: rareddy: on the same node it blocks, on a remote node it will initiate another load
(11:37:24 AM) rareddy: shawkins: ah! ok. so going back to my earlier question, using invalidate=false is the best way for them to keep the data in sync
(11:39:05 AM) shawkins: rareddy: if you are ok with stale data, then yes. however the load is still not coordinated. if another node is issued a refresh during the load, then it too will attempt a load
(11:41:06 AM) sbrooks: rareddy: My test is that the refreshes can be controlled, so I will load a cache onto two nodes, then refresh one with invalidate=false; after it's done I will query the second node to see which cache it returns
(11:41:34 AM) rareddy: shawkins: so to avoid getting "out of sync", if they issue "invalidate=false" on one node and then deal with the staleness factor, it should keep the results in sync
(11:43:40 AM) sbrooks: rareddy: and they should be fine with the staleness factor so all nodes continue to return results. The only thing that needs to happen is once the refresh completes on one node, all nodes should not use the older version for any future queries
(11:47:19 AM) rareddy: shawkins: based on the feedback sbrooks is giving on this subject, do you think we should pursue the "co-ordinated" loads then?
(11:47:54 AM) shawkins: rareddy: that's why there's a TODO
(11:48:01 AM) rareddy: shawkins: maybe for 5.2?
(11:48:43 AM) rareddy: shawkins: cool, let's see what we can do, then
(11:48:54 AM) shawkins: rareddy: we have more implementation options in 5.2 as we can expose jgroups functionality directly
(11:50:40 AM) rareddy: shawkins: you mean for co-ordination based on events?
(11:52:12 AM) shawkins: rareddy: or jgroups distributed locks
(11:52:30 AM) shawkins: rareddy: whatever makes the most sense
(11:53:23 AM) rareddy: shawkins: ok, I do not know much about either one; do we have any JIRA to cover this? I can enter one
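The coordination discussed above could look roughly like the following. This is only a minimal sketch, not Teiid code: the class and method names are hypothetical, a plain in-JVM ReentrantLock stands in for whatever cluster-wide lock (for example a jgroups lock) would actually be used, and threads stand in for cluster nodes. The point it illustrates is that a node which finds the cache invalid waits for the lock and then re-checks the load version, so the source is hit once instead of once per node.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class CoordinatedLoadSketch {
    // Assumption: stand-in for a cluster-wide lock; a real cluster needs a distributed lock.
    private final ReentrantLock clusterLock = new ReentrantLock();
    // Assumption: stand-in for the shared cache state (version of the last completed load).
    private volatile long loadedVersion = 0;
    // Counts how many times the (hypothetical) source is actually queried.
    final AtomicInteger sourceLoads = new AtomicInteger();

    /** Called by a node whose local copy is at staleVersion and is known to be invalid. */
    long ensureLoaded(long staleVersion) {
        clusterLock.lock();
        try {
            if (loadedVersion > staleVersion) {
                // Another node finished a load while we waited; reuse its result.
                return loadedVersion;
            }
            sourceLoads.incrementAndGet(); // the expensive load from the source would go here
            loadedVersion = staleVersion + 1;
            return loadedVersion;
        } finally {
            clusterLock.unlock();
        }
    }

    /** Simulates several nodes that all see the invalid state at the same time. */
    static int simulate(int nodes) {
        CoordinatedLoadSketch cache = new CoordinatedLoadSketch();
        ExecutorService pool = Executors.newFixedThreadPool(nodes);
        for (int i = 0; i < nodes; i++) {
            pool.execute(() -> cache.ensureLoaded(0));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return cache.sourceLoads.get();
    }

    public static void main(String[] args) {
        System.out.println("source loads: " + simulate(4)); // prints "source loads: 1"
    }
}
```

Without the re-check inside the lock, each of the four simulated nodes would perform its own load, which matches the uncoordinated behavior shawkins describes for invalidate=true.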
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
12 years, 10 months
[JBoss JIRA] Created: (TEIID-1683) Error Code:0 Message: Remote org.teiid.core.TeiidException:Error Code: 0 Message:Symbol EMAIL.EMAILDOCUMENT.MAPPINGCLASSES.EMAIL_1.id is specified with an unknown group context
by Caleb Corliss (JIRA)
Error Code:0 Message: Remote org.teiid.core.TeiidException:Error Code: 0 Message:Symbol EMAIL.EMAILDOCUMENT.MAPPINGCLASSES.EMAIL_1.id is specified with an unknown group context
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Key: TEIID-1683
URL: https://issues.jboss.org/browse/TEIID-1683
Project: Teiid
Issue Type: Bug
Affects Versions: 7.4
Reporter: Caleb Corliss
Assignee: Steven Hawkins
Teiid 7.4 encounters an issue when you run a query that requires multiple joins against a Teiid database mapped via an XML document. We are able to successfully query the VDB and return the XML if we map a single join; however, when we attempt to map a second join table we receive the error quoted below.
A little background:
We have an email schema with a variety of attributes and elements (simple and complex). Our schema looks similar to this:
<code>
<element name="Email" type="email" />
<complex-type name="email">
<sequence>
<element name="sent" type="dateTime" />
<element name="subject" type="string" />
<element name="address" type="emailAddress" maxOccurs="unbounded" />
<element name="body" type="string" maxOccurs="unbounded" />
</sequence>
<attribute name="id" type="long" use="required" />
</complex-type>
<complex-type name="emailAddress">
<sequence>
<element name="address" type="emailAddress"/>
<element name="type" type="string" />
</sequence>
</complex-type>
</code>
We are able to successfully build the VDB, mapping it from the "email" element of the schema. The id from the email object is used from the INPUTS to map the bodies and emailAddress tables to the email document. The XML document is mapped to 3 PostgreSQL tables (email, address, and body). We are able to run queries that have filter criteria for two tables (the primary and one other); however, when we try to place criteria against the third we get the following TeiidException:
"Error Code:0 Message: Remote org.teiid.core.TeiidException:Error Code: 0 Message:Symbol EMAIL.EMAILDOCUMENT.MAPPINGCLASSES.EMAIL_1.id is specified with an unknown group context"
For example, the following "mock" queries work just fine:
<code>SELECT * FROM emailDocument WHERE subject like '%' OR body like '%'</code>
<code>SELECT * FROM emailDocument WHERE subject like '%' OR emailAddress.address like '%'</code>
However, this query fails and throws the exception specified above:
<code>SELECT * FROM emailDocument WHERE subject like '%' OR body like '%' OR emailAddress.address like '%'</code>
[JBoss JIRA] Created: (TEIID-1652) ODBC Data Row results should be batched
by Ramesh Reddy (JIRA)
ODBC Data Row results should be batched
---------------------------------------
Key: TEIID-1652
URL: https://issues.jboss.org/browse/TEIID-1652
Project: Teiid
Issue Type: Enhancement
Components: ODBC
Affects Versions: 7.1.1
Reporter: Ramesh Reddy
Assignee: Ramesh Reddy
Fix For: 7.4.1, 7.5, 7.1.1
Teiid currently writes one row at a time onto the wire when sending ODBC data rows. Since the ODBC driver is capable of reading a stream of rows, it would perform better to batch multiple rows into a single buffer before they are written to the wire. This should reduce network fragmentation, result in fewer round trips, and help with network latency.
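To illustrate the idea, here is a minimal sketch of buffering rows before they reach the wire. It is not the Teiid ODBC implementation: RowBatcher, CountingStream, the 10-byte rows, and the 256-byte threshold are all hypothetical, and the "wire" is just a stream that counts how many write calls it receives.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class RowBatchSketch {
    /** Hypothetical "wire" that counts how many write calls reach it. */
    static class CountingStream extends OutputStream {
        int writes = 0;
        @Override public void write(int b) { writes++; }
        @Override public void write(byte[] b, int off, int len) { writes++; }
    }

    /** Accumulates encoded rows and writes them to the wire in one call per batch. */
    static class RowBatcher {
        private final OutputStream wire;
        private final int maxBytes; // assumed flush threshold
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        RowBatcher(OutputStream wire, int maxBytes) {
            this.wire = wire;
            this.maxBytes = maxBytes;
        }

        void addRow(byte[] row) throws IOException {
            buffer.write(row, 0, row.length);
            if (buffer.size() >= maxBytes) {
                flush();
            }
        }

        void flush() throws IOException {
            if (buffer.size() > 0) {
                buffer.writeTo(wire); // a single write call for the whole batch
                buffer.reset();
            }
        }
    }

    /** Sends `rows` rows of 10 bytes each and returns the number of wire writes. */
    static int writesFor(int rows, int maxBytes) {
        CountingStream wire = new CountingStream();
        RowBatcher batcher = new RowBatcher(wire, maxBytes);
        try {
            for (int i = 0; i < rows; i++) {
                batcher.addRow(new byte[10]);
            }
            batcher.flush(); // send whatever is left at the end of the result
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return wire.writes;
    }

    public static void main(String[] args) {
        System.out.println(writesFor(100, 1));   // 100: effectively one write per row
        System.out.println(writesFor(100, 256)); // 4: rows are grouped into batches
    }
}
```

The final flush matters: without it, the rows left in the buffer at the end of a result would never be sent.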