Per Ramesh's suggestion, I am posting the conversation we had on this topic.




Sridhar wrote...

Hi Ramesh

Great blog! Great concept! However, I have a few questions.

I am really curious about performance. In any data integration project, we frequently run into bottlenecks due to the volume of data. I'd like to know how Teiid addresses this area. Do you have a blog post on this topic?

One other question: assume that there are 2 Oracle database servers on which data resides and needs to be integrated. What additional value does Teiid provide compared to using Oracle dblinks? I understand that the biggest value add of the Teiid server is integrating disparate data sources (Oracle, MySQL, SQL Server, etc.). But, for argument's sake, let's just take 2 Oracle database servers to integrate.

Please let me know your thoughts.


cheers
Sridhar


Ramesh wrote...

> I am really curious about performance. In any data integration
> project, we frequently run into bottlenecks due to the volume of
> data. I'd like to know how Teiid addresses this area. Do you have a
> blog post on this topic?
>
During integration Teiid uses "batching" of tuple sources. This means
it breaks the results down into multiple sets of manageable size, so
that it can move results of varying size in and out of memory easily
and handle very large result sets.
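
As a rough illustration of the idea (not Teiid's actual implementation),
the sketch below splits a stream of rows into fixed-size batches so that
only one batch is held in memory at a time. The function name `batches`
and the batch size are assumptions made up for this example.

```python
from itertools import islice


def batches(rows, batch_size):
    """Yield lists of at most batch_size rows from any iterable of rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(rows)
    while True:
        # Pull the next batch_size rows; earlier batches can be discarded.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# A generator stands in for a large source result; it is never fully
# materialized in memory.
source = ((i, f"name-{i}") for i in range(10))
sizes = [len(b) for b in batches(source, 4)]
```

Because the consumer pulls one batch at a time, memory use is bounded by
the batch size rather than by the size of the full result.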

I will not say that performance is as good as a query going directly to
the native system, but Teiid goes to great lengths to minimize the
latency introduced by its processing, using techniques such as the
batching above, prepared statements, caching, and join optimizations.
Yes, it is a great blog topic; we will put something out soon.
Meanwhile, I encourage you to test it out.

> One other question: assume that there are 2 Oracle database servers
> on which data resides and needs to be integrated. What additional
> value does Teiid provide compared to using Oracle dblinks? I
> understand that the biggest value add of the Teiid server is
> integrating disparate data sources (Oracle, MySQL, SQL Server, etc.).
> But, for argument's sake, let's just take 2 Oracle database servers
> to integrate.
>
I do not have any experience with Oracle dblinks, so I do not know.

Thanks

Ramesh..


Sridhar wrote...

Ramesh

Thanks for the response. When you say Teiid uses 'batching', do you recommend Teiid for real-time applications, where a query is fired requesting Teiid to fetch data across different databases?

Please keep me updated on your blog posts.

cheers
Sridhar


Ramesh wrote...

Absolutely. Teiid is designed for real-time solutions. If your
application does not need real-time access, then you should look into
ETL tools.

"Batching" does *not* mean batch processing. Batching in Teiid's lingo
means working with a *set* of results. Let's say a table has 1M rows;
we do not load all the rows into memory. Instead:

1) Teiid tries to get only the rows that are needed by the query, by
pushing the criteria into the source system so that it returns as few
rows as possible.

2) Teiid only works with about 2000 rows at a time in memory, so that
there are no out-of-memory issues. Then it swaps these results for the
next set of rows, so it only works on a "batch" of results at a time.
The rest of the results are either stored to disk temporarily or thrown
out if they did not meet the criteria.
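
The two steps can be sketched roughly as follows. This is only an
analogy, not how Teiid is implemented: an in-memory SQLite database
stands in for the source system, the `orders` table and the helper
`fetch_in_batches` are hypothetical, and the spill-to-disk part is not
shown. The WHERE clause is evaluated by the source (step 1), and the
client reads the result in batches of 2000 rows (step 2).

```python
import sqlite3

BATCH_SIZE = 2000  # hypothetical; mirrors the "about 2000 rows" figure


def fetch_in_batches(conn, sql, params=(), batch_size=BATCH_SIZE):
    """Run sql on the source and yield the result one batch at a time."""
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
        while True:
            batch = cur.fetchmany(batch_size)
            if not batch:
                break
            yield batch
    finally:
        cur.close()


# Stand-in source system: an in-memory table with 10,000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (id, amount) VALUES (?, ?)",
    ((i, i % 100) for i in range(10000)),
)

# Step 1: the criteria travel with the query, so the source filters rows.
sql = "SELECT id, amount FROM orders WHERE amount >= ?"

# Step 2: only one batch of rows is held by the client at any time.
total_rows = 0
largest_batch = 0
for batch in fetch_in_batches(conn, sql, (50,)):
    total_rows += len(batch)
    largest_batch = max(largest_batch, len(batch))
```

Here half of the rows match the criteria, so the client sees 5000 rows
in total but never holds more than 2000 of them at once.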

Hope this helps.

Ramesh..

PS: If you do not mind, next time could you post to the user forums?
That way, if any other user has similar questions, it would help them
out, and more people can provide input. Thanks.





On Wed, Sep 2, 2009 at 9:56 AM, Ramesh Reddy <rareddy@redhat.com> wrote:
If you are not subscribed to Teiid's blog check out a new posting at

http://teiid.blogspot.com/

Thanks.

Ramesh..
