[teiid-issues] [JBoss JIRA] (TEIID-2429) Large sort performance

Steven Hawkins (JIRA) jira-events at lists.jboss.org
Thu Mar 21 16:41:42 EDT 2013


     [ https://issues.jboss.org/browse/TEIID-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Hawkins resolved TEIID-2429.
-----------------------------------

    Resolution: Done


Changed the overhead handling and improved the copy into memory from file.  Also switched to more accurate and foolproof tracking of reservations via the CommandContext (ideally we'll move away from the ThreadLocal later, but that would have required significantly more changes). Updated the release notes.

Local testing shows a significant increase in performance for truly large sorts (those making use of disk). However, a significant amount of testing will be needed, and there is a degradation in performance for smaller sorts that only use the primary/secondary cache. This may still need to be addressed: under heavy load the old strategy did not persist batches proactively enough, but under light load the new logic persists batches too quickly (generally this means that the cleaner is running all the time, and we specifically have two eviction queues in the buffer manager).

Also added reverse iteration during the initial sorting phase to better purge/retrieve batches from cache. However, this means that for larger sorts, rows with equal sort keys will not necessarily come back in the same relative order (the sort is not stable).  This is allowed by the spec, but we'll have to see if the apparent performance gain warrants a behavioral change.
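
To illustrate why visiting batches in reverse can cost stability, here is a minimal Java sketch. It is not Teiid's SortUtility code: the class RunMergeStability, the Row type, the fixed run size, and the tie-breaking rule (equal keys go to whichever run was visited first) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from Teiid.
final class RunMergeStability {

    /** A row with a sort key and its position in the original input. */
    static final class Row {
        final int key;
        final int seq;
        Row(int key, int seq) { this.key = key; this.seq = seq; }
    }

    /** Cursor over one sorted run; rank records the order in which the run was visited. */
    private static final class Cursor {
        final List<Row> run;
        final int rank;
        int pos = 0;
        Cursor(List<Row> run, int rank) { this.run = run; this.rank = rank; }
        Row head() { return run.get(pos); }
    }

    /**
     * Sorts by key using fixed-size runs followed by a k-way merge.
     * runSize must be positive. When reverse is true the runs are
     * ranked last-to-first, which is the assumed stand-in for
     * "reverse iteration" over batches.
     */
    static List<Row> sortInRuns(List<Row> input, int runSize, boolean reverse) {
        // Phase 1: cut the input into runs and sort each one (List.sort is stable).
        List<List<Row>> runs = new ArrayList<>();
        for (int i = 0; i < input.size(); i += runSize) {
            int end = Math.min(i + runSize, input.size());
            List<Row> run = new ArrayList<>(input.subList(i, end));
            run.sort(Comparator.comparingInt((Row r) -> r.key));
            runs.add(run);
        }

        // Phase 2: merge. Ties on key are broken by visit rank, so the
        // visiting order decides which of two equal rows is emitted first.
        PriorityQueue<Cursor> heap = new PriorityQueue<>(
                Comparator.comparingInt((Cursor c) -> c.head().key)
                          .thenComparingInt(c -> c.rank));
        for (int i = 0; i < runs.size(); i++) {
            int rank = reverse ? runs.size() - 1 - i : i;
            heap.add(new Cursor(runs.get(i), rank));
        }

        List<Row> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Cursor c = heap.poll();
            out.add(c.head());
            c.pos++;
            if (c.pos < c.run.size()) {
                heap.add(c); // re-insert with its new head row
            }
        }
        return out;
    }
}
```

With input keys 1, 2, 1, 2 (seq 0 to 3) and a run size of 2, the forward variant returns the rows in seq order 0, 2, 1, 3, which preserves the input order of equal keys. The reverse variant returns 2, 0, 3, 1: still correctly sorted by key, but equal keys have swapped places.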

                
> Large sort performance
> ----------------------
>
>                 Key: TEIID-2429
>                 URL: https://issues.jboss.org/browse/TEIID-2429
>             Project: Teiid
>          Issue Type: Quality Risk
>          Components: Query Engine
>    Affects Versions: 7.4
>            Reporter: Steven Hawkins
>            Assignee: Steven Hawkins
>             Fix For: 8.4
>
>
> Large sorts (high data volume, above several hundred thousand rows) experience a disproportionate performance degradation as the data set grows larger.
> This is due to the SortUtility default collection strategy that will create intermediate sort buffers too proactively. 


