[
https://issues.jboss.org/browse/TEIID-3219?page=com.atlassian.jira.plugin...
]
Kylin Soong edited comment on TEIID-3219 at 11/20/14 4:45 AM:
--------------------------------------------------------------
I have done some investigation, [1] are the sample sheets I used for testing. The poi
parse the excel sheet will trim all null rows both bottom and top, only keep non-null rows
as sheet data, if .xlsx file contains any of the formatting for the blank cells, the poi
reading is not treating it as null, like test.xlsx in [1].
take any document, just scroll empty doc and save and test.
I have tested both in windows with Excel 2007 and linux with OpenOffice, only if you
scroll cell has style then it may cause empty iterations, like test.xlsx in [1], I have
add style left-justified horizontal alignment, the poi reading isn't thinking it not
null, all 20 rows be loaded, the cell with null value but have style as below:
~~~
<main:sheetData>
<main:row r="1" spans="1:4">
<main:c r="A1" s="1"/>
<main:c r="B1" s="1"/>
<main:c r="C1" s="1"/>
<main:c r="D1" s="1"/>
</main:row>
~~~
If you scroll a fresh empty sheet, you can't see that behavior.
the tricky thing in this test document is, there are many cells, but
table is picking up that single cell as column
Assuming a sheet have 10 rows, each row have multiple cells, but all cells in 5th row are
null, in the condition, the current excel tranlator only retun the first 4 rows, start
from 5th row be discarded, actually we expected result is all 10 rows be returned and all
colmuns in row 5 are null, this is what we should resolve in this issue.
So far, I think current fix is fine, no need submitting another patch for regression
concern. The current tranlator use poi usermodel api, all rows be loaded to heap then
tranlator interator all rows, from tranlator side, the last row should be
sheet.getLastRowNum(), no need to determine which is the last.
[1]
https://github.com/kylinsoong/data/tree/master/poi-examples-teiid/file
was (Author: kylin):
I have done some investigation, [1] are the sample sheets I used for testing. The poi
parse the excel sheet will trim all null rows both bottom and top, only keep non-null rows
as sheet data, if .xlsx file contains any of the formatting for the blank cells, the poi
reading is not treating it as null, like test.xlsx in [1].
take any document, just scroll empty doc and save and test.
I have tested both in windows with Excel 2007 and linux with OpenOffice, only if you
scroll cell has style then it may cause empty iterations, like test.xlsx in [1], I have
add style left-justified horizontal alignment, the poi reading isn't thinking it not
null, all 20 rows be loaded, the cell with null value but have style as below:
~~~
<main:sheetData>
<main:row r="1" spans="1:4">
<main:c r="A1" s="1"/>
<main:c r="B1" s="1"/>
<main:c r="C1" s="1"/>
<main:c r="D1" s="1"/>
</main:row>
~~~
If you scroll a fresh empty sheet, you can't see that behavior.
I have done some investigation, [1] are the sample sheets I used for
testing. The poi parse the excel sheet will trim all null rows both bottom and top, only
keep non-null rows as sheet data, if .xlsx file contains any of the formatting for the
blank cells, the poi reading is not treating it as null, like test.xlsx in [1].
take any document, just scroll empty doc and save and test.
I
have tested both in windows with Excel 2007 and linux with OpenOffice, only if you scroll
cell has style then it may cause empty iterations, like test.xlsx in [1], I have add style
left-justified horizontal alignment, the poi reading isn't thinking it not null, all
20 rows be loaded, the cell with null value but have style as below:
~~~
<main:sheetData>
<main:row r="1" spans="1:4">
<main:c r="A1" s="1"/>
<main:c r="B1" s="1"/>
<main:c r="C1" s="1"/>
<main:c r="D1" s="1"/>
</main:row>
~~~
If you scroll a fresh empty sheet, you can't see that behavior.
the tricky thing in this test document is, there are many cells, but
table is picking up that single cell as column
Assuming a sheet have 10 rows, each row have multiple cells, but all cells in 5th row are
null, in the condition, the current excel tranlator only retun the first 4 rows, start
from 5th row be discarded, actually we expected result is all 10 rows be returned and all
colmuns in row 5 are null, this is what we should resolve in this issue.
[1]
https://github.com/kylinsoong/data/tree/master/poi-examples-teiid/file
Excel Translator Wrong Result From Columns With Interleaved Null
Values
-----------------------------------------------------------------------
Key: TEIID-3219
URL:
https://issues.jboss.org/browse/TEIID-3219
Project: Teiid
Issue Type: Bug
Components: Misc. Connectors
Affects Versions: 8.7.1
Reporter: Jan Stastny
Assignee: Kylin Soong
Fix For: 8.7.1, 8.10
Attachments: artifacts.zip
When querying a column which has a null value somewhere in it, the result set is missing
all the rows under the null value in the xlsx file. When querying more columns, the result
set is missing some values only if there is a row, where both columns contains null value.
--
This message was sent by Atlassian JIRA
(v6.3.8#6338)