Re: [hibernate-dev] HHH-6726 LONG and LONG RAW column types in Oracle
by Łukasz Antoniak
Currently Oracle supports database versions from 10.1 to 11.2 [1]. The
LONG and LONG RAW data types have been deprecated since versions 8 and 8i
(released before September 2000) [2]. Oracle keeps those column types
only for backward compatibility [3].
I tried the following scenario (Oracle 10gR2):
1. Create schema with "hibernate.hbm2ddl.auto" set to "create". The LONG
column is created.
2. Insert some data.
3. Modify Oracle dialect as Gail suggested. Avoid setting
"hibernate.hbm2ddl.auto".
4. Insert some data.
To my surprise the test actually passed :). However, I think that we
cannot guarantee the proper behavior in every situation.
As for performance, ImageType is extracted by calling the
ResultSet.getBytes() method, which fetches all data in one call [4]. I
don't expect a major performance difference when the data is streamed in
a separate call; oracle.jdbc.driver.LongRawAccessor.getBytes also fetches
data by reading the stream.
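To illustrate why streaming should not cost much extra, here is a self-contained sketch (plain JDK code, not the actual driver internals) of draining a stream into a byte[] in one pass, which is essentially what an accessor like LongRawAccessor.getBytes has to do anyway:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamDrain {

    // Reads the whole stream into a byte[], the way a driver-side
    // accessor ultimately has to when getBytes() is called.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[] { 1, 2, 3, 4 };
        byte[] copy = readFully(new ByteArrayInputStream(data));
        System.out.println(copy.length); // prints 4
    }
}
```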
The bug in reading LONG columns affects Oracle JDBC drivers since version 10.2.0.4.
I think that we have to choose between:
- changing Oracle10gDialect. Make a note about it in the migration guide
to 4.0 and update the "5.2.2. Basic value types" chapter in the Hibernate
documentation.
- introducing Oracle11gDialect. It may sound weird to access an Oracle
10g database with an Oracle 11g dialect.
- disabling execution of the Hibernate tests that fail because of this
issue with @SkipForDialect (and maybe developing another version of them
with CLOBs and BLOBs, @RequiresDialect). Hibernate is written correctly
according to "Default Mappings Between SQL Types and Java Types"
(referenced earlier by Gail), and this is more of an issue with Oracle's
JDBC implementation. This option came to my mind, but it's weird :P.
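For illustration, the first option could look roughly like the following. This is only a sketch: the class name is made up, and the registrations just restate Gail's suggested mappings of LONGVARCHAR/LONGVARBINARY to clob/blob.

```java
import java.sql.Types;
import org.hibernate.dialect.Oracle10gDialect;

// Hypothetical subclass illustrating option 1 (changing the dialect).
public class LobMappingOracle10gDialect extends Oracle10gDialect {

    public LobMappingOracle10gDialect() {
        // Map TextType (LONGVARCHAR) to CLOB and ImageType (LONGVARBINARY)
        // to BLOB instead of 'long' / 'long raw', per Gail's suggestion.
        registerColumnType( Types.LONGVARCHAR, "clob" );
        registerColumnType( Types.LONGVARBINARY, "blob" );
    }
}
```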
I would vote for the first option.
Regards,
Lukasz Antoniak
[1]
http://www.oracle.com/us/support/library/lifetime-support-technology-0691...
(page 4)
[2]
http://download.oracle.com/docs/cd/A91202_01/901_doc/server.901/a90120/ch...
[3]
http://download.oracle.com/docs/cd/B28359_01/server.111/b28318/datatype.htm
[4] "Getting a LONG RAW Data Column with getBytes"
http://download.oracle.com/docs/cd/B19306_01/java.102/b14355/jstreams.htm
Strong Liu pisze:
> I think Oracle 11g is the only DB version supported by Oracle. Can we just introduce a new Oracle dialect with the suggested changes and deprecate all other existing Oracle dialects? This won't affect users' apps.
>
> -----------
> Strong Liu <stliu(a)hibernate.org>
> http://hibernate.org
> http://github.com/stliu
>
> On Oct 15, 2011, at 11:14 AM, Scott Marlow wrote:
>
>> How does this impact existing applications? Would they have to convert
>> LONGs to CLOBs (and LONGRAWs to BLOBs) to keep the application working?
>>
>> As far as the advantage of CLOB over TEXT, if you read every character,
>> which one is really faster? I would expect TEXT to be a little faster,
>> since the server side will send the characters before they are asked
>> for. By faster, I mean from the application performance point of view. :)
>>
>> Could this be changed in a custom Oracle dialect? So new
>> applications/databases could perhaps use that and existing applications
>> might use LONGs a bit longer via the existing Oracle dialect.
>>
>> On 10/14/2011 09:22 PM, Gail Badner wrote:
>>> In [1], I am seeing the following type mappings:
>>>
>>> Column type: LONG -> java.sql.Types.LONGVARCHAR -> java.lang.String
>>> Column type: LONGRAW -> java.sql.Types.LONGVARBINARY -> byte[]
>>>
>>> org.hibernate.type.TextType is consistent with the mapping for LONG.
>>>
>>> org.hibernate.type.ImageType is consistent with the mapping for LONGRAW.
>>>
>>> From this standpoint, the current settings are appropriate.
>>>
>>> I understand there are restrictions when LONG and LONGRAW are used and I see from your other message that there is Oracle documentation for migrating to CLOB and BLOB.
>>>
>>> I agree that changing column type registration as follows (for Oracle only) should fix this:
>>> registerColumnType( Types.VARBINARY, 2000, "raw($l)" );
>>> registerColumnType( Types.VARBINARY, "blob" );
>>>
>>> registerColumnType( Types.LONGVARCHAR, "clob" );
>>> registerColumnType( Types.LONGVARBINARY, "blob" );
>>>
>>> registerColumnType( Types.VARCHAR, 4000, "varchar2($l char)" );
>>> registerColumnType( Types.VARCHAR, "clob" );
>>>
>>> Steve, what do you think? Is it too late to make this change for 4.0.0?
>>>
>>> [1] Table 11-1 of Oracle® Database JDBC Developer's Guide and Reference, 11g Release 1 (11.1) (http://download.oracle.com/docs/cd/B28359_01/java.111/b31224/datacc.htm#g...)
>>> [2] Hibernate Core Migration Guide for 3.5 (http://community.jboss.org/wiki/HibernateCoreMigrationGuide35)
>>> [3] Table 2-10 of Oracle® Database SQL Language Reference
>>> 11g Release 1 (11.1) (http://download.oracle.com/docs/cd/B28359_01/server.111/b28286/sql_elemen...)
>>>
>>> ----- Original Message -----
>>>> From: "Łukasz Antoniak"<lukasz.antoniak(a)gmail.com>
>>>> To: hibernate-dev(a)lists.jboss.org
>>>> Sent: Thursday, October 13, 2011 12:50:13 PM
>>>> Subject: [hibernate-dev] HHH-6726 LONG and LONG RAW column types in Oracle
>>>>
>>>> Welcome Community!
>>>>
>>>> I have just subscribed to the list and wanted to discuss HHH-6726
>>>> JIRA
>>>> issue.
>>>>
>>>> Gail Badner wrote
>>>> (http://lists.jboss.org/pipermail/hibernate-dev/2011-October/007208.html):
>>>> HHH-6726 (Oracle : map TextType to clob and ImageType to blob)
>>>> https://hibernate.onjira.com/browse/HHH-6726
>>>> There have been a number of issues opened since the change was made
>>>> to map TextType (LONGVARCHAR) to 'long' and ImageType (LONGVARBINARY)
>>>> to 'long raw'. This change was already documented in the migration
>>>> notes. Should the mapping for Oracle (only) be changed back to clob
>>>> and blob?
>>>>
>>>> HHH-6726 is caused by an issue in the Oracle JDBC driver (version
>>>> 10.2.0.4 and later). The bug appears when LONG or LONG RAW columns
>>>> are accessed neither first nor last while processing a SQL statement.
>>>>
>>>> I have discussed the topic of mapping TextType to CLOB and ImageType
>>>> to BLOB (only in the Oracle dialect) with Strong Liu. Reasons for
>>>> doing so:
>>>> - Oracle allows only one LONG / LONG RAW column per table. This
>>>> might be the most important reason from Hibernate's perspective.
>>>> - LONG / LONG RAW hold up to 2 GB; BLOB / CLOB hold up to 4 GB.
>>>> - In PL/SQL, using LOBs is more efficient (random access to data);
>>>> LONG allows only sequential access.
>>>> - LONG and LONG RAW are deprecated.
>>>>
>>>> What is your opinion?
>>>>
>>>> Regards,
>>>> Lukasz Antoniak
>>>> _______________________________________________
>>>> hibernate-dev mailing list
>>>> hibernate-dev(a)lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/hibernate-dev
>>>>
>
Query handling : Antlr 3 versus Antlr 4
by Steve Ebersole
As most of you know already, we are planning to redesign the current
Antlr-based HQL/JPQL parser in ORM for a variety of reasons.
The current approach in the translator (Antlr 2 based, although Antlr 3
supports the same model) is that we actually define multiple
grammars/parsers which progressively re-write the tree adding more and more
semantic information; think of this as multiple passes or phases. The
current code has 3 phases:
1) parsing - we simply parse the HQL/JPQL query into an AST, although we do
do one interesting (and uber-important!) re-write here where we "hoist" the
from clause in front of all other clauses.
2) rough semantic analysis - the current code, to be honest, sucks here.
The end result of this phase is a tree that mixes normalized semantic
information with lots of SQL fragments. It is extremely fugly
3) rendering to SQL
The idea of phases is still the best way to attack this translation imo. I
just think we did not implement the phases very well before; we were just
learning Antlr at the time. So part of the redesign here is to leverage
our better understanding of Antlr and design some better trees. The other
big reason is to centralize the generation of SQL into one place rather
than the 3 different places we do it today (not to mention the many, many
places we render SQL fragments).
Part of the process here is to decide which parser to use. Antlr 2 is
ancient :) I used Antlr 3 in the initial prototyping of this redesign
because it was the most recent release at that time. In the interim Antlr
4 has been released.
I have been evaluating whether Antlr 4 is appropriate for our needs there.
Antlr 4 is a pretty big conceptual deviation from Antlr 2/3 in quite a few
ways. Generally speaking, Antlr 4 is geared more towards interpreting
rather than translating/transforming. It can handle "transformation" if
the transformation is the final step in the process. Transformation is
where tree re-writing comes in handy.
First let's step back and look at the "conceptual model" of Antlr 4. The
grammar is used to produce:
1) the parser - takes the input and builds a "parse tree" based on the
rules of the lexer and grammar.
2) listener/visitor for parse-tree traversal - can optionally generate
listeners or visitors (or both) for traversing the parse tree (output from
parser).
There are 2 highly-related changes that negatively impact us:
1) no tree grammars/parsers
2) no tree re-writing
Our existing translator is fundamentally built on the concepts of tree
parsers and tree re-writing. Even the initial prototypes for the redesign
(and the current state of hql-parser which Sanne and Gunnar picked up from
there) are built on those concepts. So moving to Antlr 4 in that regard
does represent a risk. How big of a risk, and whether that risk is worth
it, is what we need to determine.
What does all this mean in simple, practical terms? Let's look at a simple
query: "select c.headquarters.state.code from Company c". Simple syntactic
analysis will produce a tree something like:
[QUERY]
    [SELECT]
        [DOT]
            [DOT]
                [DOT]
                    [IDENT, "c"]
                    [IDENT, "headquarters"]
                [IDENT, "state"]
            [IDENT, "code"]
    [FROM]
        [SPACE]
            [SPACE_ROOT]
                [IDENT, "Customer"]
                [IDENT, "c"]
There is not a lot of semantic (meaning) information here. A more semantic
representation of the query would look something like:
[QUERY]
    [SELECT]
        [ATTRIBUTE_REF]
            [ALIAS_REF, "<gen:1>"]
            [IDENT, "code"]
    [FROM]
        [SPACE]
            [PERSISTER_REF]
                [ENTITY_NAME, "com.acme.Customer"]
                [ALIAS, "c"]
                [JOIN]
                    [INNER]
                    [ATTRIBUTE_JOIN]
                        [IDENT, "headquarters"]
                        [ALIAS, "<gen:0>"]
                [JOIN]
                    [INNER]
                    [ATTRIBUTE_JOIN]
                        [IDENT, "state"]
                        [ALIAS, "<gen:1>"]
Notice especially the difference in the tree rules. This is tree
re-writing, and is the major difference affecting us. Consider a specific
thing like the "c.headquarters.state.code" DOT-IDENT sequence. Essentially
Antlr 4 would make us deal with that as a DOT-IDENT sequence through all
the phases - even SQL generation. Quite fugly. The intent of Antlr 4 in
cases like this is to build up an external state table (external to the
tree itself) or what Antlr folks typically refer to as "iterative tree
decoration"[1]. So with Antlr 4, in generating the SQL, we would still be
handling calls in terms of "c.headquarters.state.code" in the SELECT clause
and resolving that through the external symbol tables. Again, with Antlr 4
we would always be walking that initial (non-semantic) tree. Unless I am
missing something. I would be happy to be corrected, if anyone knows Antlr
4 better. I have also asked as part of the antlr-discussion group[2].
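To make "iterative tree decoration" concrete, here is a tiny self-contained sketch of the idea as I understand it (the Node class and the resolution strings are invented for illustration): the parse tree stays untouched, and semantic resolutions live in an external table keyed by node, which later phases consult instead of re-written tree nodes.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class TreeDecoration {

    // Minimal stand-in for a parse-tree node; never mutated after parsing.
    static class Node {
        final String text;
        final List<Node> children = new ArrayList<>();
        Node(String text) { this.text = text; }
        Node add(Node child) { children.add(child); return this; }
    }

    // Builds the DOT-IDENT chain for "c.headquarters.state.code" and
    // decorates it externally rather than re-writing it.
    static String resolveSelection() {
        Node c = new Node("c");
        Node dot1 = new Node(".").add(c).add(new Node("headquarters"));
        Node dot2 = new Node(".").add(dot1).add(new Node("state"));
        Node dot3 = new Node(".").add(dot2).add(new Node("code"));

        // External "decoration": resolved meaning per node, keyed by
        // identity; the tree itself is left as the parser produced it.
        Map<Node, String> resolutions = new IdentityHashMap<>();
        resolutions.put(c, "alias-ref(c)");
        resolutions.put(dot1, "join(<gen:0>, headquarters)");
        resolutions.put(dot2, "join(<gen:1>, state)");
        resolutions.put(dot3, "attribute-ref(<gen:1>.code)");

        // SQL generation would consult the table, not the raw tree.
        return resolutions.get(dot3);
    }

    public static void main(String[] args) {
        System.out.println(resolveSelection());
    }
}
```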
In my opinion though, if it comes down to us needing to walk the tree in
that first form across all phases I just do not see the benefit to moving
to Antlr 4.
P.S. When I say SQL above I really just mean the target query language for
the back-end data store whether that be SQL targeting a RDBMS for ORM or a
NoSQL store for OGM.
[1] I still have not fully grokked this paradigm, so I may be missing
something, but... AFAICT even in this paradigm the listener/visitor rules
are defined in terms of the initial parse tree rules rather than more
[2] https://groups.google.com/forum/#!topic/antlr-discussion/hzF_YrzfDKo
Changelog file in Hibernate ORM
by Sanne Grinovero
The file changelog.txt in the root of the Hibernate ORM project seems outdated.
Is it not maintained anymore? I found it handy.
Sanne
JPA PessimisticLockScope.EXTENDED specs are not properly implemented
by Mihalcea Vlad
Hi,
According to JPA specification, when using PessimisticLockScope.EXTENDED:
In addition to the behavior for PessimisticLockScope.NORMAL, element collections and relationships owned by the entity that are contained in join tables will be locked if the javax.persistence.lock.scope property is specified with a value of PessimisticLockScope.EXTENDED. The state of entities referenced by such relationships will not be locked (unless those entities are explicitly locked). Locking such a relationship or element collection generally locks only the rows in the join table or collection table for that relationship or collection. This means that phantoms will be possible.
So if I define the following Parent/Child associations:

@Entity
public class Post {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;

    @ElementCollection
    @JoinTable(name = "post_comments", joinColumns = @JoinColumn(name = "post_id"))
    @OrderColumn(name = "comment_index")
    private List<Comment> comments = new ArrayList<Comment>();

    @Version
    private int version;

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public List<Comment> getComments() {
        return comments;
    }

    public final int getVersion() {
        return version;
    }

    public void addComment(Comment comment) {
        comments.add(comment);
    }
}
@Embeddable
public class Comment {

    private String review;

    public String getReview() {
        return review;
    }

    public void setReview(String review) {
        this.review = review;
    }
}
Because a Post has a collection of Comment components, an EXTENDED lock
request should lock the join table rows as well, without acquiring
row-level locks on the actual Comment rows:

Post post = entityManager.find(Post.class, parentId);
entityManager.lock(post, LockModeType.PESSIMISTIC_WRITE,
    Collections.singletonMap("javax.persistence.lock.scope", (Object) PessimisticLockScope.EXTENDED));
return null;
But the output SQL looks like this:

INFO [main]: n.t.d.l.CommonsQueryLoggingListener - Name:, Time:0, Num:1, Query:{[select post0_.id as id1_16_0_, post0_.name as name2_16_0_, post0_.version as version3_16_0_ from Post post0_ where post0_.id=?][1]}
INFO [main]: n.t.d.l.CommonsQueryLoggingListener - Name:, Time:1, Num:1, Query:{[select id from Post where id =? and version =? for update][1,0]}

So only the root entity was locked, when we should have locked the
associated post_comments rows as well.
Code available on GitHub. You have to run the HibernateCascadeLockComponentTest test.
This issue is also described on JIRA:
https://hibernate.atlassian.net/browse/HHH-9636
Vlad Mihalcea
[OGM] Releases
by Gunnar Morling
Hi,
As the first week of our OGM sprint ends, it's about time to do a release.
For 4.1.2 we definitely have enough to warrant a release. Sanne, Hardy, do
you guys have any nearly completed changes you'd like to see in? If so,
please let me know. If I don't hear back, I'll kick off the release early
in the afternoon.
With the number of issues remaining, I think a subsequent 4.1.3 release
would be a good move.
For 4.2 it makes sense to wait a bit IMO, so we can get in Davide's changes
around embeddables and a first cut of the error handling SPI.
Any thoughts?
Thanks,
--Gunnar
Branching strategy in OGM
by Sanne Grinovero
Hey all,
it seems like the branch for maintenance work on OGM 4.1 is (still)
called "master", while a branch "4.2" was created for future work.
I'd really prefer it the other way around: create a branch "4.1" to
host all changes which are needed to be backported on 4.1.x , and call
"master" what will receive all of the latest improvements.
Let's see on IRC when it is a good time to rename the branches? It had
better happen "atomically" or it's a mess...
Sanne
Is HSEARCH-1786/HSEARCH-1814 on the radar
by Marc Schipperheyn
Hi,
Just wondering if HSEARCH-1786/HSEARCH-1814 are on the radar? We have
stopped all of our upgrade efforts because of this. And if they are valid
bugs, it basically means that HSEARCH 5.0, and 5.01 are unreliable for
production use.`
I'm just wondering if it's considered a real bug, if so it's a blocker, or
perhaps a figment of imagination? I know I have made my share of erronous
bug reports...
Cheers,
Marc
Naming and "naming strategies"
by Steve Ebersole
As I am working on 5.0, one of the things I am trying to accomplish is to
make the handling of table/column names more consistent and better
defined. The first step in that is to properly define the terms used often
throughout the codebase.
The first level of naming is the "given" name of a table/column. The given
name might be:
* explicit - explicitly specified by the user, as in @Table(
name="explicit_name" )
* implicit - not explicitly specified by the user and thus implicitly
determined (by JPA rules, "naming strategy", etc).
Next, we have a logical name which is a normalized form of the "given"
name. This is the form used to reference tables/columns internally. E.g.,
its how we resolve @Column(..., table="xyz"). More on this form later.
Finally we have the physical name of the thing, which is the actual name of
the table/column in the database. Again, this is generally a normalization
of the given name based on Dialect, "naming strategy", etc.
Today, we have a very messy concept called a NamingStrategy. I say it is
messy because it tries to combine unrelated concerns. So I still plan to
split this as I have outlined elsewhere into:
1) ImplicitNamingStrategy
2) PhysicalNamingStrategy
Which brings up my first question to y'all. Do we need a contract for
LogicalNamingStrategy? As I have said, the logical names are the things
used to resolve references. Allowing people to plug in custom strategies
for how that normalization works could be very dangerous. But even more
than that, is it really interesting to be able to hook into that process?
Historically, these names are all represented by String. So I also propose
to shift this to use that Identifier class we developed for the metamodel
redesign. For those that may be unfamiliar, it essentially combines the
String name with a "quoted" boolean:
public class Identifier {
    private final String text;
    private final boolean isQuoted;
    ...
}
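For illustration, a slightly fleshed-out sketch of that class; the toIdentifier() backtick handling and the render() method are my assumptions for how it could work, not settled API:

```java
// Sketch of the Identifier value type: a name plus a "quoted" flag.
public class Identifier {

    private final String text;
    private final boolean isQuoted;

    public Identifier(String text, boolean isQuoted) {
        this.text = text;
        this.isQuoted = isQuoted;
    }

    // Assumption: backticks around a name (Hibernate's historical
    // convention) are interpreted as a request for quoting.
    public static Identifier toIdentifier(String text) {
        if (text == null) {
            return null;
        }
        String trimmed = text.trim();
        if (trimmed.length() > 1
                && trimmed.startsWith("`") && trimmed.endsWith("`")) {
            return new Identifier(trimmed.substring(1, trimmed.length() - 1), true);
        }
        return new Identifier(trimmed, false);
    }

    public String getText() { return text; }
    public boolean isQuoted() { return isQuoted; }

    // Renders with ANSI double quotes when quoting was requested.
    public String render() {
        return isQuoted ? '"' + text + '"' : text;
    }
}
```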
Table names, then, are an aggregation of 3 Identifiers: one for catalog,
one for schema, one for table name. Same for named constraints
(ultimately, which is part of a improvement for 6.0 to allow indexes,
constraints, etc to be created in a separate schema from tables).
Since a major goal for 5.0 is to continue to use the org.hibernate.mapping
package as the representation of the mapping information, we obviously want
to minimize changes there to only what is completely essential. To that
end, if we are going to use Identifier over String stuff in the
org.hibernate.mapping package will need to deal with both; internally they
will hold the Identifier and use that to implement the String-based
name-related methods they expose.
Lastly I wanted to discuss the details of the logical names. For tables,
we currently qualify the table name with the catalog/schema info. There is
a mismatch in this regard when it comes to remaining a pure JPA
implementation. Consider @Column( ..., table="some_table"). Ultimately we
need to be able to qualify that with catalog/schema in order to be able to
construct a matching logical name (to be able to pair that with the
referenced org.hibernate.mapping.Table later). This is trivial when table
names are unique across all the catalogs/schemas (when there is only one
"some_table" in all the mapped catalogs/schemas). But it poses a problem
when the same table name is used from different schemas (e.g., when
"some_table" is mapped from both "schema1" and "schema2"). So we have a
choice. Since JPA does not say what is legal/illegal for the @Column.table
attribute, it is feasible for us to allow @Column.table to contain the
catalog/schema information in these cases as a selector. The only other
option I can see is to define a limitation that says that a table name must
be unique for a given entity across all catalogs/schemas. I don't think
that is generally a restrictive limitation. What are y'alls thoughts?
Perhaps this is one argument for allowing pluggable LogicalNamingStrategy?
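To make the selector idea concrete, here is a hypothetical sketch of resolving a possibly-qualified @Column.table value against the mapped logical names (the table, schema, and value strings are all invented):

```java
import java.util.HashMap;
import java.util.Map;

public class TableSelector {

    // Stand-in for the registry of logical table names, keyed by their
    // qualified form; values represent the mapped Table objects.
    static final Map<String, String> LOGICAL_TABLES = new HashMap<>();
    static {
        LOGICAL_TABLES.put("schema1.some_table", "Table(schema1.some_table)");
        LOGICAL_TABLES.put("schema2.some_table", "Table(schema2.some_table)");
    }

    // An unqualified @Column.table value falls back to the default
    // schema; a qualified one acts as a selector between same-named
    // tables in different schemas.
    static String resolve(String selector, String defaultSchema) {
        String qualified = selector.indexOf('.') >= 0
                ? selector
                : defaultSchema + "." + selector;
        return LOGICAL_TABLES.get(qualified);
    }

    public static void main(String[] args) {
        System.out.println(resolve("some_table", "schema1"));
        System.out.println(resolve("schema2.some_table", "schema1"));
    }
}
```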
[OGM] Sprint organization
by Emmanuel Bernard
I have walked through all of the issues and sorted what I think should
be done.
## Plan of action
Within these 3 weeks, we will do:
* 4.1.2 (bug fixes) -> code freeze Friday 27th
* 4.2 Alpha1 -> code freeze Friday 27th
* 4.2 Alpha2 or Beta we will see -> code freeze Friday 6th
* 4.2 Beta or CR we will see -> code freeze Wed 11th
The rookie team + me will work on the 4.1.2 tagged issues.
This work will continue even past the release of 4.1.2 but will be
likely incorporated into the 4.2 branch.
Davide and Gunnar will work on the 4.2 tagged issues.
It might make sense to have a single one focus on the numerous query
improvements and the other on the rest. See below.
Take the top issue of the list ordered by decreasing priority or one
with the same level of priority. Kill it. Move to the next.
## Priorities
For 4.1.2, I have sorted issues by decreasing importance with one little
twist: every issue created after direct user feedback is marked blocker.
This is to speed up our feedback loop.
https://hibernate.atlassian.net/browse/OGM-740?jql=project%20%3D%20OGM%20...
For 4.2, it is also sorted by decreasing importance but I also gave the
release a theme.
The main focus is significantly improving JP-QL queries.
Secondary and must have target is the error report API.
Tertiary is everything else. That includes some improvements in ORM itself to unlock us.
There are too many tasks to address them all but let's first focus on
making a dent up to the major priority level.
Remarks?
Emmanuel
[OGM] Moving to Java 7
by Gunnar Morling
Hi,
So far Hibernate OGM has been usable quite well when running on Java 6. But
as more and more of our dependencies require Java 7, the number of cases
where Java 6 is sufficient decreases.
With two out of four backends (Infinispan and Neo4j) as well as HSEARCH
requiring 7 by now, I think it's about time to move on and take advantage
of 7 ourselves as well.
Are there any objections?
Thanks,
--Gunnar