[gatein-issues] [JBoss JIRA] Commented: (GTNPORTAL-1870) Login time of user "root" depends on number of users in DB

Tue Apr 26 03:18:18 EDT 2011

    [ https://issues.jboss.org/browse/GTNPORTAL-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597870#comment-12597870 ] 

Boleslaw Dawidowicz commented on GTNPORTAL-1870:
------------------------------------------------

Without looking IIRC it will probably require adding new specific SPI method to avoid unnecessary queries. Probably won't have time to dig much this week but we'll try to look soon. 

> Login time of user "root" depends on number of users in DB
> ----------------------------------------------------------
>
>                 Key: GTNPORTAL-1870
>                 URL: https://issues.jboss.org/browse/GTNPORTAL-1870
>             Project: GateIn Portal
>          Issue Type: Bug
>      Security Level: Public(Everyone can see) 
>          Components: Identity integration, Performance
>    Affects Versions: 3.1.0-GA
>         Environment: EPP 5.1.0.GA
>            Reporter: Marek Posolda
>            Assignee: Boleslaw Dawidowicz
>             Fix For: 3.2.0-GA
>
>
> When we have 100000 users in DB, then login of user root took me 3 minutes on my laptop. With million users in DB is login of user root impossible . The cause is that method GroupDAOImpl.getAllGroups() is called for user "root" and CPU time of this method depends on number of users in database. This method has 2 possible bottlenecks:
> 1) Method GroupDAOImpl.getAllGroups() is internally running RelationshipManagerImpl.findAssociatedGroups for root group, which in next turn means recursive call of method RelationshipManagerImpl.findAssociatedGroups for each group in DB. This method is sending SQL, which is looking for all children under specified group. Problem is that these children can be both users or other groups and users are then filtered and removed from final result by algorithm at the end of method RelationshipManagerImpl.findAssociatedGroups(). 
> When having 1000000 users and method  RelationshipManagerImpl.findAssociatedGroups() is called for IDM group equivalent to "/platform/users", then we have 1000000 objects in result from Hibernate call. And this is the cause of bottleneck. This SQL is called (4 is ID of group "users" in table "jbid_io") :
> select distinct hibernatei1_.ID as ID4_, hibernatei1_.IDENTITY_TYPE as IDENTITY2_4_, hibernatei1_.NAME as NAME4_, hibernatei1_.REALM as REALM4_ from jbid_io_rel hibernatei0_ inner join jbid_io hibernatei1_ on hibernatei0_.TO_IDENTITY=hibernatei1_.ID, jbid_io_rel_type hibernatei3_ where hibernatei0_.REL_TYPE=hibernatei3_.ID and (hibernatei1_.NAME like '%') and hibernatei3_.NAME='JBOSS_IDENTITY_MEMBERSHIP' and hibernatei0_.FROM_IDENTITY=4;
> Running of this SQL is expensive and another thing, which I am seeing from profiling, is that big amount of CPU time is also spend by caching of this result in Hibernate Query Cache (which is very bad for memory as well because it needs to cache 1000000 HibernateIdentityObject instances with all users).
> 2) Second thing in GroupDAOImpl.getAllGroups() is calling of method convertGroup(), which needs to call DB requests to obtain attributes for concrete group. Hibernate is using these selects (for group with ID 1) for obtain attributes:
> select attributes0_.IDENTITY_OBJECT_ID as IDENTITY2_4_1_, attributes0_.ATTRIBUTE_ID as ATTRIBUTE1_1_, attributes0_.ATTRIBUTE_ID as ATTRIBUTE1_9_0_, attributes0_.IDENTITY_OBJECT_ID as IDENTITY2_9_0_, attributes0_.NAME as NAME9_0_, attributes0_.ATTRIBUTE_TYPE as ATTRIBUTE4_9_0_, attributes0_.BIN_VALUE_ID as BIN5_9_0_ from jbid_io_attr attributes0_ where attributes0_.IDENTITY_OBJECT_ID in (select hibernatei1_.ID from jbid_io_rel hibernatei0_ inner join jbid_io hibernatei1_ on hibernatei0_.TO_IDENTITY=hibernatei1_.ID, jbid_io_rel_type hibernatei3_ where hibernatei0_.REL_TYPE=hibernatei3_.ID and (hibernatei1_.NAME like '%') and hibernatei3_.NAME='JBOSS_IDENTITY_MEMBERSHIP' and hibernatei0_.FROM_IDENTITY=1);
> select textvalues0_.TEXT_ATTR_VALUE_ID as TEXT1_9_0_, textvalues0_.ATTR_VALUE as ATTR2_0_ from jbid_io_attr_text_values textvalues0_ where textvalues0_.TEXT_ATTR_VALUE_ID in (select attributes0_.ATTRIBUTE_ID from jbid_io_attr attributes0_ where attributes0_.IDENTITY_OBJECT_ID in (select hibernatei1_.ID from jbid_io_rel hibernatei0_ inner join jbid_io hibernatei1_ on hibernatei0_.TO_IDENTITY=hibernatei1_.ID, jbid_io_rel_type hibernatei3_ where hibernatei0_.REL_TYPE=hibernatei3_.ID and (hibernatei1_.NAME like '%') and hibernatei3_.NAME='JBOSS_IDENTITY_MEMBERSHIP' and hibernatei0_.FROM_IDENTITY=1));
> When profiling, I am seeing that these selects are slow because attributes of groups are in same DB table as attributes of user ( When we have million users and each user has 7 attributes, then we have around 7000000 items in tables "jbid_io_attr" and in table "jbid_io_attr_text_values" )

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira