[dna-dev] Several interesting InfoQ articles about JCR

Thu Jun 12 21:25:13 EDT 2008

Indeed - well good to hear.

On Fri, Jun 13, 2008 at 11:21 AM, Randall Hauch <rhauch at redhat.com> wrote:
> I saw this within my first tests of Jackrabbit.  I created several load
> tests to create wide, deep, and wide+deep trees (of varying sizes).  Once I
> knew my tests worked correctly, I cranked up the sizes and waited.  And
> waited.  And waited some more.  The times for the even moderately wide trees
> were unimpressive.
>
> Sometimes it's nice to not be first - we're very lucky to be able to learn
> from their mistakes. :-)
>
>
> On Jun 12, 2008, at 8:15 PM, Michael Neale wrote:
>
>> That's great! Yes, Jackrabbit has some serious limitations in its
>> persistence. It seems to work OK for deep trees, but even so, there
>> are architectural challenges with this.
>>
>> It sounds like you guys have seen this and are tackling it head on -
>> which is great news! Keep up the good work!
>>
>> On Fri, Jun 13, 2008 at 10:46 AM, Randall Hauch <rhauch at redhat.com> wrote:
>>>
>>> DNA is specifically addressing this in a couple of ways.  First, the core
>>> interfaces inside the federation engine address this by using iterators
>>> and providing certain guarantees about using those iterators (e.g., within
>>> the same transaction).  This means that the federation engine will not
>>> require all children to be in memory.  We may also provide more optimized
>>> behaviors, such as an explicit paging mechanism for accessing the children
>>> of a node.
>>>
>>> Second, we'll be developing a connector
>>> (http://jira.jboss.org/jira/browse/DNA-40) that stores its information in
>>> a relational database, and this will not be storing all children in one
>>> chunk.  (Jackrabbit does this, so if you add a 1001st child node, the whole
>>> list of 1001 children has to be written to the store.  Unfortunately for
>>> them, this is designed into the foundation for their persistence layer.)
>>>
>>> On Jun 12, 2008, at 6:45 PM, Michael Neale wrote:
>>>
>>>> I guess I am thinking of the case of a large volume of small
>>>> "children" nodes - specifically. As you say, it can be a problem -
>>>> will DNA help with this at all?
>>>>
>>>> On Fri, Jun 13, 2008 at 3:58 AM, Randall Hauch <rhauch at redhat.com> wrote:
>>>>>
>>>>> You're right that JCR handles heterogeneous data better than almost
>>>>> anything else, especially when the information structure changes/evolves
>>>>> over time.
>>>>>
>>>>> And I thought the InfoQ architecture was brilliant - use multiple
>>>>> independent JCRs for infrequently changing data, eliminating the need to
>>>>> create/maintain/scale a cluster.  Very elegant and simple solution.  And
>>>>> in this particular case, it doesn't really matter if there is a slight
>>>>> difference in content among the different machines during the push of new
>>>>> information to the independent repositories.
>>>>>
>>>>> But can you elaborate on your thought that JCR might not be useful for
>>>>> transactional data?
>>>>>
>>>>> IMO, JCR is useful in a lot of situations, and of course it is limited in
>>>>> others.  Right now, the implementations don't do clustering or very large
>>>>> repositories well.  Most impls also seem to be limited in the efficient
>>>>> handling of large numbers of children for any given node.  Incorporation
>>>>> of information outside of JCR is also difficult, as it has to be done
>>>>> above JCR - although DNA will change this.  But I'm not sure that
>>>>> frequently changing data is universally a limitation.  Perhaps frequent
>>>>> additions of large volumes of data are a problem because you quickly get
>>>>> to volumes of data that are too large.  Or frequent changes to data may be
>>>>> a problem if versioning is used, as it could quickly lead to unusable
>>>>> numbers of versions.
>>>>>
>>>>>
>>>>> On Jun 10, 2008, at 9:45 PM, Michael Neale wrote:
>>>>>
>>>>>> JCR seems to have a lot of traction, I have noticed. It certainly seems
>>>>>> to be the default choice now for heterogeneous data. And data is
>>>>>> increasingly heterogeneous.
>>>>>>
>>>>>> I guess my only thoughts on it are its limitations: should JCR *not*
>>>>>> be used for transactional data - i.e., feeds of incoming data that change
>>>>>> often?
>>>>>>
>>>>>>> On Wed, Jun 11, 2008 at 12:11 PM, Randall Hauch <rhauch at redhat.com> wrote:
>>>>>>>
>>>>>>> There have been a couple of recent articles on InfoQ about JCR and/or
>>>>>>> REST.  In case you haven't seen them, they're all worth a good read.
>>>>>>>
>>>>>>> Interview with David Nuescheler, from Day Software:
>>>>>>> http://www.infoq.com/articles/nuescheler-jcr-rest
>>>>>>> InfoQ architecture and use of JCR:
>>>>>>> http://www.infoq.com/presentations/design-and-architecture-of-infoq
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Randall
>>>>>>> _______________________________________________
>>>>>>> dna-dev mailing list
>>>>>>> dna-dev at lists.jboss.org
>>>>>>> https://lists.jboss.org/mailman/listinfo/dna-dev
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Michael D Neale
>>>>>> home: www.michaelneale.net
>>>>>> blog: michaelneale.blogspot.com

-- 
Michael D Neale
home: www.michaelneale.net
blog: michaelneale.blogspot.com