[hibernate-dev] Re: Hibernate Lucene - forum issue

Ales Justin ales.justin at gmail.com
Mon Oct 16 07:23:25 EDT 2006


>> What is DynamicBoost?


I use it to compute the boost at runtime rather than using a static
boost number, since my boost can depend on the current state of
different resources.

For example, I need new articles to be ranked higher than old ones.

public @interface DynamicBoost {

   float coef() default 1.0f;

   String transformer() default "number";

   String limit() default "identity";

   String formula() default "quotient";

   String query() default "";

   boolean isHQL() default true;

}

   public Float getBoost(Object value, DynamicBoost db) {
       Transformer t = transformers.get(db.transformer());
       double fd = t.toDouble(value);
       Limitizer limitizer = limitizers.get(db.limit());
       double limit = limitizer.limit(value, db, t);
       BoostFormula formula = formulas.get(db.formula());
       return formula.calculate(db, fd, limit);
   }
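
To make the idea concrete, here is a minimal, self-contained sketch of a
runtime boost that shrinks as a value (say, an article's age in days)
grows past a limit. It is not the actual implementation: the class name
DynamicBoostSketch and the meaning given to the "quotient" formula
(limit / value, capped at 1) are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleBinaryOperator;

// Hypothetical sketch: a boost computed at index time from an entity value.
// Uses Java 8 lambdas for brevity.
class DynamicBoostSketch {

    // Formulas keyed by name, mirroring the annotation's formula() attribute.
    private final Map<String, DoubleBinaryOperator> formulas = new HashMap<>();

    DynamicBoostSketch() {
        // ASSUMPTION: "quotient" means limit / value, capped at 1,
        // so values within the limit get the full coefficient.
        formulas.put("quotient", (value, limit) -> value <= limit ? 1.0 : limit / value);
        // A formula that ignores the value entirely (static boost).
        formulas.put("identity", (value, limit) -> 1.0);
    }

    // Returns coef scaled by the named formula applied to (value, limit).
    float boost(String formula, float coef, double value, double limit) {
        DoubleBinaryOperator f = formulas.get(formula);
        if (f == null) {
            throw new IllegalArgumentException("Unknown formula: " + formula);
        }
        return (float) (coef * f.applyAsDouble(value, limit));
    }
}
```

With a limit of 7 days and coef 2.0, a 3-day-old article keeps the full
boost of 2.0, while a 14-day-old one drops to 1.0.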

>> I fixed the concurrency issues in the Lucene event a while back
>> using a reentrant lock per DirectoryProvider.

:-)
Doing the same thing.

public class SynchLuceneEventListener implements
PostDeleteEventListener, PostInsertEventListener,
       PostUpdateEventListener, Initializable {

   private Map<String, ExtendedDocumentBuilder> documentBuilders = new
HashMap<String, ExtendedDocumentBuilder>();
   private Map<File, IndexWriter> tmpWriterMap = new TreeMap<File,
IndexWriter>();

   private boolean initialized;

   protected final Log log = LogFactory.getLog(getClass());

   /**
    * Using JNDI to access DynamicBooster instance.
    * Can have problems with serializability.
    * @see JBossSynchLuceneEventListener for MBean implementation
    */
   protected DynamicBooster lookupDynamicBooster(Properties properties)
throws Exception {
       String lookupName = DynamicBooster.JNDI_NAME;
       String boosterJndiName =
properties.getProperty(DynamicBooster.PROPERTY_JNDI_NAME);
       if (boosterJndiName != null) {
           lookupName = boosterJndiName;
       }
       Context jndiContext = NamingHelper.getInitialContext(properties);
       return (DynamicBooster)jndiContext.lookup(lookupName);
   }

   /**
    * No need to synchronize, since this is only initialized once.
    * When used in multiple .par files, they are initialized sequentially.
    */
   public void initialize(Configuration cfg) {
       if (initialized) return;

       try {
           final Properties properties = cfg.getProperties();
           DirectoryLocker directoryLocker = DirectoryLocker.getInstance();
           directoryLocker.initialize(properties);
           DynamicBooster dynamicBooster =
lookupDynamicBooster(properties);

           Iterator iter = cfg.getClassMappings();
           while (iter.hasNext()) {
               PersistentClass pc = (PersistentClass)iter.next();
               Class mappedClass = pc.getMappedClass();
               if (mappedClass != null) {
                   if (mappedClass.getAnnotation(Indexed.class) != null) {
                       String entityName = pc.getEntityName();
                       final ExtendedDocumentBuilder documentBuilder =
new ExtendedDocumentBuilder(entityName, mappedClass);
                       documentBuilder.setDynamicBooster(dynamicBooster);
                       documentBuilders.put(entityName, documentBuilder);
                       File file = documentBuilder.getFile();
                       try {
                           IndexWriter iw = tmpWriterMap.get(file);
                           if (iw == null) {
                               boolean create = !file.exists();
                               SynchDirectory sdir =
DirectoryLocker.getInstance().getDirectory(file, create);
                               try {
                                   iw = new IndexWriter(sdir,
documentBuilder.getAnalyzer(), create);
                               } finally {
                                   sdir.unlock();
                               }
                               tmpWriterMap.put(file, iw);
                           }
                       } catch (IOException ioe) {
                           throw new HibernateException(ioe);
                       }
                       log.info("index: " +
documentBuilder.getFile().getAbsolutePath() + " - " + entityName);
                   }
               }
           }
           initialized = true;
       } catch (Exception e) {
           throw new HibernateException(e);
       } finally {
           for (IndexWriter iw : tmpWriterMap.values()) {
               LuceneUtils.close(iw);
           }
           tmpWriterMap.clear();
           tmpWriterMap = null;
       }
   }

   public void onPostInsert(PostInsertEvent event) {
       final Object entity = event.getEntity();
       ExtendedDocumentBuilder builder =
documentBuilders.get(event.getPersister().getEntityName());
       if (builder != null) {
           add(entity, builder, event.getId());
       }
   }

   public void onPostUpdate(PostUpdateEvent event) {
       final Object entity = event.getEntity();
       ExtendedDocumentBuilder builder =
documentBuilders.get(event.getPersister().getEntityName());
       if (builder != null) {
           final Serializable id = event.getId();
           remove(builder, id);
           add(entity, builder, id);
       }
   }

   public void onPostDelete(PostDeleteEvent event) {
       ExtendedDocumentBuilder builder =
documentBuilders.get(event.getPersister().getEntityName());
       if (builder != null) {
           remove(builder, event.getId());
       }
   }

   private void remove(ExtendedDocumentBuilder builder, Serializable id) {
       Term idTerm = builder.getTerm(id);
       File file = builder.getFile();
       log.info("removing: " + idTerm + ", " + file);
       try {
           IndexReader reader = null;
           SynchDirectory sdir =
DirectoryLocker.getInstance().getDirectory(file);
           try {
               reader = IndexReader.open(sdir);
               reader.delete(idTerm);
           } finally {
               LuceneUtils.close(reader);
               sdir.unlock();
           }
       } catch (IOException ioe) {
           throw new HibernateException(ioe);
       }
   }

   private void add(final Object entity, final ExtendedDocumentBuilder
builder, final Serializable id) {
       Document doc = builder.getDocument(entity, id);
       File file = builder.getFile();
       Analyzer analyzer = builder.getAnalyzer();
       log.info("adding: " + doc + ", " + file + ", " +
analyzer.getClass().getName());
       try {
           IndexWriter writer = null;
           SynchDirectory sdir =
DirectoryLocker.getInstance().getDirectory(file);
           try {
               writer = new IndexWriter(sdir, analyzer, false);
               writer.addDocument(doc);
           } finally {
               LuceneUtils.close(writer);
               sdir.unlock();
           }
       } catch (IOException ioe) {
           throw new HibernateException(ioe);
       }
   }

}

public class SynchDirectory extends Directory {

   private static final Log log = LogFactory.getLog(SynchDirectory.class);
   private Directory directory;
   private String name;
   private java.util.concurrent.locks.Lock lock;

   public SynchDirectory(Directory directory) {
       this.directory = directory;
       lock = new ReentrantLock();
   }

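   void lock(String name) {
       log.debug("Locking synch directory: " + name);
       lock.lock();
       // Record the name only once the lock is held, so a waiting thread
       // cannot overwrite the current holder's name.
       this.name = name;
   }

   public void unlock() {
       // Log and clear the name while still holding the lock.
       log.debug("Unlocking synch directory: " + name);
       this.name = null;
       lock.unlock();
   }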

   public String[] list() throws IOException {
       return directory.list();
   }

   public boolean fileExists(String name) throws IOException {
       return directory.fileExists(name);
   }

   public long fileModified(String name) throws IOException {
       return directory.fileModified(name);
   }

   public void touchFile(String name) throws IOException {
       directory.touchFile(name);
   }

   public void deleteFile(String name) throws IOException {
       directory.deleteFile(name);
   }

   public void renameFile(String from, String to) throws IOException {
       directory.renameFile(from, to);
   }

   public long fileLength(String name) throws IOException {
       return directory.fileLength(name);
   }

   public OutputStream createFile(String name) throws IOException {
       return directory.createFile(name);
   }

   public InputStream openFile(String name) throws IOException {
       return directory.openFile(name);
   }

   public Lock makeLock(String name) {
       return directory.makeLock(name);
   }

   public void close() throws IOException {
       directory.close();
   }

}
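
The DirectoryLocker used above is not shown here. As a rough sketch of
the "one reentrant lock per index directory" idea, something like the
following could work. The class name DirectoryLockRegistry and its
methods are hypothetical and are not the real DirectoryLocker API.

```java
import java.io.File;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of a per-index-directory lock registry.
class DirectoryLockRegistry {

    // One lock per index directory, created lazily and never replaced.
    private final ConcurrentMap<File, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Returns the directory's lock, already acquired by the calling thread.
    ReentrantLock acquire(File indexDir) {
        ReentrantLock lock = locks.computeIfAbsent(indexDir, f -> new ReentrantLock());
        lock.lock();
        return lock;
    }

    // Runs the action while holding the directory's lock; always releases it.
    void withLock(File indexDir, Runnable action) {
        ReentrantLock lock = acquire(indexDir);
        try {
            action.run();
        } finally {
            lock.unlock();
        }
    }
}
```

Because the lock is reentrant, an update that does remove-then-add on
the same thread could take the lock once around both steps without
deadlocking on the nested acquisitions.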

>> What is additional POJO handling?

I'm using a predetermined HQL query to select only the Lucene-indexable
properties and associations, minimizing what is pulled out when running
a full reindex.

public @interface LuceneReadQuery {

   String value();

}

@LuceneReadQuery("select new Article(a.id, a.title, a.body, a.intro, "
        + "a.subtitle, a.source.name, a.category.id, a.publishDate, a.locale) "
        + "from Article a")


I also need different Analyzer instances for different POJOs: some are
localizable, some are plain English text, some are just a bunch of
numbers, ...

public @interface Analyzer {

   Class<? extends org.apache.lucene.analysis.Analyzer> analyzerClass();

}
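
For an annotation like this to be readable through reflection, it has
to be declared with runtime retention. The sketch below shows one way
the analyzer class could be resolved per entity, with a fallback when
the annotation is absent. TextAnalyzer, UseAnalyzer, AnalyzerResolver
and the sample entities are hypothetical stand-ins so the example
compiles without Lucene on the classpath.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Stand-in for org.apache.lucene.analysis.Analyzer (hypothetical).
interface TextAnalyzer {}
class DefaultAnalyzer implements TextAnalyzer {}
class NumericAnalyzer implements TextAnalyzer {}

// RUNTIME retention is required for Class.getAnnotation to see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface UseAnalyzer {
    Class<? extends TextAnalyzer> analyzerClass();
}

@UseAnalyzer(analyzerClass = NumericAnalyzer.class)
class Invoice {}

class PlainNote {}

class AnalyzerResolver {

    // Instantiates the annotated analyzer, or DefaultAnalyzer if none is declared.
    static TextAnalyzer resolve(Class<?> entityClass) {
        UseAnalyzer ann = entityClass.getAnnotation(UseAnalyzer.class);
        Class<? extends TextAnalyzer> type =
                (ann != null) ? ann.analyzerClass() : DefaultAnalyzer.class;
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + type.getName(), e);
        }
    }
}
```

This assumes every analyzer class has a no-argument constructor;
analyzers that need configuration (a locale, stop words) would need a
different factory.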

I had some issues when different POJOs were indexed in different files;
I needed them to be in the same one. OK, you can do this by setting the
index attribute.
But I also needed to handle identity, so that based on a given result I
could pull the right POJO out of the Session.

           doc.add(Field.Keyword(ENTITY_FIELD_NAME, entityName));
           doc.add(Field.Keyword(IDENTITY_FIELD_NAME,
createIdentityString(id)));

   private String createIdentityString(Serializable id) {
       return (entityName + "#" + id);
   }

                   //possible identifying fields
                   String entityName =
doc.get(ExtendedDocumentBuilder.ENTITY_FIELD_NAME);

                   if (entityName != null) {
                       // todo - currently expecting only integer id's
                       Object entity = session.get(
                               entityName,
                               Integer.parseInt(doc.get(ExtendedDocumentBuilder.ID_FIELD_NAME))
                       );
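
As a side note on the "entityName#id" identity string: it can also be
split back into its two parts, which might be handy when only the
identity field is at hand. A minimal sketch, assuming the id itself
never contains '#'; the helper class IdentityStrings is hypothetical
and not part of the code above.

```java
// Hypothetical helper: builds and parses "entityName#id" identity strings.
final class IdentityStrings {

    private IdentityStrings() {}

    // Same format as createIdentityString: entity name, '#', then the id.
    static String create(String entityName, Object id) {
        return entityName + "#" + id;
    }

    // Everything before the last '#'.
    static String entityNameOf(String identity) {
        return identity.substring(0, separator(identity));
    }

    // Everything after the last '#', still as a string.
    static String idOf(String identity) {
        return identity.substring(separator(identity) + 1);
    }

    private static int separator(String identity) {
        int i = identity.lastIndexOf('#');
        if (i < 0) {
            throw new IllegalArgumentException("Not an identity string: " + identity);
        }
        return i;
    }
}
```

The id comes back as a string, so the caller still has to convert it to
the entity's real identifier type (Integer.parseInt in the integer-only
case above).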


>> PS I remember an old JIRA issue of you talking about a filter
>> capability. I think it's a nice idea, would probably make sense after
>> the core work is stable.

Yep.
I have a @Permission annotation, used in the following way:

           Permission p = currClass.getAnnotation(Permission.class);
           if (p != null) {
               permissions.add(p.permission());
           }

           Permissions ps = currClass.getAnnotation(Permissions.class);
           if (ps != null) {
               permissions.addAll(Arrays.asList(ps.permission()));
           }


       if (!permissions.isEmpty()) {
           doc.add(Field.UnStored(
                   PERMISSION_FIELD_NAME,
                   StringHelper.join(",", permissions.iterator()))
           );
       }

And this filter takes care of applying permissions to Lucene search.

public class SinglePermissionFilter extends Filter {

   private String permission;

   public SinglePermissionFilter() {
   }

   public SinglePermissionFilter(String permission) {
       this.permission = permission;
   }

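   public BitSet bits(IndexReader reader) throws IOException {
       BitSet bits = new BitSet(reader.maxDoc());
       String fieldName = ExtendedDocumentBuilder.PERMISSION_FIELD_NAME;
       TermDocs td = reader.termDocs(new Term(fieldName, permission));
       try {
           // Mark every document that carries the permission term.
           while (td.next()) {
               bits.set(td.doc());
           }
       } finally {
           // Release the term enumeration even if iteration fails.
           td.close();
       }
       return bits;
   }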

   public String getPermission() {
       return permission;
   }

   public void setPermission(String permission) {
       this.permission = permission;
   }

}

Executing search with permission filter:

           Hits hits = searcher.search(sqh.getQuery(), permissionFilter);
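
The filter above handles a single permission. For a user holding
several permissions, one option would be to OR the per-permission bit
sets together. The sketch below shows just that bit-set logic in plain
Java: the postings map is a stand-in for what reader.termDocs(...)
would enumerate, and the class PermissionBits is hypothetical.

```java
import java.util.BitSet;
import java.util.Collection;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: union of the documents visible under each permission.
class PermissionBits {

    // postings maps a permission term to the ids of documents carrying it.
    static BitSet visibleDocs(Map<String, List<Integer>> postings,
                              Collection<String> userPermissions,
                              int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (String permission : userPermissions) {
            List<Integer> docs = postings.get(permission);
            if (docs == null) {
                continue; // no document carries this permission
            }
            for (int doc : docs) {
                bits.set(doc); // setting an already-set bit is harmless
            }
        }
        return bits;
    }
}
```

A document is visible if it matches at least one of the user's
permissions; requiring all of them would mean AND-ing the sets instead.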


OK, if I remember any other issues I had, I'll email them. :-)


Rgds, Ales


Emmanuel Bernard wrote:
> Hi Ales
>
> Yes let's move that to the mailing list
>
> What is DynamicBoost?
> I've added support for @Boost on both attributes and entity
> I have also introduced a Bridge notion that does the translation between
> the property and the Lucene field. I'm almost done with that one. This
> is very much like the Hibernate Type in its flexibility and I'll add
> some specific annotations support for numeric padding and date resolution
>
> I fixed the concurrency issues in the Lucene event a while back using
> a reentrant lock per DirectoryProvider. Did you check the code in
> http://anonsvn.jboss.org/repos/hibernate/branches/Lucene_Integration/
> I'm interested in seeing some additional issue if any.
>
> What is additional POJO handling?
>
> PS I remember an old JIRA issue of you talking about a filter
> capability. I think it's a nice idea, would probably make sense after
> the core work is stable.
>
> Ales Justin wrote:
>> Hey,
>>
>> I extended the initial usage quite a lot - needed some additional stuff.
>> Some additional annotations - Boost, DynamicBoost, DateField, ...
>> Rewritten current LuceneListener - had problems with concurrency,
>> additional pojo handling, ...
>>
>> I can sum up some stuff (non customized stuff) and send it - if
>> you're interested.
>> If so - to you or to the mailing list?
>>
>> Rgds, Ales
>>
>> ps: coming to JBW Berlin?
>