[hibernate-dev] Re: Hibernate Lucene - forum issue
Ales Justin
ales.justin at gmail.com
Mon Oct 16 07:23:25 EDT 2006
>> What is DynamicBoost?
I use it to compute the boost at runtime rather than relying on a static
boost number, since my boost can depend on the current state of different
resources. For example, I need new articles to be ranked higher than
old ones.
public @interface DynamicBoost {
float coef() default 1.0f;
String transformer() default "number";
String limit() default "identity";
String formula() default "quotient";
String query() default "";
boolean isHQL() default true;
}
public Float getBoost(Object value, DynamicBoost db) {
Transformer t = transformers.get(db.transformer());
double fd = t.toDouble(value);
Limitizer limitizer = limitizers.get(db.limit());
double limit = limitizer.limit(value, db, t);
BoostFormula formula = formulas.get(db.formula());
return formula.calculate(db, fd, limit);
}
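As a rough illustration of the idea (not the actual DynamicBooster code), an age-based boost could look like the sketch below. The class name, the age-in-days input and the exact "quotient" formula are assumptions made for this example only.

```java
// Hypothetical sketch: the class name, the age-in-days input and the
// formula are assumptions for illustration, not the real DynamicBooster.
class DynamicBoostSketch {

    // "quotient" style formula: the result shrinks as the value grows
    // relative to the limit, and equals 1.0 when the value is 0.
    static double quotient(double value, double limit) {
        return limit / (limit + value);
    }

    // coef scales the result, in the spirit of the coef() attribute.
    static float boost(float coef, double ageInDays, double limitInDays) {
        if (limitInDays <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        return (float) (coef * quotient(ageInDays, limitInDays));
    }

    public static void main(String[] args) {
        System.out.println(boost(2.0f, 0, 30));   // brand-new article
        System.out.println(boost(2.0f, 30, 30));  // 30-day-old article
    }
}
```

With these assumed numbers a brand-new article gets a boost of 2.0 and a 30-day-old one gets 1.0, so newer articles are pushed higher.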
>> I fixed the concurrency issues in the Lucene event a while back
using a reentrant lock per DirectoryProvider.
:-)
Doing the same thing.
public class SynchLuceneEventListener implements
PostDeleteEventListener, PostInsertEventListener,
PostUpdateEventListener, Initializable {
private Map<String, ExtendedDocumentBuilder> documentBuilders = new
HashMap<String, ExtendedDocumentBuilder>();
private Map<File, IndexWriter> tmpWriterMap = new TreeMap<File,
IndexWriter>();
private boolean initialized;
protected final Log log = LogFactory.getLog(getClass());
/**
* Using JNDI to access DynamicBooster instance.
* Can have problems with serializability.
* @see JBossSynchLuceneEventListener for MBean implementation
*/
protected DynamicBooster lookupDynamicBooster(Properties properties)
throws Exception {
String lookupName = DynamicBooster.JNDI_NAME;
String boosterJndiName =
properties.getProperty(DynamicBooster.PROPERTY_JNDI_NAME);
if (boosterJndiName != null) {
lookupName = boosterJndiName;
}
Context jndiContext = NamingHelper.getInitialContext(properties);
return (DynamicBooster)jndiContext.lookup(lookupName);
}
/**
* No need to synchronize, since this is initialized only once.
* When used in multiple .par files, they are initialized sequentially.
*/
public void initialize(Configuration cfg) {
if (initialized) return;
try {
final Properties properties = cfg.getProperties();
DirectoryLocker directoryLocker = DirectoryLocker.getInstance();
directoryLocker.initialize(properties);
DynamicBooster dynamicBooster =
lookupDynamicBooster(properties);
Iterator iter = cfg.getClassMappings();
while (iter.hasNext()) {
PersistentClass pc = (PersistentClass)iter.next();
Class mappedClass = pc.getMappedClass();
if (mappedClass != null) {
if (mappedClass.getAnnotation(Indexed.class) != null) {
String entityName = pc.getEntityName();
final ExtendedDocumentBuilder documentBuilder =
new ExtendedDocumentBuilder(entityName, mappedClass);
documentBuilder.setDynamicBooster(dynamicBooster);
documentBuilders.put(entityName, documentBuilder);
File file = documentBuilder.getFile();
try {
IndexWriter iw = tmpWriterMap.get(file);
if (iw == null) {
boolean create = !file.exists();
SynchDirectory sdir =
DirectoryLocker.getInstance().getDirectory(file, create);
try {
iw = new IndexWriter(sdir,
documentBuilder.getAnalyzer(), create);
} finally {
sdir.unlock();
}
tmpWriterMap.put(file, iw);
}
} catch (IOException ioe) {
throw new HibernateException(ioe);
}
log.info("index: " +
documentBuilder.getFile().getAbsolutePath() + " - " + entityName);
}
}
}
initialized = true;
} catch (Exception e) {
throw new HibernateException(e);
} finally {
for (IndexWriter iw : tmpWriterMap.values()) {
LuceneUtils.close(iw);
}
tmpWriterMap.clear();
tmpWriterMap = null;
}
}
public void onPostInsert(PostInsertEvent event) {
final Object entity = event.getEntity();
ExtendedDocumentBuilder builder =
documentBuilders.get(event.getPersister().getEntityName());
if (builder != null) {
add(entity, builder, event.getId());
}
}
public void onPostUpdate(PostUpdateEvent event) {
final Object entity = event.getEntity();
ExtendedDocumentBuilder builder =
documentBuilders.get(event.getPersister().getEntityName());
if (builder != null) {
final Serializable id = event.getId();
remove(builder, id);
add(entity, builder, id);
}
}
public void onPostDelete(PostDeleteEvent event) {
ExtendedDocumentBuilder builder =
documentBuilders.get(event.getPersister().getEntityName());
if (builder != null) {
remove(builder, event.getId());
}
}
private void remove(ExtendedDocumentBuilder builder, Serializable id) {
Term idTerm = builder.getTerm(id);
File file = builder.getFile();
log.info("removing: " + idTerm + ", " + file);
try {
IndexReader reader = null;
SynchDirectory sdir =
DirectoryLocker.getInstance().getDirectory(file);
try {
reader = IndexReader.open(sdir);
reader.delete(idTerm);
} finally {
LuceneUtils.close(reader);
sdir.unlock();
}
} catch (IOException ioe) {
throw new HibernateException(ioe);
}
}
private void add(final Object entity, final ExtendedDocumentBuilder
builder, final Serializable id) {
Document doc = builder.getDocument(entity, id);
File file = builder.getFile();
Analyzer analyzer = builder.getAnalyzer();
log.info("adding: " + doc + ", " + file + ", " +
analyzer.getClass().getName());
try {
IndexWriter writer = null;
SynchDirectory sdir =
DirectoryLocker.getInstance().getDirectory(file);
try {
writer = new IndexWriter(sdir, analyzer, false);
writer.addDocument(doc);
} finally {
LuceneUtils.close(writer);
sdir.unlock();
}
} catch (IOException ioe) {
throw new HibernateException(ioe);
}
}
}
public class SynchDirectory extends Directory {
private static final Log log = LogFactory.getLog(SynchDirectory.class);
private Directory directory;
private String name;
private java.util.concurrent.locks.Lock lock;
public SynchDirectory(Directory directory) {
this.directory = directory;
lock = new ReentrantLock();
}
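void lock(String name) {
log.debug("Locking synch directory: " + name);
lock.lock();
// Record the name only once the lock is held, so a waiting thread
// cannot overwrite the current holder's name.
this.name = name;
}
public void unlock() {
// Log and clear the name while the lock is still held.
log.debug("Unlocking synch directory: " + name);
this.name = null;
lock.unlock();
}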
public String[] list() throws IOException {
return directory.list();
}
public boolean fileExists(String name) throws IOException {
return directory.fileExists(name);
}
public long fileModified(String name) throws IOException {
return directory.fileModified(name);
}
public void touchFile(String name) throws IOException {
directory.touchFile(name);
}
public void deleteFile(String name) throws IOException {
directory.deleteFile(name);
}
public void renameFile(String from, String to) throws IOException {
directory.renameFile(from, to);
}
public long fileLength(String name) throws IOException {
return directory.fileLength(name);
}
public OutputStream createFile(String name) throws IOException {
return directory.createFile(name);
}
public InputStream openFile(String name) throws IOException {
return directory.openFile(name);
}
public Lock makeLock(String name) {
return directory.makeLock(name);
}
public void close() throws IOException {
directory.close();
}
}
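The locking pattern used by the listener (acquire the per-directory lock, do the index work, release in finally) can be sketched without Lucene as follows. The class and method names are hypothetical, and the registry keyed by index path is an assumption about how a DirectoryLocker might hand out one lock per index.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: one ReentrantLock per index path.
class DirectoryLockSketch {
    private static final ConcurrentMap<String, ReentrantLock> LOCKS =
            new ConcurrentHashMap<>();

    // Returns the single lock associated with the given index path.
    static ReentrantLock lockFor(String indexPath) {
        ReentrantLock lock = LOCKS.get(indexPath);
        if (lock == null) {
            ReentrantLock created = new ReentrantLock();
            // putIfAbsent returns the existing lock if another thread won the race.
            lock = LOCKS.putIfAbsent(indexPath, created);
            if (lock == null) {
                lock = created;
            }
        }
        return lock;
    }

    // Runs the work while holding the lock; always releases it afterwards.
    static void withLock(String indexPath, Runnable work) {
        ReentrantLock lock = lockFor(indexPath);
        lock.lock();
        try {
            work.run();
        } finally {
            lock.unlock();
        }
    }
}
```

Wrapping the acquire/release in one helper keeps the unlock in a single finally block instead of repeating it in every add and remove method.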
>> What is additional POJO handling?
I'm using a predefined HQL query to select only the Lucene-indexable
properties and associations, which minimizes what gets loaded when
running a full reindex.
public @interface LuceneReadQuery {
String value();
}
@LuceneReadQuery("select new Article(a.id, a.title, a.body, a.intro, "
+ "a.subtitle, a.source.name, a.category.id, a.publishDate, a.locale) "
+ "from Article a")
I also need different Analyzer instances for different POJOs - some are
localizable, some are plain English text, some are just a bunch of
numbers, ...
public @interface Analyzer {
Class<? extends org.apache.lucene.analysis.Analyzer> analyzerClass();
}
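A minimal sketch of how such a class-level annotation might be resolved at runtime is shown below. It uses stand-in types instead of the Lucene Analyzer classes, and all names are hypothetical. One thing that is certain: the annotation must be declared with RetentionPolicy.RUNTIME, otherwise getAnnotation returns null.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical sketch with stand-in types; not the Lucene Analyzer API.
class AnalyzerLookupSketch {
    interface TextAnalyzer {}
    static class DefaultAnalyzer implements TextAnalyzer {}
    static class NumericAnalyzer implements TextAnalyzer {}

    // RUNTIME retention is required for reflective lookup.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface AnalyzedWith {
        Class<? extends TextAnalyzer> analyzerClass();
    }

    @AnalyzedWith(analyzerClass = NumericAnalyzer.class)
    static class Invoice {}

    static class Article {}  // no annotation: falls back to the default

    // Instantiates the analyzer named on the entity class, or the default.
    static TextAnalyzer analyzerFor(Class<?> entityClass) {
        AnalyzedWith a = entityClass.getAnnotation(AnalyzedWith.class);
        Class<? extends TextAnalyzer> type = DefaultAnalyzer.class;
        if (a != null) {
            type = a.analyzerClass();
        }
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create " + type.getName(), e);
        }
    }
}
```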
I had some issues when different POJOs were indexed in different files
- I needed them to be in the same one. OK, you can do this by setting
the index attribute.
But I also needed to handle identity - so that, based on a given result, I
could pull the right POJO out of the Session.
doc.add(Field.Keyword(ENTITY_FIELD_NAME, entityName));
doc.add(Field.Keyword(IDENTITY_FIELD_NAME,
createIdentityString(id)));
private String createIdentityString(Serializable id) {
return (entityName + "#" + id);
}
//possible identifying fields
String entityName =
doc.get(ExtendedDocumentBuilder.ENTITY_FIELD_NAME);
if (entityName != null) {
// todo - currently expecting only integer ids
Object entity = session.get(
entityName,
Integer.parseInt(doc.get(ExtendedDocumentBuilder.ID_FIELD_NAME))
);
}
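The identity string built by createIdentityString could also be split back into its two parts when reading a hit. The sketch below shows one way to do that. The class and method names are hypothetical, and it assumes integer ids, in line with the todo above.

```java
import java.io.Serializable;

// Hypothetical sketch: round-trips an "entityName#id" identity string.
class IdentityStringSketch {

    static String create(String entityName, Serializable id) {
        return entityName + "#" + id;
    }

    // Everything before the last '#' is treated as the entity name.
    static String entityName(String identity) {
        return identity.substring(0, separatorIndex(identity));
    }

    // Everything after the last '#' is parsed as an integer id (assumption).
    static int id(String identity) {
        return Integer.parseInt(identity.substring(separatorIndex(identity) + 1));
    }

    private static int separatorIndex(String identity) {
        int idx = identity.lastIndexOf('#');
        if (idx < 0) {
            throw new IllegalArgumentException("Not an identity string: " + identity);
        }
        return idx;
    }
}
```

Splitting on the last '#' keeps the parse unambiguous even if an entity name itself contained that character.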
>> PS I remember an old JIRA issue of you talking about a filter
capability. I think it's a nice idea, would probably make sense after
the core work is stable.
Yep.
I have a @Permission annotation, used in the following way:
Permission p = currClass.getAnnotation(Permission.class);
if (p != null) {
permissions.add(p.permission());
}
Permissions ps = currClass.getAnnotation(Permissions.class);
if (ps != null) {
permissions.addAll(Arrays.asList(ps.permission()));
}
if (!permissions.isEmpty()) {
doc.add(Field.UnStored(
PERMISSION_FIELD_NAME,
StringHelper.join(",", permissions.iterator()))
);
}
This filter then takes care of applying the permissions to a Lucene search.
public class SinglePermissionFilter extends Filter {
private String permission;
public SinglePermissionFilter() {
}
public SinglePermissionFilter(String permission) {
this.permission = permission;
}
public BitSet bits(IndexReader reader) throws IOException {
BitSet bits = new BitSet(reader.maxDoc());
String fieldName = ExtendedDocumentBuilder.PERMISSION_FIELD_NAME;
TermDocs td = reader.termDocs(new Term(fieldName, permission));
while(td.next()) {
bits.set(td.doc());
}
return bits;
}
public String getPermission() {
return permission;
}
public void setPermission(String permission) {
this.permission = permission;
}
}
Executing search with permission filter:
Hits hits = searcher.search(sqh.getQuery(), permissionFilter);
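To show what the filter's BitSet represents, here is a Lucene-free sketch that marks the documents whose stored permission string contains a given permission. The class name and the in-memory list are hypothetical. It assumes each comma-separated permission ends up as its own term, which in the real index depends on the analyzer used for the permission field.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch: emulates the filter's bit set over an in-memory list.
class PermissionBitsSketch {

    // docPermissions.get(n) is the comma-joined permission string of document n.
    static BitSet bits(List<String> docPermissions, String permission) {
        BitSet bits = new BitSet(docPermissions.size());
        for (int doc = 0; doc < docPermissions.size(); doc++) {
            List<String> tokens = Arrays.asList(docPermissions.get(doc).split(","));
            if (tokens.contains(permission)) {
                bits.set(doc);  // document is visible for this permission
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("admin,editor", "viewer", "editor");
        System.out.println(bits(docs, "editor"));
    }
}
```

For the three sample documents, only documents 0 and 2 are set for the "editor" permission, so a search restricted by that bit set would never return document 1.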
OK, if I remember any other issues that I had, I'll email them. :-)
Rgds, Ales
Emmanuel Bernard wrote:
> Hi Ales
>
> Yes let's move that to the mailing list
>
> What is DynamicBoost?
> I've added support for @Boost on both attributes and entity
> I have also introduced a Bridge notion that does the translation between
> the property and the Lucene field. I'm almost done with that one. This
> is very much like the Hibernate Type in its flexibility, and I'll add
> some specific annotation support for numeric padding and date resolution.
>
> I fixed the concurrency issues in the Lucene event a while back using
> a reentrant lock per DirectoryProvider. Did you check the code in
> http://anonsvn.jboss.org/repos/hibernate/branches/Lucene_Integration/
> I'm interested in seeing some additional issue if any.
>
> What is additional POJO handling?
>
> PS I remember an old JIRA issue of you talking about a filter
> capability. I think it's a nice idea, would probably make sense after
> the core work is stable.
>
> Ales Justin wrote:
>> Hey,
>>
>> I extended the initial usage quite a lot - needed some additional stuff.
>> Some additional annotations - Boost, DynamicBoost, DateField, ...
>> Rewritten current LuceneListener - had problems with concurrency,
>> additional pojo handling, ...
>>
>> I can sum up some stuff (non customized stuff) and send it - if
>> you're interested.
>> If so - to you or to the mailing list?
>>
>> Rgds, Ales
>>
>> ps: coming to JBW Berlin?
>