some simple questions for Search dev
by Sanne Grinovero
Hello,
I need some suggestions for naming new parameters in H.Search.
I'm implementing a parameter for FSMasterDirectoryProvider and
FSSlaveDirectoryProvider to select an appropriate "chunk size":
the number of bytes java.nio attempts to transfer at once
(we have to limit it, as huge files blow it up: 2GB on Linux,
100MB on 32-bit Windows, don't know on 64-bit Windows).
I defaulted to a very conservative 16MB (ok?)
and the patch is really trivial, but:
A) I need a good and intuitive name for the parameter:
I have "copyChunkSizeMB" currently in my code,
I don't like it but couldn't think of a better alternative.
The result would be "hibernate.search.[indexname].copyChunkSizeMB"...
something better that gives the idea?
I thought of "blockSize" but that isn't really a block size, but gives the idea
as people could be more familiar with it. "copyBlockSizeMB"?
I'll have to document the chosen name, so I think it won't
be easy to change it later.
B) Units for the previous param?
Choosing bytes is not very practical: the end user would
certainly need to open a calculator.
MB is probably the right choice, but it means people can't select chunks below 1 MB,
which would be silly anyway IMHO.
C) The "smart" parameter: the current FileHelper has a really
cool feature called "smart", which is currently always enabled;
it does something similar to rsync, checking timestamps & file
sizes to skip copying some files.
There's a TODO in both the Master & Slave DirectoryProviders to make it
a configurable parameter (which means having the option
to disable it).
I just need you to confirm the TODO is still valid.
Name? "enableSmartCopy"? (default true)
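
To make points A) to C) concrete, here is a minimal sketch of a chunked copy with an optional "smart" skip. It is not the actual FileHelper code: the class name ChunkedCopy, the method signature, and the millisecond-level timestamp comparison are assumptions made for illustration. Only the copyChunkSizeMB parameter and the size/timestamp idea come from the discussion above.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not the real Hibernate Search FileHelper.
final class ChunkedCopy {

    /**
     * Copies source to target in chunks of at most copyChunkSizeMB megabytes.
     * Returns true if bytes were copied, false if the "smart" check skipped the file.
     */
    static boolean copy(Path source, Path target, long copyChunkSizeMB, boolean smart)
            throws IOException {
        if (copyChunkSizeMB <= 0) {
            throw new IllegalArgumentException("copyChunkSizeMB must be positive");
        }
        // "Smart" mode (assumption): skip when size and timestamp already match.
        if (smart && Files.exists(target)
                && Files.size(source) == Files.size(target)
                && Files.getLastModifiedTime(source).toMillis()
                        == Files.getLastModifiedTime(target).toMillis()) {
            return false;
        }
        long chunkBytes = copyChunkSizeMB * 1024L * 1024L;
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                // Never ask java.nio for more than one chunk at a time.
                long count = Math.min(chunkBytes, size - position);
                long moved = in.transferTo(position, count, out);
                if (moved <= 0) {
                    throw new IOException("transfer stalled at byte " + position);
                }
                position += moved;
            }
        }
        // Align timestamps so a later "smart" run can recognise the file as unchanged.
        Files.setLastModifiedTime(target, Files.getLastModifiedTime(source));
        return true;
    }
}
```

Taking the size in MB keeps the configuration value readable, at the cost of not allowing chunks smaller than 1 MB, as noted in B).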
thanks,
Sanne
16 years, 8 months
Problem setting the package name using Hibernate Tools.
by Sri Gowri
Hi,
I am experiencing a problem with the package name declaration during pojo and mapping generation with Hibernate Tools without using Ant.
hibernate.cfg.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-configuration PUBLIC "-//Hibernate/Hibernate
Configuration DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
<session-factory name="session1">
<property name="hibernate.dialect">org.hibernate.dialect.DerbyDialect</property>
<property name="hibernate.connection.driver_class">org.apache.derby.jdbc.ClientDriver</property>
<property name="hibernate.connection.url">jdbc:derby://localhost:1527/travel</property>
<property name="hibernate.connection.username">travel</property>
<property name="hibernate.connection.password">travel</property>
</session-factory>
</hibernate-configuration>
hibernate.reveng.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-reverse-engineering PUBLIC
"-//Hibernate/Hibernate Reverse Engineering DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-reverse-engineering-3.0.dtd">
<hibernate-reverse-engineering>
<table-filter exclude="false" match-catalog=".*" match-name="FLIGHT" match-schema=".*"/>
<table-filter exclude="false" match-catalog=".*" match-name="PERSON" match-schema=".*"/>
<table-filter exclude="false" match-catalog=".*" match-name="TRIP" match-schema=".*"/>
<table-filter exclude="false" match-catalog=".*" match-name="TRIPTYPE" match-schema=".*"/>
</hibernate-reverse-engineering>
Code that generates pojos and mapping files:
try {
cfg = new JDBCMetaDataConfiguration();
OverrideRepository or = new OverrideRepository();
InputStream xmlInputStream = new FileInputStream(FileUtil.toFile(revengFile));
xmlHelper = new XMLHelper();
entityResolver = XMLHelper.DEFAULT_DTD_RESOLVER;
List errors = new ArrayList();
SAXReader saxReader = xmlHelper.createSAXReader("XML InputStream", errors, entityResolver);
org.dom4j.Document doc = saxReader.read(new InputSource(xmlInputStream));
Configuration c = cfg.configure(confFile);
cfg.setReverseEngineeringStrategy(or.getReverseEngineeringStrategy(new DefaultReverseEngineeringStrategy()));
cfg.readFromJDBC();
} catch (Exception e) {
Exceptions.printStackTrace(e);
}
// Generating POJOs
FileObject pkg;
try {
pkg = SourceGroups.getFolderForPackage(helper.getLocation(), helper.getPackageName());
File outputDir = FileUtil.toFile(pkg);
POJOExporter exporter = new POJOExporter(cfg, outputDir);
exporter.getProperties().setProperty("jdk", new Boolean(helper.getJavaSyntax()).toString());
exporter.getProperties().setProperty("ejb3", new Boolean(helper.getEjbAnnotation()).toString());
exporter.start();
} catch (IOException ex) {
Exceptions.printStackTrace(ex);
}
// Generate Mappings
try {
pkg = SourceGroups.getFolderForPackage(helper.getLocation(), helper.getPackageName());
File outputDir = FileUtil.toFile(pkg);
HibernateMappingExporter exporter = new HibernateMappingExporter(cfg, outputDir);
exporter.start();
} catch (Exception e) {
Exceptions.printStackTrace(e);
}
In the code, outputDir is
C:\Documents and Settings\gowri\MyDocuments\NetBeansProjects\WebApplication57\src\java\Travel.
But the generated POJO and mapping files don't declare Travel as their package:
// default package
// Generated May 27, 2008 12:45:56 AM by Hibernate Tools 3.2.1.GA
import java.util.Date;
/**
* Person generated by hbm2java
*/
public class Person implements java.io.Serializable {
private int personid;
private String name;
private String jobtitle;
private Short frequentflyer;
private Date lastupdated;
public Person() {
}
public Person(int personid) {
this.personid = personid;
}
public Person(int personid, String name, String jobtitle, Short frequentflyer, Date lastupdated) {
this.personid = personid;
this.name = name;
this.jobtitle = jobtitle;
this.frequentflyer = frequentflyer;
this.lastupdated = lastupdated;
}
<?xml version="1.0"?>
<!DOCTYPE hibernate-mapping PUBLIC
"-//Hibernate/Hibernate Mapping DTD 3.0//EN"
"http://hibernate.sourceforge.net/hibernate-mapping-3.0.dtd">
<!-- Generated May 27, 2008 12:45:57 AM by Hibernate Tools 3.2.1.GA -->
<hibernate-mapping>
<class name="Person" table="PERSON" schema="TRAVEL">
<id name="personid" type="int">
The class name should be Travel.Person.
Why are the tools not setting the package correctly?
Secondly, even though I have defined the HibernateReverseEngineeringStrategy, why are the tools generating POJOs and mapping files for all the tables in the database? (Note: I have listed only selected tables in hibernate.reveng.xml.)
16 years, 8 months
Re: Changes to Hibernate classes
by Manik Surtani
cc'ing hibernate-dev.
On 2 Jun 2008, at 23:57, Emmanuel Bernard wrote:
> Those discussions should really go to hibernate-dev(a)lists.jboss.org.
> No need to hide that :)
>
> On Jun 2, 2008, at 12:46, Navin Surtani wrote:
>
>> A couple more things have cropped up: -
>>
>> * Where do you create and register FullTextIndexEventListeners?
>
> It's either done explicitly by the user, or done by Hibernate
> Annotations
> http://www.hibernate.org/hib_docs/search/reference/en/html_single/#d0e880
>
>>
>> * How can the SearchFactoryImpl class /lazily /update the document
>> builder map in that same class? Would you suggest a subclass ?? In
>> JBC I don't know before-hand as to what types are going to be
>> indexed, so this will have to be done lazily.
>
> This is quite problematic as a lot of the concurrency scheme supposes
> we build all the metadata upfront in a concurrent-safe way.
> I think we discussed that earlier and we agreed to pass a list of
> expected classes up front for now.
Yes, of course. We could use a classpath scanner, to scan for
annotations, but that can be problematic/unreliable/slow? I guess
explicit declaration is the way to go then.
Regarding how this is bootstrapped, I was thinking about publishing
this as a separate "edition" of JBoss Cache. This project, let's call
it jbosscache-searchable for now, would have a dependency on
jbosscache-core and hibernate-search. The central class could be a
org.jboss.cache.search.SearchableCache interface, which is a sub-
interface of org.jboss.cache.Cache, adding a single method:
List<Object> find(Query q);
One would create a SearchableCache using a SearchableCacheFactory,
which would expose a single method:
SearchableCache createSearchableCache(Cache underlyingCache, Class...
typesToIndex);
This method could then initialize and set up Hibernate Search
internals including document builder maps, etc., attach cache
listeners, and provide a proxy which proxies all normal cache methods
to the underlying cache, and handles the find() method the way
FullTextSession handles it.
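
As a rough illustration of the proxy idea, here is a self-contained sketch. Everything except the names SearchableCache, SearchableCacheFactory, createSearchableCache and find is an assumption: Cache and Query below are tiny stand-ins for org.jboss.cache.Cache and a Lucene query, MapCache is a toy backing store, and the "index" is just an in-memory list filled on put() rather than by cache listeners and Hibernate Search.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Stand-in for org.jboss.cache.Cache (hypothetical, heavily reduced).
interface Cache {
    void put(String fqn, String key, Object value);
    Object get(String fqn, String key);
}

// Stand-in for a Lucene Query (hypothetical): a plain predicate on values.
interface Query {
    boolean matches(Object candidate);
}

interface SearchableCache extends Cache {
    List<Object> find(Query q);
}

// Toy underlying cache used only to make the sketch runnable.
final class MapCache implements Cache {
    private final Map<String, Object> data = new HashMap<>();
    public void put(String fqn, String key, Object value) { data.put(fqn + "#" + key, value); }
    public Object get(String fqn, String key) { return data.get(fqn + "#" + key); }
}

final class SearchableCacheFactory {
    static SearchableCache createSearchableCache(Cache underlyingCache, Class<?>... typesToIndex) {
        Set<Class<?>> indexedTypes = new HashSet<>(Arrays.asList(typesToIndex));
        List<Object> index = new ArrayList<>(); // toy replacement for the Lucene index

        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("find")) {
                // find() is answered by the search side, not by the cache.
                Query q = (Query) args[0];
                List<Object> hits = new ArrayList<>();
                synchronized (index) {
                    for (Object o : index) {
                        if (q.matches(o)) hits.add(o);
                    }
                }
                return hits;
            }
            if (method.getName().equals("put") && args[2] != null
                    && indexedTypes.contains(args[2].getClass())) {
                // Only types declared up front are indexed.
                synchronized (index) { index.add(args[2]); }
            }
            try {
                // Every normal cache method is delegated to the underlying cache.
                return method.invoke(underlyingCache, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        };
        return (SearchableCache) Proxy.newProxyInstance(
                SearchableCache.class.getClassLoader(),
                new Class<?>[] { SearchableCache.class },
                handler);
    }
}
```

Passing typesToIndex as varargs matches the agreement to declare the indexed classes explicitly up front instead of discovering them lazily.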
WDYT?
>> * How can you pass a documentId directly to a Worker? The types
>> being indexed will not have the documentId annotation - this will
>> have to be generated based on Fqn and key. What type does this
>> have to be - can it be an Object?
>
> It can be an object provided that the final id (the FQN + key) can
> be transformed into a unique String.
> The String can then be passed to Work when building it.
Yes, this can be done. We could create a wrapper class that is able
to convert between the two representations.
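
A minimal sketch of such a wrapper, assuming both the Fqn and the key can be represented as Strings (the class name CacheEntityId and the length-prefix encoding are made up for illustration). The length prefix keeps the resulting String unique even when the Fqn and key contain the same separator characters.

```java
// Hypothetical wrapper converting between (Fqn, key) and a unique document id String.
final class CacheEntityId {
    private final String fqn;
    private final String key;

    CacheEntityId(String fqn, String key) {
        if (fqn == null || key == null) {
            throw new IllegalArgumentException("fqn and key must not be null");
        }
        this.fqn = fqn;
        this.key = key;
    }

    String getFqn() { return fqn; }

    String getKey() { return key; }

    /** Encodes as "<length of fqn>:<fqn><key>", which is unambiguous to parse back. */
    String toDocumentId() {
        return fqn.length() + ":" + fqn + key;
    }

    /** Reverses toDocumentId(). */
    static CacheEntityId fromDocumentId(String documentId) {
        int colon = documentId.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("missing length prefix: " + documentId);
        }
        int fqnLength = Integer.parseInt(documentId.substring(0, colon));
        int fqnEnd = colon + 1 + fqnLength;
        if (fqnLength < 0 || fqnEnd > documentId.length()) {
            throw new IllegalArgumentException("invalid length prefix: " + documentId);
        }
        return new CacheEntityId(documentId.substring(colon + 1, fqnEnd),
                documentId.substring(fqnEnd));
    }
}
```

Plain concatenation would map ("/a/b", "c") and ("/a", "/bc") to the same String; with the prefix they stay distinct.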
> To get DocumentBuilder work properly, we will need to adjust it to
> pickup a class level @ProvidedId (more likely a different annotation).
>
> @Indexed
> @ProvidedId(name="id", //field name
> bridge= @FieldBridge(...),
> boost=@Boost() )
> public class MyJBCCachedObject {
> }
>
> @ProvidedId means that we won't be able to use @ContainedIn probably
> (as we don't have a getter). We can think about that later.
>
I'm guessing this is a marker to inform the DocumentBuilder not to
expect a @DocumentId field, and with information on how to fetch a
document id to be used in its place?
Cheers,
--
Manik Surtani
Lead, JBoss Cache
manik(a)jboss.org
16 years, 8 months
Search: some changes to new perf tests
by Sanne Grinovero
Hi Emmanuel,
I've made some changes to your recent new Search performance test.
I first introduced a common "start signal" for all threads to ensure I
was testing
the concurrency, but when I did, and after you fixed the latest bug (HSEARCH-204),
it appeared that Search was performing approximately twice as fast as
raw Lucene, quite a suspect behaviour ;-)
So I found that the Lucene test was iterating
on results from 0 to <=100, not 0 to <100, so actually fetching more
than 100 results (101). As Lucene collects the data in its Hits
in batches of 100 elements, this introduced a x2 overhead for the
pure Lucene tests.
(we could exploit this for .setMaxResults() )
Besides that I introduced some other minor changes:
A) Synchronization to read the timings (it doesn't interfere with the timings).
B) Use of an Executor
C) A start signal (a simple CountDownLatch)
D) I had to disable in-thread logging and enable
multiple iterations per thread, as it completed too fast
to provide a reliable reading of the numbers;
It's still quite variable (gc?), but gives an idea to compare
Lucene to Search.
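
For readers unfamiliar with the pattern, points B) to D) can be sketched roughly as follows. This is not the committed test code: the class and method names are invented, and the Runnable stands in for whatever query the real test executes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical harness illustrating the start-signal idea.
final class StartSignalBenchmark {

    /** Runs the task on all threads at once and returns the summed time in nanoseconds. */
    static long runConcurrently(int threads, int iterationsPerThread, Runnable task)
            throws InterruptedException {
        // One pool thread per worker, so every worker can wait on the latch simultaneously.
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        CountDownLatch startSignal = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        AtomicLong totalNanos = new AtomicLong();

        for (int t = 0; t < threads; t++) {
            executor.execute(() -> {
                try {
                    startSignal.await(); // block until all workers are released together
                    long begin = System.nanoTime();
                    // Several iterations per thread, since one run is too short to measure.
                    for (int i = 0; i < iterationsPerThread; i++) {
                        task.run();
                    }
                    // Thread-safe accumulation, done outside the timed section.
                    totalNanos.addAndGet(System.nanoTime() - begin);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        startSignal.countDown(); // the common start signal
        done.await();
        executor.shutdown();
        return totalNanos.get();
    }
}
```

The same harness can be called once with the raw Lucene task and once with the Search task, so that both are measured under identical concurrency.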
approval to commit?
Sanne
16 years, 8 months