SealedQueryBuilder qb = searchFactory.withEntityAnalyzer(Address.class);

Query luceneQuery =
    qb.must(Occurs.MUST)
        .add(
            // named bool because boolean is a reserved word in Java
            qb.bool(Occurs.SHOULD)
                .add( qb.term("city", "Atlanta").boostedTo(4).createQuery() )
                .add( qb.term("address1", "Peachtree").fuzzy().threshold(.7).createQuery() )
        )
        .add(
            qb.from("movingDate", "200604").to("201201").exclusive().createQuery()
        )
        .createQuery();

The same query written against the plain Lucene API:

BooleanQuery luceneQuery = new BooleanQuery();

// city OR address1: at least one of the two should match
BooleanQuery addressLocationQuery = new BooleanQuery();
Query city = new TermQuery( new Term("city", "Atlanta") );
city.setBoost(4f);
addressLocationQuery.add(city, BooleanClause.Occur.SHOULD);
Query address1 = new FuzzyQuery( new Term("address1", "Peachtree"), .7f );
addressLocationQuery.add(address1, BooleanClause.Occur.SHOULD);
luceneQuery.add(addressLocationQuery, BooleanClause.Occur.MUST);

// movingDate strictly between 200604 and 201201 (inclusive = false)
Query range = new RangeQuery( new Term("movingDate", "200604"), new Term("movingDate", "201201"), false );
luceneQuery.add(range, BooleanClause.Occur.MUST);
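The range clause works because the terms are compared as strings, which matches chronological order only if movingDate is indexed as a fixed-width yyyyMM value (an assumption here; the example does not say how the field is indexed). A minimal sketch of that exclusive comparison in plain Java, using a hypothetical helper `inExclusiveRange` that belongs neither to Lucene nor to the proposed API:

```java
final class ExclusiveRangeSketch {

    // Hypothetical helper: true when term lies strictly between lower and upper,
    // compared lexicographically as plain strings.
    static boolean inExclusiveRange(String term, String lower, String upper) {
        return term.compareTo(lower) > 0 && term.compareTo(upper) < 0;
    }

    public static void main(String[] args) {
        System.out.println(inExclusiveRange("200912", "200604", "201201")); // inside the range
        System.out.println(inExclusiveRange("200604", "200604", "201201")); // lower bound is excluded
        System.out.println(inExclusiveRange("201201", "200604", "201201")); // upper bound is excluded
    }
}
```

With a variable-width encoding (for example "20064" for April 2006) the string order would no longer follow the date order, so the same range would return the wrong documents.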

Advantages:

- the query is readable and understandable even to new Lucene users. BTW, the example is quite a simple one: it does not involve filters, searches on multiple fields, query negation, etc.

- I have normalized some operations that require knowledge of the Lucene query hierarchy (e.g. ConstantScoreQuery, ConstantScorePrefixQuery, ConstantScoreRangeQuery, or PrefixQuery vs WildcardQuery)

- the API shows the available options right away through IDE auto-completion; you do not have to study the Query hierarchy and its implementations

- the API takes the analyzer into account, which means that I can use my input as is without thinking much about the underlying analyzer used at indexing time. In the example, my plain Lucene rewrite of the query will very likely fail because the indexed terms are most likely "atlanta" and "peachtree", not "Atlanta" and "Peachtree". In the API, we have the analyzer at hand and can apply it to the input. Likewise for synonyms, phonetic approximation, etc.
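The analyzer point in the last bullet can be sketched without Lucene. The toy code below assumes an analyzer that only splits on whitespace and lowercases; the names `ToyAnalyzerSketch`, `analyze` and `index` are made up for illustration, and real analyzers do much more. It shows why the raw term "Atlanta" misses while the analyzed input matches:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class ToyAnalyzerSketch {

    // Stand-in for a real analyzer: split on whitespace, lowercase each token.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.trim().split("\\s+")) {
            if (!raw.isEmpty()) {
                tokens.add(raw.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Stand-in for the index: the set of terms produced at indexing time.
    static Set<String> index(String text) {
        return new HashSet<String>(analyze(text));
    }

    public static void main(String[] args) {
        Set<String> cityTerms = index("Atlanta");

        // Raw user input used directly as the term, as in the plain Lucene version.
        System.out.println(cityTerms.contains("Atlanta"));

        // Same input run through the analyzer first, as the builder would do.
        System.out.println(cityTerms.contains(analyze("Atlanta").get(0)));
    }
}
```

The first lookup fails and the second succeeds, which is exactly the difference between building a TermQuery from raw input and letting the builder analyze the input first.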

Even worse, searching for a user query made of several words across several fields is quite difficult in plain Lucene. In the new API it could look like:

String search = "harry potter";

SealedQueryBuilder qb = searchFactory.withEntityAnalyzer(Book.class);

Query luceneQuery =
    qb.searchInMultipleFields()
        .onField("title").boostedTo(4)
        .onField("title_ngram")
        .onField("description")
        .onField("description_ngram").boostedTo(.25)
        .forWords(search);

The plain Lucene equivalent:

String search = "harry potter";
Analyzer analyzer = searchFactory.getAnalyzer(Book.class);

Map<String,Float> boostPerField = new HashMap<String,Float>(4); // boost factor per field
boostPerField.put( "title", 4f );
boostPerField.put( "title_ngram", 1f );
boostPerField.put( "description", 1f );
boostPerField.put( "description_ngram", .25f );

BooleanQuery luceneQuery = new BooleanQuery();
for ( Map.Entry<String, Float> entry : boostPerField.entrySet() ) {
    final String fieldName = entry.getKey();
    final Float boost = entry.getValue();

    // run the user input through the analyzer used for this field
    List<String> terms = new ArrayList<String>();
    try {
        Reader reader = new StringReader(search);
        TokenStream stream = analyzer.tokenStream( fieldName, reader );
        Token token = new Token();
        token = stream.next(token);
        while (token != null) {
            if (token.termLength() != 0) {
                String term = new String(token.termBuffer(), 0, token.termLength());
                terms.add( term );
            }
            token = stream.next(token);
        }
    }
    catch ( IOException e ) {
        throw new RuntimeException("IO exception while reading String stream??", e);
    }

    // one boosted term query per analyzed term, for each field
    for (String term : terms) {
        TermQuery termQuery = new TermQuery( new Term( fieldName, term ) );
        termQuery.setBoost( boost );
        luceneQuery.add( termQuery, BooleanClause.Occur.SHOULD );
    }
}
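To give an idea of how a fluent builder can hide that loop, here is a minimal sketch in plain Java. It is not the proposed implementation: the method names mirror the example, but `MultiFieldSketch` is a hypothetical class, the analysis is a toy whitespace split plus lowercasing, and the result is a list of `field:term^boost` strings standing in for Lucene term queries.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class MultiFieldSketch {

    // Insertion-ordered so clauses come out in the order the fields were declared.
    private final Map<String, Float> boostPerField = new LinkedHashMap<String, Float>();
    private String currentField;

    static MultiFieldSketch searchInMultipleFields() {
        return new MultiFieldSketch();
    }

    // Registers a field with a default boost of 1.
    MultiFieldSketch onField(String fieldName) {
        currentField = fieldName;
        boostPerField.put(fieldName, 1f);
        return this;
    }

    // Overrides the boost of the field most recently passed to onField.
    MultiFieldSketch boostedTo(float boost) {
        if (currentField == null) {
            throw new IllegalStateException("call onField before boostedTo");
        }
        boostPerField.put(currentField, boost);
        return this;
    }

    // Toy analysis (whitespace split + lowercase), then one clause per field and term.
    List<String> forWords(String search) {
        List<String> clauses = new ArrayList<String>();
        for (Map.Entry<String, Float> entry : boostPerField.entrySet()) {
            for (String word : search.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    clauses.add(entry.getKey() + ":" + word.toLowerCase(Locale.ROOT) + "^" + entry.getValue());
                }
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        List<String> clauses = searchInMultipleFields()
                .onField("title").boostedTo(4f)
                .onField("description_ngram").boostedTo(.25f)
                .forWords("Harry Potter");
        for (String clause : clauses) {
            System.out.println(clause);
        }
    }
}
```

The caller only names fields, boosts and the search string; tokenizing the input and expanding it into one clause per field and term stays inside `forWords`.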

Did I make my case?