[hibernate-issues] [JIRA] (HSEARCH-3499) Near-zero-downtime mass indexing

Yoann Rodière (JIRA) jira at hibernate.atlassian.net
Thu Jun 18 03:12:53 EDT 2020


Yoann Rodière ( https://hibernate.atlassian.net/secure/ViewProfile.jspa?accountId=557058%3A58fa1ced-171a-4c00-97e8-5d70d442cc4b ) *commented* on HSEARCH-3499 ( https://hibernate.atlassian.net/browse/HSEARCH-3499?atlOrigin=eyJpIjoiZTFiYjg0NDY1NDg5NGI1NjhhYmRjODQ0YWVhZjhkNzkiLCJwIjoiaiJ9 )

Re: Near-zero-downtime mass indexing ( https://hibernate.atlassian.net/browse/HSEARCH-3499?atlOrigin=eyJpIjoiZTFiYjg0NDY1NDg5NGI1NjhhYmRjODQ0YWVhZjhkNzkiLCJwIjoiaiJ9 )

> Is there a way to access the (expected) mapping of a given index?

At the moment, no, there isn't a way to access it programmatically. The best you can do is to run the schema creation in a development environment and get the resulting mapping from Elasticsearch.

> But when creating the new index (by calling the ES API directly) I don’t
> see how to post the index mappings along with the new index.

I think you mean that you don't see how to determine the mapping that the new index must have? But just in case, here ( https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html#mappings ) is how to post the mapping while creating the index.
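As a minimal sketch of posting the mapping together with the new index, the code below builds (but does not send) a "create index" request, assuming a typeless mapping as used by Elasticsearch 7. The class name, the base URL and the index name are hypothetical; the mappings JSON would be whatever you retrieved from your development environment.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds, but does not send, an Elasticsearch "create index" request.
final class CreateIndexRequestSketch {

    // Wraps a mappings JSON object in the body expected by "PUT /<index>".
    static String createIndexBody(String mappingsJson) {
        return "{\"mappings\":" + mappingsJson + "}";
    }

    // baseUrl and indexName are assumptions, e.g. "http://localhost:9200" and "myindex-000002".
    static HttpRequest createIndexRequest(String baseUrl, String indexName, String mappingsJson) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + indexName))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(createIndexBody(mappingsJson)))
                .build();
    }
}
```

The request could then be sent with `java.net.http.HttpClient`, for example `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.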

> Am I taking the wrong approach or just missing how to access the updated
> IndexMetadata?

To be honest, there is something wrong with the approach: zero-downtime reindexing as described in the documentation will only work if your mapping did *not* change. In that case, you can use the rollover API ( https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-rollover-index.html ) to create the new index without knowing anything about the mapping.
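For the unchanged-mapping case, a rollover request could be formed roughly as follows. This is a sketch that builds the request without sending it; the class name and base URL are hypothetical, and the write alias is assumed to be something like `myindex-write`. Please check the linked rollover documentation for the exact behavior on your Elasticsearch version.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Sketch only: builds, but does not send, an Elasticsearch rollover request.
final class RolloverRequestSketch {

    // "POST /<writeAlias>/_rollover" with an empty body, i.e. no rollover conditions.
    // baseUrl and writeAlias are assumptions, e.g. "http://localhost:9200" and "myindex-write".
    static HttpRequest rolloverRequest(String baseUrl, String writeAlias) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + writeAlias + "/_rollover"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```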

This ticket, and the methodology described in the documentation, will work for periodic reindexing, but are definitely not enough for application updates. I suggest you have a look at HSEARCH-2861 ( https://hibernate.atlassian.net/browse/HSEARCH-2861 ) and its comments, where we discussed a few of the problems involved in zero-downtime application updates. The main problem is that old-gen instances of your application should not be allowed to write anything after you have created the new index, because they will generate wrong or incomplete documents. Conversely, new instances of your application may not be able to read from the old index, since they have a different metamodel and may assume that fields are present when they are not, or that fields have a different type than they actually have in the old index.

So, the only "generic", guaranteed-to-work solution here would be to completely separate your applications:

* Make old-gen applications read-only, and I mean *really* read-only. They must not write *anything*, or your indexes will become out-of-sync at best, and will contain invalid data at worst.
* Make new-gen applications write-only. They must not handle search requests, because they don't understand the old mapping anymore.

As to how you will do that... I'll leave the routing of user requests to you, because Hibernate Search simply can't handle that: by the time it becomes involved in the read or write process, it's already too late. I'll just point out that if you go to such lengths, you may as well assign a different name to your index in the new version of your application ( @Indexed(index = "myindex-v2") ), so that you will have completely separate aliases and indexes for your two applications. You may be able to take advantage of a custom layout ( https://docs.jboss.org/hibernate/search/6.0/reference/en-US/html_single/#backend-elasticsearch-indexlayout ) strategy that automatically appends the version of your application to your index name and aliases ( myindex-appV2-000001 , myindex-appV2-read , myindex-appV2-write ), but that's about as far as Hibernate Search can help you.
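The naming logic of such a layout strategy could look roughly like the sketch below. It is deliberately dependency-free: in a real application the class would implement the `IndexLayoutStrategy` interface of the Elasticsearch backend, and the method names here are meant to mirror that interface but should be checked against the linked documentation. The class name and the way the application version is supplied are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Dependency-free sketch of version-aware index naming.
// Assumption: method names mirror Hibernate Search's IndexLayoutStrategy; verify against the docs.
final class AppVersionLayoutSketch {

    // Hypothetical: application version tag, e.g. "appV2", injected at build time.
    private final String appVersion;
    // Matches names such as "myindex-appV2-000001" and captures "myindex".
    private final Pattern uniqueKeyPattern;

    AppVersionLayoutSketch(String appVersion) {
        this.appVersion = appVersion;
        this.uniqueKeyPattern = Pattern.compile("(.*)-" + Pattern.quote(appVersion) + "-\\d{6}");
    }

    String createInitialElasticsearchIndexName(String hibernateSearchIndexName) {
        return hibernateSearchIndexName + "-" + appVersion + "-000001";
    }

    String createWriteAlias(String hibernateSearchIndexName) {
        return hibernateSearchIndexName + "-" + appVersion + "-write";
    }

    String createReadAlias(String hibernateSearchIndexName) {
        return hibernateSearchIndexName + "-" + appVersion + "-read";
    }

    String extractUniqueKeyFromHibernateSearchIndexName(String hibernateSearchIndexName) {
        return hibernateSearchIndexName;
    }

    String extractUniqueKeyFromElasticsearchIndexName(String elasticsearchIndexName) {
        Matcher matcher = uniqueKeyPattern.matcher(elasticsearchIndexName);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Unrecognized index name: " + elasticsearchIndexName);
        }
        return matcher.group(1);
    }
}
```

With `new AppVersionLayoutSketch("appV2")` and a Hibernate Search index named `myindex`, the methods produce the names mentioned above: `myindex-appV2-000001`, `myindex-appV2-read` and `myindex-appV2-write`.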
