It is not a B-tree index but what is called an inverted index. The keys in the index are, roughly, the words in a text.
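To make the idea concrete, here is a minimal sketch of an inverted index in plain Java. The class name `ToyInvertedIndex` and its methods are made up for illustration, and this is not how Lucene actually stores its data; it only shows the "word to documents" mapping.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each word to the ids of the documents containing it.
// Illustration only; this is not Lucene's on-disk format or API.
class ToyInvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Split the text into lowercase words and record docId under each word.
    void add(int docId, String text) {
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // A lookup is a single map access on the word, not a scan of every document.
    Set<Integer> search(String word) {
        return postings.getOrDefault(word.toLowerCase(), Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(1, "Hibernate Search in Action");
        index.add(2, "Lucene in Action");
        System.out.println(index.search("action"));    // [1, 2]
        System.out.println(index.search("hibernate")); // [1]
    }
}
```

The point is that the word is the key, so finding every document that contains a word does not require reading the documents themselves.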
Here is how it works: every time Hibernate Core creates, updates or deletes an entity, Hibernate Search knows about it (by listening to the event system) and transparently updates the Lucene index (or indexes) for you. So at any given time the index is up to date with the database. Lucene indexes are generally stored in a file system directory and hence persist across application restarts. There are other index storage strategies, but the file system is the most common one.
To index the data already in your database when you first add Hibernate Search to your application, Hibernate Search provides a manual API to index or purge your index (i.e. without waiting for Hibernate Core change events).
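The two paths described above (automatic updates driven by entity events, plus a manual index/purge for data that is already there) can be sketched with a toy model. Everything in it is hypothetical: `ToyIndexSync`, `loadLegacyRow`, `reindexAll` and the other names are not the Hibernate Search API, and in-memory maps stand in for both the database and the Lucene index.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of the two indexing paths. All names are hypothetical;
// this is not the Hibernate Search API.
class ToyIndexSync {
    private final Map<Integer, String> database = new HashMap<>();   // id -> text
    private final Map<String, Set<Integer>> index = new HashMap<>(); // word -> ids

    // A row that existed before search was added: stored, but never indexed.
    void loadLegacyRow(int id, String text) {
        database.put(id, text);
    }

    // Automatic path: every write also fires an "event" that updates the index.
    void save(int id, String text) {
        database.put(id, text);
        onEntityChanged(id, text);
    }

    void delete(int id) {
        database.remove(id);
        onEntityChanged(id, null);
    }

    // Listener: drop the old entries for this id, then index the new text if any.
    private void onEntityChanged(int id, String text) {
        for (Set<Integer> ids : index.values()) {
            ids.remove(id);
        }
        if (text != null) {
            indexText(id, text);
        }
    }

    private void indexText(int id, String text) {
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                index.computeIfAbsent(word, k -> new TreeSet<>()).add(id);
            }
        }
    }

    // Manual path: purge everything, then rebuild from the rows already stored.
    void purgeAll() {
        index.clear();
    }

    void reindexAll() {
        purgeAll();
        database.forEach(this::indexText);
    }

    boolean matches(String word, int id) {
        Set<Integer> ids = index.get(word.toLowerCase());
        return ids != null && ids.contains(id);
    }
}
```

In this sketch a row added with `loadLegacyRow` is invisible to `matches` until `reindexAll` runs, while anything written through `save` or `delete` is reflected immediately.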
posted 9 years ago
Thanks Emmanuel, I got the idea of an inverted index!
But doesn't updating the index via those listeners each and every time slow down the search? I would prefer a search engine that is very quick and efficient over one that is extremely precise in the result list it returns for iteration.
Is there also a batch-based update mechanism where, say, we can sync up the index (the file system directory) every night from 3 AM to 4 AM, when the server is in zombie mode with hardly any hits or activity?
posted 9 years ago
In most cases, people don't see performance degradation. This is particularly true if you use the JMS cluster mode. In this approach, the indexing is done on a master without affecting your slaves. Slaves answer queries very fast thanks to their local copy of the index, and they push changes to the master. The master does the indexing and, on a regular (configurable) basis, the new index is pushed to the slaves.
I describe that in chapters 5 and 10.
But be sure not to presuppose that you need optimization without really trying first. Premature optimization is the root of all evil.
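For readers who want to picture the master/slave flow described above, here is a rough sketch. It makes several assumptions: `ToyMaster` and `ToySlave` are hypothetical names, an in-memory queue stands in for JMS, and a set of words stands in for the Lucene index. It is not the actual Hibernate Search clustering code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.TreeSet;

// Toy master/slave flow. Hypothetical names; an in-memory queue stands in for JMS.
class ToyMaster {
    private final Queue<String> pending = new ArrayDeque<>(); // changes sent by slaves
    private final Set<String> index = new TreeSet<>();        // the master's index
    private final List<ToySlave> slaves = new ArrayList<>();

    ToySlave newSlave() {
        ToySlave slave = new ToySlave(this);
        slaves.add(slave);
        return slave;
    }

    void enqueue(String word) {
        pending.add(word);
    }

    // Meant to run on a schedule: apply queued changes, then copy the index to each slave.
    void indexAndPublish() {
        while (!pending.isEmpty()) {
            index.add(pending.poll());
        }
        for (ToySlave slave : slaves) {
            slave.replaceLocalCopy(new TreeSet<>(index));
        }
    }
}

class ToySlave {
    private final ToyMaster master;
    private Set<String> localCopy = new TreeSet<>();

    ToySlave(ToyMaster master) {
        this.master = master;
    }

    // Writes are only forwarded; the slave does no indexing work itself.
    void recordChange(String word) {
        master.enqueue(word);
    }

    // Queries read the local copy, so they never wait on the master.
    boolean contains(String word) {
        return localCopy.contains(word);
    }

    void replaceLocalCopy(Set<String> copy) {
        localCopy = copy;
    }
}
```

The trade-off this sketch makes visible is that a slave only sees a change after the next `indexAndPublish` run, so queries stay fast at the cost of a slightly stale index between refreshes.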