Hibernate Search in Action: Indexing collection objects embedded properties

 
Ovidiu Guse
Greenhorn
Posts: 7
Greetings,

Can anyone help me out with indexing collections?

I have the following problem:

A CompositeItem object holds a collection named 'parts' containing instances of Part.

A Part has an int 'id', a String 'name', a String 'description' and a property called 'material' of type Material.

A Material has an int 'id', a String 'type' and a String 'name'.

After indexing the 'parts' collection of CompositeItem, I need to be able to query the collection for Part objects that have a specific 'name' and whose 'material' property has a specific 'type'.

In other words, I need to be able to search the collection by querying the properties of an object embedded in the collection's elements (e.g. collection.part.material.type).

Does anyone have a solution for this?

Thank you,
Ovi
 
Emmanuel Bernard
author
Ranch Hand
Posts: 62
Because Lucene has no notion of a document join, we cannot express the query you are looking for the way you asked for it.
In essence, when Hibernate Search indexes a collection, it flattens all the elements of the collection so that they look like a single element.

Assuming a Movie with a list of Actors, you can express queries like:
- find movies where both Cruise and McGillis are playing.
- find movies where Cruise is in the movie but not McGillis.
- find movies where one of the actors is either Cruise or McGillis.
but you cannot express a query like: return items where one of the actors is Tom and his hometown is Atlanta.
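
The limitation can be illustrated without Lucene at all. The following is a minimal sketch in plain Java, not Hibernate Search code: the class `FlattenDemo`, the field names `actors.name` and `actors.hometown`, and the sample cast are all made up for illustration. It only shows that once the values of every element are merged into one bag per field, the pairing between a name and a hometown is gone.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FlattenDemo {
    /** One collection element: an actor with a name and a hometown (hypothetical model). */
    static final class Actor {
        final String name;
        final String hometown;
        Actor(String name, String hometown) {
            this.name = name;
            this.hometown = hometown;
        }
    }

    /** Simulates flattening: the values of all elements land in one bag per field name. */
    static Map<String, Set<String>> flatten(List<Actor> actors) {
        Map<String, Set<String>> doc = new HashMap<>();
        doc.put("actors.name", new HashSet<String>());
        doc.put("actors.hometown", new HashSet<String>());
        for (Actor a : actors) {
            doc.get("actors.name").add(a.name);
            doc.get("actors.hometown").add(a.hometown);
        }
        return doc;
    }

    /** What the flattened document can answer: both values occur somewhere in the movie. */
    static boolean flattenedMatch(List<Actor> actors, String name, String hometown) {
        Map<String, Set<String>> doc = flatten(actors);
        return doc.get("actors.name").contains(name)
            && doc.get("actors.hometown").contains(hometown);
    }

    /** What was actually asked: one and the same actor carries both values. */
    static boolean sameElementMatch(List<Actor> actors, String name, String hometown) {
        for (Actor a : actors) {
            if (a.name.equals(name) && a.hometown.equals(hometown)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up data: Tom is not from Atlanta, but another cast member is.
        List<Actor> cast = Arrays.asList(new Actor("Tom", "Syracuse"), new Actor("Kelly", "Atlanta"));
        System.out.println(flattenedMatch(cast, "Tom", "Atlanta"));
        System.out.println(sameElementMatch(cast, "Tom", "Atlanta"));
    }
}
```

With this sample cast the flattened check reports a match even though no single actor satisfies both conditions, which is the false positive described above.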

However, in chapter 4 I advise thinking about the query from a different angle. In your example, you can look for Parts of name A with a material of type T. You can retrieve the part ids (using a projection) and finish the rest of your restriction either in a different full-text query or with a regular HQL query restricted by ids. In most cases you can express what you want.
I understand it's not ideal, but it's a fundamental limitation of Lucene and of full-text search technology in general.
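
The two-step idea can be sketched as follows. This is an in-memory stand-in, assuming Part is indexed as its own document: `TwoStepDemo` and both methods are hypothetical, and the loops only imitate what the full-text query and the HQL query would do. The HQL in the comment is one possible shape of the second query, not taken from the book.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class TwoStepDemo {
    static final class Part {
        final int id;
        final String name;
        final String materialType;
        Part(int id, String name, String materialType) {
            this.id = id;
            this.name = name;
            this.materialType = materialType;
        }
    }

    static final class CompositeItem {
        final int id;
        final List<Part> parts;
        CompositeItem(int id, List<Part> parts) {
            this.id = id;
            this.parts = parts;
        }
    }

    /**
     * Step 1: stand-in for a full-text query on Part that projects only the id.
     * Each Part is its own document, so name and material type stay correlated.
     */
    static Set<Integer> findPartIds(Collection<Part> indexedParts, String name, String materialType) {
        Set<Integer> ids = new TreeSet<>();
        for (Part p : indexedParts) {
            if (p.name.equals(name) && p.materialType.equals(materialType)) {
                ids.add(p.id);
            }
        }
        return ids;
    }

    /**
     * Step 2: stand-in for a query restricted by ids, for example (assumed HQL):
     * select distinct c from CompositeItem c join c.parts p where p.id in (:ids)
     */
    static List<Integer> findItemIds(Collection<CompositeItem> items, Set<Integer> partIds) {
        List<Integer> result = new ArrayList<>();
        for (CompositeItem c : items) {
            for (Part p : c.parts) {
                if (partIds.contains(p.id)) {
                    result.add(c.id);
                    break; // one matching part is enough for this item
                }
            }
        }
        return result;
    }
}
```

An item that holds a part named "bolt" and a different part made of "steel" is not returned here, because the correlation is checked on Part documents before the ids are handed to the second query.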
 
Ovidiu Guse
Greenhorn
Posts: 7
Hello Emmanuel.

Thank you for your reply.

I have two distinct solutions to the problem, both using pure Hibernate Search capabilities. The first is a little unpleasant, you could even say "barbaric", while the second is, I think, the better one.

Solution 1:
This solution defines a dummy property used only for indexing purposes. The property does not have to be declared as a field or mapped in Hibernate; defining its getter method is enough. Thus, we can define something like this:

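import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Index;

public class Part {
    ..................
    // Index-only getter: exposes the material's type as a field of Part.
    @Field(name = "materialType", index = Index.TOKENIZED)
    public String getMaterialType() {
        String type = (getMaterial() != null) ? getMaterial().getType() : null;
        // Return an empty string rather than the text "null" when nothing is set.
        return (type != null) ? type.trim() : "";
    }
    ...........
}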
Hibernate Search will treat materialType as a valid property and index it.

Solution 2: (recommended, as it does not require us to alter the Part class definition beyond an annotation)

The solution is to use the 'prefix' attribute of the @IndexedEmbedded annotation inside the Part class. Hibernate Search will then index the properties of the embedded entity, under the given prefix, as if they were properties of the indexed object itself. In other words,

public class Part {
    .......
    @IndexedEmbedded
    private Material material;
    .......
}

becomes:

public class Part {
    .......
    @IndexedEmbedded(depth = 1, prefix = "material_")
    private Material material;
    .......
}

Of course, inside the Material class we must annotate the properties that are to be indexed.

The index fields generated by this approach will be:

'parts.name' - 'name' property from the Part class
'parts.description' - 'description' property from the Part class
'parts.material_id' - the 'id' property from the Material class
'parts.material_name' - the 'name' property from the Material class
'parts.material_type' - the 'type' property from the Material class

Thus we will be able to query the 'parts.material_type' field directly, for example.
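
How these names come about, and how they could be used in a query string, can be sketched in plain Java. The class `EmbeddedFieldNames` and its methods are hypothetical helpers, not Hibernate Search API; they assume that the prefixes along the path are simply concatenated with the property name, and that the search values contain no spaces or special characters needing escaping. The leading '+' is the Lucene query parser's marker for a required clause.

```java
final class EmbeddedFieldNames {
    /** Joins the prefixes along the path with the property name. */
    static String fieldName(String collectionPrefix, String embeddedPrefix, String property) {
        return collectionPrefix + embeddedPrefix + property;
    }

    /** Builds a query-parser string in which every field:value pair is required. */
    static String allOf(String... fieldValuePairs) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 1 < fieldValuePairs.length; i += 2) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('+').append(fieldValuePairs[i]).append(':').append(fieldValuePairs[i + 1]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String typeField = fieldName("parts.", "material_", "type");
        System.out.println(allOf("parts.name", "bolt", typeField, "steel"));
    }
}
```

Such a query string is still evaluated against the flattened values of the whole 'parts' collection, so it asks whether the name and the material type occur somewhere among the parts.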

Hope this helps somebody.

Regards,
Ovi
 
Ovidiu Guse
Greenhorn
Posts: 7
... with the above solutions it will be possible to express queries like Emmanuel's example: return items where one of the actors is Tom and his hometown is Atlanta.

Regards,
Ovi
 
Emmanuel Bernard
author
Ranch Hand
Posts: 62
Hi Ovidiu, yes, I forgot this approach.
Actually, there is a better version of your solutions.
You can use a @ClassBridge to place the special data into the index without altering the object model and without mapping all of the associated class (using @IndexedEmbedded).

Chapter 4 describes @ClassBridge, and chapter 8 (filters) uses this technique to implement filters transparently to the object model when the data is not available "naturally".
 
Ovidiu Guse
Greenhorn
Posts: 7
Hi Emmanuel,

Thank you for your reply. Indeed, that is great news; I will try it out.

Thanks again,
Ovi
 
etirk etirk
Greenhorn
Posts: 1
Actually, @IndexedEmbedded only exposes each element's fields by name. It is still not possible to search each list element independently, since the fields of all elements share the same names.
If anyone knows of a good way of doing a multiple-join search in Lucene, I would be more than happy to hear it.



A is indexed, and B is indexed embedded. This gives one index document per A, containing fields named 'b.field1' and 'b.field2'. If there are 3 elements in the list b, there will be three of each, with no indication of which belongs to which. Only the order in which they appear in the indexed document distinguishes them from each other.
The search can be restricted with a special filter that matches the separate list elements as they appear, by their ordering, but this is not a very nice solution: all fuzziness and the actual Lucene search are lost.

Starting from an index on B does not work here either, since you cannot join the results using the field 'a'... at least that is my experience so far; I might be wrong about this.

So I find the following extremely hard to perform with Hibernate Search:

Give me all A where B.field1 = 'monkey' AND (B.field1 = 'ape' OR B.field2 = 'horse').
The B outside the parentheses and the B inside them are not the same; they are, or at least should be, different list elements.

This is easily done with a normal SQL/HQL double join using A, B1 and B2. With Hibernate Search it is not that easy.
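
To make the intended semantics concrete, here is a minimal in-memory sketch in plain Java. `DoubleJoinDemo` and its `B` class are hypothetical; the nested loops play the role of the two joins B1 and B2, and the `i == j` check encodes the assumption that the two must be different list elements. The HQL in the comment is one assumed way of writing the same query against the database.

```java
import java.util.List;

final class DoubleJoinDemo {
    static final class B {
        final String field1;
        final String field2;
        B(String field1, String field2) {
            this.field1 = field1;
            this.field2 = field2;
        }
    }

    /**
     * Per-element evaluation of the query for one A, given its list of B.
     * Assumed HQL equivalent:
     *   select distinct a from A a join a.b b1 join a.b b2
     *   where b1.field1 = 'monkey'
     *     and (b2.field1 = 'ape' or b2.field2 = 'horse')
     *     and b1 <> b2
     */
    static boolean matches(List<B> bs) {
        for (int i = 0; i < bs.size(); i++) {
            if (!"monkey".equals(bs.get(i).field1)) {
                continue; // element i cannot play the role of B1
            }
            for (int j = 0; j < bs.size(); j++) {
                if (i == j) {
                    continue; // B1 and B2 must be different list elements
                }
                B b2 = bs.get(j);
                if ("ape".equals(b2.field1) || "horse".equals(b2.field2)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A single element with field1 = 'monkey' and field2 = 'horse' does not match here, whereas a check on the flattened field values would accept it.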

Any suggestions for this little problem?

/K
 