This dataset is read-mostly and would benefit from being cached.
I need to be able to fetch any item from the dataset, fetch any branch, and do full-text searches over the set's contents.
To be able to do the full-text search, I think I need to keep all the nodes in the cache in any case. This is feasible since the number of nodes is limited (1000 or so).
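To illustrate what "all nodes in memory plus full-text search" could look like at this size, here is a minimal sketch. The class name `NodeSearch`, the path-style keys, and the idea of treating each node's content as a plain string are assumptions made for the example; with around 1000 nodes a linear scan is probably enough, and a real index would only be needed for much larger sets.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory store: every node's text is kept in one map, keyed by path.
class NodeSearch {
    private final Map<String, String> contentByPath = new ConcurrentHashMap<>();

    void put(String path, String content) {
        contentByPath.put(path, content);
    }

    // Naive full-text search: linear scan with a case-insensitive substring match.
    // Returns the paths of all matching nodes, sorted for a stable result order.
    List<String> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : contentByPath.entrySet()) {
            if (entry.getValue().toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(entry.getKey());
            }
        }
        Collections.sort(hits);
        return hits;
    }
}
```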
If some node gets updated, added or removed, I can either invalidate the whole in-memory cache (causing a costly read-all) or update the in-memory cache in place (and do a write-through to the database).
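The two options can be sketched roughly as follows. `NodeDao` is a hypothetical stand-in for whatever database access layer exists, and `WriteThroughCache` is a made-up name; node content is simplified to a string.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the real database access layer.
interface NodeDao {
    Map<String, String> loadAll();
    void save(String path, String content);
    void delete(String path);
}

class WriteThroughCache {
    private final NodeDao dao;
    private final Map<String, String> nodes = new ConcurrentHashMap<>();

    WriteThroughCache(NodeDao dao) {
        this.dao = dao;
        reloadAll();
    }

    // Option 1: invalidate everything and re-read (the costly read-all).
    // Note: clear + putAll is not atomic, so concurrent readers may briefly
    // see an empty or partial cache; a real implementation would need to guard this.
    final void reloadAll() {
        Map<String, String> fresh = dao.loadAll();
        nodes.clear();
        nodes.putAll(fresh);
    }

    // Option 2: write-through. Persist first, then update the in-memory copy,
    // so the cache never holds a change the database rejected.
    void put(String path, String content) {
        dao.save(path, content);
        nodes.put(path, content);
    }

    void remove(String path) {
        dao.delete(path);
        nodes.remove(path);
    }

    String get(String path) {
        return nodes.get(path);
    }
}
```

With the write-through variant, `reloadAll()` is only needed at startup or when the database is changed behind the application's back.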
Is using ehcache or some other cache product beneficial in this scenario? I could store the individual nodes in ehcache, but the hierarchical nature seems to fit awkwardly into the caching approach. Since I need to randomly access any entity (E1 or E2), I need to store E1 with e.g. key "my-entities/1" and E2 with key "my-entities/2". Doesn't this mean that E1.getChildren().get("e2") is a different object from E2? (If I 1. store E2 and 2. store E1, ehcache does not realize that the E2 inside E1's children is already stored.) And if I use this approach, all leaf nodes get stored multiple times, which is not nice memory-consumption-wise.
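Whether the two objects end up identical presumably depends on whether the cache keeps values by reference or as serialized copies. One way to sidestep the question in either case would be to store child keys instead of child objects, so each node is stored exactly once. A minimal sketch of that idea follows; `Entity` and `FlatStore` are hypothetical names, and the plain `HashMap` merely stands in for any key/value cache.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node type: children are referenced by cache key, not by object.
class Entity {
    final String key;
    final String content;
    final Map<String, String> childKeysByName = new LinkedHashMap<>();

    Entity(String key, String content) {
        this.key = key;
        this.content = content;
    }
}

class FlatStore {
    // Stand-in for any key/value cache; each node is stored exactly once.
    private final Map<String, Entity> byKey = new HashMap<>();

    void put(Entity e) {
        byKey.put(e.key, e);
    }

    Entity get(String key) {
        return byKey.get(key);
    }

    // Resolve a child by name through the store instead of through an object reference.
    Entity child(Entity parent, String name) {
        String childKey = parent.childKeysByName.get(name);
        return childKey == null ? null : byKey.get(childKey);
    }

    // Fetch a branch: the node itself plus all of its descendants, depth-first.
    List<Entity> branch(String rootKey) {
        List<Entity> out = new ArrayList<>();
        collect(byKey.get(rootKey), out);
        return out;
    }

    private void collect(Entity e, List<Entity> out) {
        if (e == null) {
            return;
        }
        out.add(e);
        for (String childKey : e.childKeysByName.values()) {
            collect(byKey.get(childKey), out);
        }
    }
}
```

The cost of this layout is that fetching a branch takes one lookup per node rather than a single lookup of the root.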
Am I missing something, or would I be better off just using a custom tree object that stores all the data in the application context, and forgetting caching products altogether?
One point is that the application may be clustered in the future (it is not now), and in that situation a clustering solution could make it easier to propagate the updates between the clustered caches...?
What I ended up doing was creating my own in-memory cache (all items in memory) with interface
But I am curious whether I reinvented the wheel unnecessarily.