posted 10 years ago
Hi Chris,
Excellent question, and the one that is most important when deciding which storage system to use for a specific use case.
In my opinion, Neo4j shines when it comes to real-time queries over localized data. When I say localized, I mean data closely connected to one or a few nodes.
For example, finding friends of friends in a social network, or finding which products a person buys together.
Anything that requires complex queries of interconnected, joined data is a good use case for Neo4j.
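To make "localized" a bit more concrete, here is a rough sketch of a friends-of-friends lookup in plain Python. This is not Neo4j's API, just an in-memory adjacency map with made-up names (`build_graph`, `friends_of_friends`, and the sample people are all hypothetical). The point is that the lookup only touches the start node's neighbours and their neighbours, never the rest of the graph.

```python
from collections import defaultdict


def build_graph(friendships):
    """Build an undirected adjacency map from (person, person) pairs."""
    graph = defaultdict(set)
    for a, b in friendships:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def friends_of_friends(graph, person):
    """Return people exactly two hops away from `person`.

    Only the neighbourhood of `person` is visited, so the cost depends
    on the size of that neighbourhood, not on the size of the graph.
    """
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    # Exclude the person themselves and people who are already direct friends.
    return two_hops - direct - {person}


# Hypothetical sample data, for illustration only.
friendships = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("alice", "erin"),
    ("erin", "carol"),
    ("frank", "grace"),
]
graph = build_graph(friendships)
print(sorted(friends_of_friends(graph, "alice")))  # ['carol']
```

Note that frank and grace are never visited when asking about alice. In a relational schema the same question would typically need a self-join on the friendship table for each extra hop.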
As you correctly put it, if you have SQL tables that represent relationships between entities, it will most certainly be more efficient and performant to store this in Neo4j instead.
Examples are plenty: social networks, access control lists, master data management... We cover some of these use cases in the book, so take a look there as well.
Neo4j also supports ACID transactions, unlike most other NoSQL solutions, which makes it a unique proposition for use cases that require transaction support.
Neo4j does support storing tabular and semi-structured data (each node in Neo4j is just a bag of properties, not dissimilar to a Mongo document).
But Neo4j does not support sharding (which Mongo has built in, for example), so scaling out with large amounts of tabular data will most likely be less efficient in Neo4j.
Storing large blobs of data (like PDFs or images) is not Neo4j's strength either - I'd probably choose another database for that.
When talking about Big Data, any query that is likely to scan the entire graph (meaning the data is not localized) will probably not get the full benefit of Neo4j's graph engine - for such cases, depending on the size of the data, an HDFS/Hadoop-based solution may be a better option.
Aleksa