A Vukotic

Author
since Nov 10, 2014

Recent posts by A Vukotic

Hi Samir,

They certainly can.
Neo4j is ACID-compliant and can participate in XA transactions with supported JDBC connections and/or any other XA resource (JMS connections, for example).

If you use the Spring Framework, Neo4j can participate in transactions via the @Transactional annotation as well.

For more details, take a look at Chris Gioran's blog (a bit old, but still valid as far as Neo4j XA support goes):
http://digitalstain.blogspot.co.uk/2010/10/neo4j-internals-transactions-part-2.html

Aleksa
6 years ago
Just realized I had a typo in the second query. It should be:
MATCH (user:USER)-[rel:IS_FRIEND_OF]-(x)
WHERE rel.timestamp >= '20140301'
AND rel.timestamp <= '20140331'
WITH user, count(rel) AS friendCount
WHERE friendCount > 30
RETURN count(user)
6 years ago
Chris,

There are two ways to access a Neo4j database at present:
- REST, which can be used from any programming language that supports HTTP
- Java API, which you can use from any JVM-based language, including Scala
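
As a rough illustration of the REST option, the sketch below builds (but does not send) an HTTP request carrying a Cypher statement. It assumes the transactional Cypher endpoint of Neo4j 2.x (/db/data/transaction/commit) and a server on localhost:7474; the helper name build_cypher_request is hypothetical, and other Neo4j versions may expose different paths.

```python
import json
import urllib.request

# Assumption: a local Neo4j 2.x server exposing the transactional Cypher endpoint.
NEO4J_URL = "http://localhost:7474/db/data/transaction/commit"


def build_cypher_request(statement, parameters=None, url=NEO4J_URL):
    """Build a POST request that would run one Cypher statement (hypothetical helper)."""
    payload = {
        "statements": [
            {"statement": statement, "parameters": parameters or {}}
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_cypher_request("MATCH (u:USER) RETURN count(u)")
# To actually run it against a live server: urllib.request.urlopen(request)
```

Any language with an HTTP client and a JSON encoder can produce the same request, which is what makes the REST route language-neutral.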

That said, there are a number of libraries/wrappers for different programming languages that you can use.
Neo Technology maintains a list of the active ones here:
http://neo4j.com/contrib/

Aleksa
6 years ago
Hi Chris,

Excellent question, and the most important one when deciding which storage system to use for a specific use case.

In my opinion, Neo4j shines when it comes to real-time queries of localized data. When I say localized, I mean data closely connected to one or a few nodes.
For example, finding friends of friends in a social network, or finding which products a person buys together.
Anything that requires complex queries of interconnected, joined data is a good use case for Neo4j.
As you correctly put it, if you have SQL tables that represent relationships between entities, it will most certainly be more efficient and performant to store this in Neo4j instead.
Examples are plenty: social networks, access control lists, master data management... We cover some of these use cases in the book, so take a look there as well.
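
To make "localized" concrete, here is a minimal plain-Python sketch of a friends-of-friends lookup. It is not how Neo4j works internally; the adjacency map and the names in it are made-up sample data. The point is that the work depends on the size of one person's neighbourhood, not on the size of the whole graph.

```python
def friends_of_friends(friends, person):
    """Return people two hops away from `person` who are not already direct friends."""
    direct = friends.get(person, set())
    result = set()
    for friend in direct:                              # first hop
        for candidate in friends.get(friend, set()):   # second hop
            if candidate != person and candidate not in direct:
                result.add(candidate)
    return result


# Hypothetical sample data: an undirected friendship graph as adjacency sets.
friends = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan"},
    "cat": {"ann", "dan", "eve"},
    "dan": {"bob", "cat"},
    "eve": {"cat"},
    "zed": set(),
}

print(sorted(friends_of_friends(friends, "ann")))  # ['dan', 'eve']
```

Note that "zed" is never touched: only the nodes reachable within two hops of the starting node are visited.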

Neo4j also supports ACID transactions, unlike most other NoSQL solutions, which makes it a unique proposition for use cases that require transaction support.

Neo4j does support storing tabular and semi-structured data (each node in Neo4j is just a bag of properties, not dissimilar to a Mongo document).
But Neo4j does not support sharding (which Mongo has built in, for example), so scaling out with a large amount of tabular data will most likely be less efficient in Neo4j.

Storing large blobs of data (like PDFs or images) is not Neo4j's strength either - I'd probably choose another database for that.

When talking about Big Data, any query that is likely to scan the entire graph (meaning the data is not localized) will probably not get the full benefit of the Neo4j graph engine - for such cases, depending on the size of the data, an HDFS/Hadoop-based solution may be a better option.

Aleksa
6 years ago
Ninad,

Neo4j provides an import feature that can be used to import data from CSV files.
As long as you can export data into CSV format from your source database, you should be good to go:
http://neo4j.com/developer/guide-importing-data-and-etl/
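
A minimal sketch of the export half, assuming a relational source with a person table (an in-memory SQLite database stands in for it here). The table, the file name people.csv and the Person label in the Cypher string are all assumptions for illustration; the Cypher is only held as text and would be run separately against Neo4j.

```python
import csv
import io
import sqlite3

# Stand-in for the source relational database (assumption: a `person` table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO person (id, name) VALUES (?, ?)", [(1, "Ann"), (2, "Bob")]
)


def export_csv(connection, query, header):
    """Run a query and return its rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(connection.execute(query))
    return buffer.getvalue()


people_csv = export_csv(conn, "SELECT id, name FROM person ORDER BY id", ["id", "name"])

# Cypher that could then load the exported file (file location and label are assumptions).
LOAD_PEOPLE = """
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
CREATE (:Person {id: row.id, name: row.name})
"""
```

Values read by LOAD CSV arrive as strings, so numeric columns would need converting on the Cypher side if they are to be stored as numbers.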

There is a good blog about migrating from MySQL:
http://neo4j.com/blog/data-migration-between-mysql-and-neo4j/

Aleksa
6 years ago
Hey Ninad,

This is an important and big question - and the one we have tried to answer in the Neo4j in Action book.

In a few pointers, though, the key things are:

- Neo4j allows for fast and efficient querying of highly connected data - it can traverse millions of hops per second (a hop is a jump from one node to another via the relationship that connects them)
- In Neo4j, all data must live on a single machine/disk. This means that if you have a cluster of Neo4j instances, each will have a full copy of the data. Neo4j's core data structures (nodes and relationships) are very small, though (9 bytes for a node and 33 bytes for a relationship), making it easy to fit hundreds of millions of nodes/relationships on a single machine
- As for availability, Neo4j has an HA setup, which consists of a master-slave Neo4j server cluster - this is a commercial feature, though, and is only available with the licensed product.
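
A back-of-the-envelope sketch of what the record sizes quoted above imply. The node and relationship counts are made-up examples, and the estimate deliberately ignores properties and indexes, which add to the real footprint.

```python
NODE_BYTES = 9           # per-node record size quoted above
RELATIONSHIP_BYTES = 33  # per-relationship record size quoted above


def core_store_bytes(nodes, relationships):
    """Estimate the size of the bare node and relationship records only."""
    return nodes * NODE_BYTES + relationships * RELATIONSHIP_BYTES


# Hypothetical graph: 100 million nodes and 500 million relationships.
size = core_store_bytes(100_000_000, 500_000_000)
print(f"{size / 1e9:.1f} GB")  # 17.4 GB
```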

Hope this helps. For more details, please read the book!

Aleksa
6 years ago
Hi Mike,

Thanks for taking part in this Q&A session.

Doing the same query in Cypher (Neo4j query language) would be equally simple:

MATCH (user:USER)-[rel:IS_FRIEND_OF]-(x) RETURN user, count(rel) ORDER BY count(rel) DESC

One caveat is that a query like this would scan the entire graph (all users and all their IS_FRIEND_OF relationships), and would need to hold the counts in memory for sorting.
For best performance, a query should have as few start nodes as possible and touch as few properties as possible. However, this query would still perform reasonably well for a db of a few million users, for example.

Neo4j's sweet spot is real-time analytics for a few starting nodes (for example, what customers buy together with products A and B).
As for the time-based query, that is possible as well, but it would be up to the application to store the relevant timestamps as a relationship property. So the query would look like this:

MATCH (user:USER)-[rel:IS_FRIEND_OF]-(x)
where rel.timestamp>='20140301'
AND rel.timestamp>='20140301'
WITH user, count(rel) AS friendCount
WHERE friendCount > 30
RETURN count(user)
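
To make the intent of the time-based query concrete, here is a plain-Python sketch of the same logic over a made-up in-memory list of friendships: filter relationships by timestamp, count them per user, then count the users whose total exceeds a threshold. The function name and sample data are hypothetical. Note that the threshold can only be applied after the per-user counting step.

```python
from collections import Counter


def users_with_many_new_friends(friendships, start, end, threshold):
    """Count users with more than `threshold` friendships timestamped in [start, end]."""
    per_user = Counter()
    for a, b, timestamp in friendships:
        if start <= timestamp <= end:  # yyyymmdd strings compare chronologically
            per_user[a] += 1           # the pattern is undirected, so the
            per_user[b] += 1           # relationship counts for both ends
    return sum(1 for count in per_user.values() if count > threshold)


# Hypothetical sample data: (user, friend, timestamp as yyyymmdd string).
friendships = [
    ("ann", "bob", "20140305"),
    ("ann", "cat", "20140312"),
    ("ann", "dan", "20140402"),  # outside March 2014
    ("bob", "cat", "20140320"),
]

print(users_with_many_new_friends(friendships, "20140301", "20140331", 1))  # 3
```

Storing the timestamp as a zero-padded yyyymmdd string is what lets plain string comparison work as a date range check.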

The largest Neo4j setup I have been involved with had ~1TB of data, ran on 3 nodes, and had approx. 50 million nodes and a few billion relationships.

Aleksa
6 years ago
Hi Ferdinand,

Neo4j is indeed a cloud-friendly technology.
You can easily deploy a Neo4j server in the cloud and access it via its built-in REST interface.
There are also a host of community-built projects that let you easily deploy Neo4j on cloud infrastructure like AWS - for example:
https://github.com/neo4j-contrib/neo4j-puppet/blob/master/README.CLOUDFORMATION.md

If you're after Neo4j-as-a-service offerings, there is GrapheneDB.
I haven't used it myself, but their offering looks comprehensive:
http://www.graphenedb.com/
Check out this tutorial as well:
http://inserpio.wordpress.com/2014/02/13/neo4j-graphenedb-it-has-never-been-so-simple/

Aleksa
6 years ago
Chris,

This is a great question.

Scaling graphs horizontally in terms of storage (data sharding) is a hard problem. Traversing a graph becomes too expensive very quickly if following a relationship costs a network hop.
So in those terms, Neo4j will not compete with the likes of Hadoop in the near future.

What Neo4j does provide is a very intelligent caching engine (both JVM heap caching and off-heap caching), which should speed things up significantly if the entire data set can fit into memory.
For cases where the data is larger than the amount of memory available, you can run Neo4j in a clustered master-slave setup. Each instance (master and slave) will have an entire copy of the data, but with smart routing within your application you can keep different data sets cached on different instances, making traversals scalable on large data sets.
In addition, the core Neo4j data structures are very small - 9 bytes for a node and 33 bytes for a relationship - which means that you can fit billions of nodes and relationships in the RAM of a standard server, for example. It is properties that add size to the data - and if you are after core traversal performance, you can store only a minimal amount of information in Neo4j and use another storage system to persist the rest of the data - this is what polyglot persistence is all about.
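
The routing idea above (keeping different data sets cached on different cluster members) could be sketched in the application like this. This is an illustrative assumption, not a Neo4j feature: the function name, the routing key format and the instance addresses are all hypothetical.

```python
import zlib


def pick_instance(routing_key, instances):
    """Map a routing key to one cluster member, always the same one for the same key."""
    index = zlib.crc32(routing_key.encode("utf-8")) % len(instances)
    return instances[index]


# Hypothetical instance addresses.
instances = ["neo4j-1:7474", "neo4j-2:7474", "neo4j-3:7474"]

# Queries that start from the same user always go to the same instance,
# so that instance's cache stays warm for that part of the graph.
target = pick_instance("user:42", instances)
```

A simple modulo scheme like this reshuffles most keys when the number of instances changes; whether that matters depends on how expensive it is to re-warm the caches.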

Aleksa
6 years ago
Thanks, Krystian!

While I can't comment on the writing style (that's for the readers to judge), Neo4j in Action introduces the reader to the Neo4j graph database using a lot of hands-on examples and executable code - not unlike the Spring in Action book.

Neo4j in Action is written primarily for developers making a foray into graph databases and I hope you will find it a useful resource as such.

Aleksa
6 years ago