TL;DR: New neo4django release. ORM is a loaded term. I work for tips.

Earlier today, we released a new version of neo4django, the Neo4j/Django ORM integration layer. This release is mostly incremental, and contains a number of performance and stability improvements that we at Scholrly and others in the community have made while developing against the library and using it in production.

Before I talk about those improvements, however, I’d like to focus on some disconcerting ideas I’ve heard about the library.

Some Misconceptions

Based on conversations with developers considering using neo4django, I've seen a couple of related trends. If you're more interested in the current release, skip the rant.

The Misleading Term “ORM”

ORM is a loaded term, and for an obvious reason: the “R” stands for “relational”. Developers using, or considering using, graph databases typically do so for one of two reasons. Either relational “SQL” databases didn't fit their problem, or they’re so caught up in novel database technology that they might be getting ahead of themselves (an idea I’ll explore further in a later post).

The flight from relational databases seems to have poisoned the term “relational”. That’s unfortunate. Graph databases exist for data that is too relational for relational databases: the data is so structured and connected that SQL DBs just can’t perform. Graph databases excel at retrieving this super-relational information quickly.

The Django ORM is a User Interface, Not a Database

ORM software seems to be conflated with the databases it’s commonly used to access. The role of an ORM, however, is primarily to ease the burden on the developer.

neo4django aims to implement a superset of the Django ORM’s developer interface functionality. To be clear, the library doesn’t use any of the SQL-specific machinery that backs the Django ORM. Instead, it takes advantage of the Django model definition language.

Why? To simplify the switch from a relational database to a graph database? Yes. But also because the Django ORM is a great way to quickly model a domain and get to work. I’d argue that its primary value is as a developer user interface, and we hope to bring that value to our graph-backed product.
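
To make the interface-versus-database distinction concrete, here is a toy sketch in plain Python. It is not neo4django's code or API: `Property`, `InMemoryGraph`, `NodeModel`, `relate()` and `related()` are all hypothetical names invented for illustration, and the "database" is just a dictionary and a list. The point is only that a declarative, Django-style model definition can sit on top of a graph of nodes and edges without any SQL machinery underneath.

```python
# Toy illustration only: none of these names come from neo4django.

class Property:
    """Declares a typed attribute on a node model (hypothetical)."""

    def __init__(self, kind=str):
        self.kind = kind

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance._values.get(self.name)

    def __set__(self, instance, value):
        # Coerce on assignment, the way a typed model field would.
        instance._values[self.name] = self.kind(value)


class InMemoryGraph:
    """Stand-in for a graph database: nodes plus typed, directed edges."""

    def __init__(self):
        self.nodes = {}  # node id -> property dict
        self.edges = []  # (start id, relationship type, end id)

    def add_node(self, properties):
        node_id = len(self.nodes)
        self.nodes[node_id] = dict(properties)
        return node_id

    def add_edge(self, start, rel_type, end):
        self.edges.append((start, rel_type, end))

    def neighbors(self, start, rel_type):
        return [e for (s, t, e) in self.edges if s == start and t == rel_type]


class NodeModel:
    """Base class: the developer-facing interface over the graph store."""

    graph = InMemoryGraph()

    def __init__(self, **values):
        self._values = {}
        self.id = None
        for name, value in values.items():
            setattr(self, name, value)

    def save(self):
        if self.id is None:
            self.id = self.graph.add_node(self._values)
        else:
            self.graph.nodes[self.id] = dict(self._values)
        return self

    def relate(self, rel_type, other):
        self.graph.add_edge(self.id, rel_type, other.id)

    def related(self, rel_type):
        return [self.graph.nodes[n] for n in self.graph.neighbors(self.id, rel_type)]


# The model definition reads like a Django model, but nothing here is relational.
class Person(NodeModel):
    name = Property(str)
    age = Property(int)


alice = Person(name="Alice", age="30").save()
bob = Person(name="Bob", age=25).save()
alice.relate("friends_with", bob)
friends = alice.related("friends_with")
```

The `Person` class is the part a developer actually touches; swapping `InMemoryGraph` for a real graph backend would not change how the model is declared or used.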

In Other Words

neo4django’s close association with the term ORM is costing it potential users. I’m still looking for an alternative, and the best I’ve heard so far is OGM, for Object Graph Mapper. In future versions, I plan to focus more on marketing and user education; I hope this revised terminology will help grow the community.

In This Release

Neo4j 1.7 & 1.8.M07 Support

The library is now tested against Django 1.4.1 and the most recent releases of Neo4j 1.5-1.8.

Relationship Improvements

Relationship querysets are a bit more useful now, thanks to our summer intern Chris. In particular, he's implemented filter() and a couple of other tremendously helpful methods you'd expect. They have a long way to go, but this is a great start.
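
For readers unfamiliar with the idea, the following is a rough sketch of what Django-style filtering over related nodes looks like in general. It is a plain-Python toy, not the library's implementation: `RelatedSet`, the `LOOKUPS` table, and the sample paper data are all assumptions made up for this example, and the related nodes are ordinary dictionaries rather than anything fetched from Neo4j.

```python
import operator

# Hypothetical names throughout; this is not neo4django's implementation.
LOOKUPS = {
    "exact": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
    "contains": lambda value, needle: needle in value,
}


class RelatedSet:
    """A lazy, chainable view over the nodes at the far end of a relationship."""

    def __init__(self, nodes, conditions=()):
        self._nodes = nodes
        self._conditions = tuple(conditions)

    def filter(self, **lookups):
        # Parse Django-style keys such as year__gt into (field, test, expected).
        new = []
        for key, expected in lookups.items():
            field, _, op = key.partition("__")
            new.append((field, LOOKUPS[op or "exact"], expected))
        # Return a new set so that calls can be chained without mutation.
        return RelatedSet(self._nodes, self._conditions + tuple(new))

    def __iter__(self):
        # Conditions are only evaluated when the set is iterated.
        for node in self._nodes:
            if all(test(node[field], expected)
                   for field, test, expected in self._conditions):
                yield node


# Made-up sample data standing in for nodes related to some author node.
papers = [
    {"title": "Graph Traversal Basics", "year": 2010},
    {"title": "Relational Algebra", "year": 1970},
    {"title": "Graph Databases in Production", "year": 2012},
]
authored = RelatedSet(papers)
recent_graph_papers = list(
    authored.filter(year__gt=2005).filter(title__contains="Graph")
)
```

Each call to `filter()` returns a new set and nothing is evaluated until iteration, which is the behavior Django developers tend to expect from a queryset.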

Other Improvements

The bulk of this release is incremental, community-driven progress. I very much appreciate the time the community has put in to improve the project. Check out the changelist for more details on their contributions, and the authors list so you know whom to thank :)

What’s with the version skip?

Though this is mostly an incremental release, the time between releases has increased; it seemed like the version should as well. As an added bonus, the version number still corresponds to the latest stable Neo4j version the library supports.

Unfortunately, that trend won’t hold: at some point the library will need a major version number. It has to grow up eventually.

Supporting the Project

I’m the technical lead at a small startup outside the Valley, which means I’m strapped for time and cash. If you’d like to see the library improve, consider contributing code & documentation, leaving a small donation, or even sponsoring our time at an East Coast accelerator.

Future Plans

In the next release, I hope to focus more on user education and on enabling subtler use of the graph.

My feature priorities include:

  • Better exposing and documenting the Gremlin and Cypher machinery that powers the library. This should enable more complicated use cases and open up the path to aggregates.
  • Further relationship improvements. In particular, further fleshing out relationship querysets with methods like create().
  • And, of course, performance. Performance, performance, performance.

Let me know if there are other improvements you’d like to see.