When you need to utilize a Relational Database Management system ( RDBMS ) in your application, should you use an Object Relational Mapper ( ORM ) library?
Well, as usual, the answer is ...
it depends (but technically you can't avoid using an ORM anyways)*
For a quick intro, a RDBMS organizes information in tables and columns, which is inherently different than objects in object-oriented programming languages. ORMs seek to expose an interface in which you can interact with the data in the columns/tables using the native structures of the language.
Let's say I have a simple class User
and I want to store one in the database. This would mean I need to implement the typical CRUD operations for it. For brevity I'll just do the find method.
pythondb = ... # Connect your SQL db class User: name: str age: int @staticmethod def find(name): with db.cursor() as cur: cur.execute(""" SELECT * FROM users WHERE name = %s AND email IS NOT NULL; """, (name) # Don't get SQL injected ya'll ) user = cur.fetchone() return user # Create a user john = User.find("John")
Now let's see what it would like using an ORM library, like Prisma (it's originally written for NodeJS, there's also a python client)
pythonfrom prisma import Prisma db = Prisma() # db.user is automatically generated from a prisma.schema file async def find_user(name: str): user = db.user.find_first( where={ "name": "John", "NOT": [ {"email": None} ]} )
However the implementation can vary across libraries. Here's what Django's ORM would look like.
pythonfrom models import User john = User.objects.filter(name="John", email__is_null=False)
Now, the ORM implementations may seem straightforward enough, but whereas SQL has a standard approach to more complex queries, ORMs generally don't, and just become a wild guessing game where you rely on your language server's autocomplete. The documentation can't cover this, because the arguments and types are totally dependent on the columns of your table--such as its data type and constraints.
Okay, so maybe not evil, but potentially full of trickery.
The principal issue is due to how we don't know how the library works under the hood, and this can lead to a lot of unexpected issues.
python# This is a quick example of what a model # looks like in Django for reference from django.db import models class Address(models.Model): location = models.CharField(max_length=50, blank=True, null=True) # Field . class User(models.Model): name = models.CharField(max_length=50, blank=True, null=True) # Field . age = models.IntField(null=False) # Field . address = models.ForeignKey(Address, to_field="name", db_column="name")
1. Implicit Behavior Instead of Explicit Defintion
When you create a model without a declared primary_key column, Django will automatically create an INT column id
and make that the primary key. Although having an id INT PRIMARY KEY
column is pretty standard in RDMBS, I firmly believe that any modifications to the database absolutely need to be explicit. If there is no primary key defined, Django should error fast instead of doing things on the user's behalf unknownst to them.
2. Compatibility
Django's ORM requires a single column primary key. I don't see why you would ever not have a primary key, but this also means you can't have a composite primary key. This makes usage very awkward has you have to pretend like one of the fields is a single primary key, and treating the other field(s) in a simple WHERE clause.
3. Overfetching/Sub-optimal queries
If we have something like Additionally, when you declare a foreign key like
pythonusers = User.objects.filter(age__gt=30) # No database hit yet for user in users: print(user.address.location) # Database hit happens here
This will actually create not one query, but 1 query + an additional query for each user. This is because address
is a foreign field and does not exist in the initial query. You need to use the select_related
method to include the foreign fields in your query. This issue won't make itself visible until you start getting performance issues, and unless you were already aware of this, it might be a while before you figure out where the performance hit is coming from.
In an ironic way, my terrible experience here actually led to me preferring ORMs, because it pushed me to writing my own SQL for a few applications, which had me experiencing how inconvenient that was--leading me back to trying out ORMs. And here's ...
prisma.schema
. And as a NixOS user you already know anything with "declarative" is going to get my attention.AUTOINCREMENT
for its primary keys while MySQL uses "AUTO_INCREMENT". Not having to worry about these syntax differences makes life a lot easier, and also makes it very forgiving when you want to make your application compatible with multiple databases.Well, I think it boils down to where you're at and what you're doing.
Here are a few questions to help decide:
While both have their merits, I think for most developers using an ORM library is a no-brainer--it just saves sooo much time. And if you use Rust, Dart, TypeScript/JavaScript, Go, or Python then I highly recommend checking out Prisma.
* This is because if you're binding your native SQL to functions and objects, you're basically just making your own ORM library