This blog post will delve into Generic Foreign Keys in Django. We'll explore what they are, when they are helpful, and how to define them in a Django model.
Please note that this post focuses on the concept behind Generic Foreign Keys. It does not take a side in the debate over whether they are a good choice from a database design perspective or whether you should use them.
Foreign Keys
A foreign key is a column, or a set of columns, in a database table whose values must correspond to the values in another table's column(s). FOREIGN KEY constraints maintain referential integrity: if a value in one column (A) references another column (B), then that value must exist in column B.
However, what if we want to establish a reference to any table using a foreign key? Consider a scenario where we want to keep track of "likes" for various types of content, such as posts and courses.
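from django.contrib.auth.models import User  # assumes Django's built-in User model
from django.db import models


class Post(models.Model):
    ...


class Like(models.Model):
    # A like is tied to exactly one model here: Post.
    post = models.ForeignKey(
        Post,
        on_delete=models.SET_NULL,
        null=True,
        blank=True
    )
    user = models.ForeignKey(
        User,
        on_delete=models.SET_NULL,
        null=True,
        blank=True
    )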
Suppose we introduce a new model, "Course," and we want to allow users to like courses as well.
We face a choice:
Create separate models to track likes for each type of content or
Employ a single "Like" model with a Generic Foreign Key.
Generic Foreign Key
To understand how Generic Foreign Keys work, let's revisit SQL's standard foreign key concept and then see what has to change in the schema for a reference to point at any table.
In a typical foreign key setup, a single column references the primary key of a specific, predefined table. For instance, consider the following SQL code:
CREATE TABLE posts (
    post_id SERIAL PRIMARY KEY,
    user_id INT NOT NULL,
    content TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    -- Add any other columns related to a post here
    FOREIGN KEY (user_id) REFERENCES users(user_id)
);
Here, the "user_id" column in the "posts" table references the "user_id" column in the "users" table. This enforces referential integrity and ensures that the values in the "user_id" column of "posts" correspond to existing values in the "users" table.
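To see the constraint at work, here is a minimal sketch using Python's built-in sqlite3 module rather than the PostgreSQL-style SQL above. The table definitions are trimmed down, and the sample rows are made up for illustration. Note that SQLite only enforces foreign keys once the `foreign_keys` pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores FOREIGN KEY constraints unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY
    );
    CREATE TABLE posts (
        post_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        content TEXT NOT NULL,
        FOREIGN KEY (user_id) REFERENCES users(user_id)
    );
    INSERT INTO users (user_id) VALUES (1);
    """
)

# User 1 exists, so this row is accepted.
conn.execute("INSERT INTO posts (user_id, content) VALUES (1, 'hello')")

# User 99 does not exist, so the database rejects the row.
try:
    conn.execute("INSERT INTO posts (user_id, content) VALUES (99, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

post_count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(orphan_rejected, post_count)
```

The second insert fails inside the database itself, which is exactly the guarantee a generic reference has to give up.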
Now, let's consider a scenario where we want to create a reference to various tables, not just a specific one. SQL doesn't provide a built-in mechanism for this. To achieve this flexibility, we need to adjust our schema definition.
To make a foreign key generic and capable of referring to any table, we can create two columns:
A column that stores the primary key value without the constraints of a foreign key.
Another column that indicates which table the reference points to.
In this way, we create a more versatile structure:
CREATE TABLE likes (
    like_id SERIAL PRIMARY KEY,
    user_id INT NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    content_id INT NOT NULL,
    content_type VARCHAR(255) NOT NULL
    -- Add any other columns related to likes here
);
-- You can define a foreign key to link to the users table:
-- ALTER TABLE likes ADD FOREIGN KEY (user_id) REFERENCES users(user_id);
In this new setup, the "content_id" column stores the primary key value, and the "content_type" column specifies the table to which it refers. This approach allows us to reference multiple tables dynamically based on the value in "content_type".
With this schema change, a single "likes" table can reference rows in various tables within our database. For example:
-- Assuming user with ID 1 liked post with ID 123
INSERT INTO likes (user_id, content_id, content_type)
VALUES (1, 123, 'posts');
-- Assuming user with ID 2 liked course with ID 456
INSERT INTO likes (user_id, content_id, content_type)
VALUES (2, 456, 'courses');
-- Assuming user with ID 3 liked digital goods with ID 789
INSERT INTO likes (user_id, content_id, content_type)
VALUES (3, 789, 'digital_goods');
To read a generic reference back, we query the "content_type" and "content_id" values from the "likes" table and then look up the matching row in the corresponding table with a separate query.
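The two-step lookup can be sketched with Python's built-in sqlite3 module. Everything beyond the "likes" columns is an assumption made for illustration: the columns of "posts" and "courses", the sample rows, the `TARGETS` allow-list, and the `resolve_like` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, content TEXT NOT NULL);
    CREATE TABLE courses (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE likes (
        like_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        content_id INTEGER NOT NULL,
        content_type TEXT NOT NULL
    );
    INSERT INTO posts (post_id, content) VALUES (123, 'Hello world');
    INSERT INTO courses (course_id, title) VALUES (456, 'Django 101');
    INSERT INTO likes (user_id, content_id, content_type) VALUES
        (1, 123, 'posts'),
        (2, 456, 'courses');
    """
)

# Allow-list: content_type -> (table, primary key column, column to display).
# Table names cannot be bound as query parameters, so only names listed here
# are ever interpolated into SQL.
TARGETS = {
    "posts": ("posts", "post_id", "content"),
    "courses": ("courses", "course_id", "title"),
}


def resolve_like(like_id):
    """Return (content_type, label) for the row a like points at, or None."""
    # Query 1: read the generic reference from the likes table.
    row = conn.execute(
        "SELECT content_id, content_type FROM likes WHERE like_id = ?",
        (like_id,),
    ).fetchone()
    if row is None:
        return None
    content_id, content_type = row
    if content_type not in TARGETS:
        return None
    table, pk_column, label_column = TARGETS[content_type]
    # Query 2: fetch the referenced row from the table named by content_type.
    target = conn.execute(
        f"SELECT {label_column} FROM {table} WHERE {pk_column} = ?",
        (content_id,),
    ).fetchone()
    return (content_type, target[0]) if target else None


resolved = [resolve_like(1), resolve_like(2)]
print(resolved)
```

Because no constraint ties "content_id" to a target table, the helper has to cope with a reference whose target row is missing, which is why it can return None.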
It's important to note that "content_id" is an integer column here, but its data type should match that of the referenced tables' primary keys. For instance, if the primary key is a UUID, "content_id" should also be a UUID. If the referenced tables use different key types, we can use VARCHAR to hold them all.
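As a rough illustration of the VARCHAR option, the sketch below stores keys as text and converts them back per content type. It uses Python's sqlite3 module. The assumption that "courses" uses UUID keys, and the `KEY_PARSERS` mapping, are hypothetical.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE likes (
        like_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        content_id TEXT NOT NULL,     -- any key type, stored as a string
        content_type TEXT NOT NULL
    )
    """
)

post_pk = 123  # integer key
course_pk = uuid.UUID("12345678-1234-5678-1234-567812345678")  # UUID key

conn.executemany(
    "INSERT INTO likes (user_id, content_id, content_type) VALUES (?, ?, ?)",
    [(1, str(post_pk), "posts"), (2, str(course_pk), "courses")],
)

# Each content type knows how to turn the stored text back into its key type.
KEY_PARSERS = {"posts": int, "courses": uuid.UUID}

restored = [
    KEY_PARSERS[content_type](content_id)
    for content_id, content_type in conn.execute(
        "SELECT content_id, content_type FROM likes ORDER BY like_id"
    )
]
print(restored)
```

The price of this flexibility is that the database no longer knows the real type of the key, so the conversion lives in application code.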
ContentType Model
In Django, table names differ from model names: by default, a table name combines the Django app label and the lowercased model name (for example, "cms_post"). Django provides a built-in model for tracking the models defined in your project, known as "ContentType."
Instead of manually storing this information, we reference the "ContentType" table when defining a Generic Foreign Key in a Django Model.
Django includes a contenttypes application that can track all of the models installed in your Django-powered project, providing a high-level, generic interface for working with your models. [1]
Here's how it looks in Django:
from django.contrib.auth.models import User  # assumes Django's built-in User model
from django.contrib.contenttypes.fields import GenericForeignKey
from django.contrib.contenttypes.models import ContentType
from django.db import models


class Like(models.Model):
    user = models.ForeignKey(
        User,
        on_delete=models.SET_NULL,
        null=True,
        blank=True
    )
    # Which model (table) the like points at.
    content_type = models.ForeignKey(
        ContentType,
        on_delete=models.CASCADE
    )
    # Primary key of the liked row in that model's table.
    object_id = models.PositiveIntegerField()
    # Not a database column: combines the two fields above into one attribute.
    content_object = GenericForeignKey("content_type", "object_id")

    class Meta:
        indexes = [
            models.Index(fields=["content_type", "object_id"]),
        ]
Note the index on fields "content_type" and "object_id"; it optimizes fetching records for specific content types, such as Posts, Courses, and Digital Goods.
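To get a feel for what such an index buys, here is a small sketch that asks SQLite for its query plan. It uses Python's sqlite3 module on a hand-written table, not the table Django would generate. The index name `likes_ct_obj_idx` is made up, and `content_type_id` mirrors the column name Django gives a ForeignKey called "content_type".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE likes (
        like_id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        content_type_id INTEGER NOT NULL,
        object_id INTEGER NOT NULL
    );
    -- Composite index covering both halves of the generic reference.
    CREATE INDEX likes_ct_obj_idx ON likes (content_type_id, object_id);
    """
)

# Ask SQLite how it would find all likes pointing at one specific object.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT like_id, user_id FROM likes "
    "WHERE content_type_id = ? AND object_id = ?",
    (7, 123),
).fetchall()

# The last column of each plan row is a human-readable description.
plan_text = " ".join(row[-1] for row in plan)
uses_index = "likes_ct_obj_idx" in plan_text
print(plan_text)
```

The exact wording of the plan differs between SQLite versions, but it should name the composite index instead of reporting a full table scan.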
GenericForeignKey
There are three parts to setting up a GenericForeignKey:
Give your model a ForeignKey to ContentType. The usual name for this field is “content_type”.
Give your model a field that can store primary key values from the models you’ll be relating to. For most models, this means a PositiveIntegerField. The usual name for this field is “object_id”.
Give your model a GenericForeignKey, and pass it the names of the two fields described above. If these fields are named “content_type” and “object_id”, you can omit this – those are the default field names GenericForeignKey will look for.
Unlike for the ForeignKey, a database index is not automatically created on the GenericForeignKey, so it’s recommended that you use Meta.indexes to add your own multiple column index. This behavior may change in the future. [2]
Unlike ForeignKey, a database index is not automatically created on the GenericForeignKey, so adding your custom multiple-column index using Meta.indexes is advisable.
Working with the ORM
There are two approaches to working with GenericForeignKeys:
Passing "content_type" and "object_id" separately or
Using "content_object" directly.
Passing content_type and object_id Separately
Explicit Control
Using content_type and object_id separately provides more explicit control over the foreign key relationship. You can set these fields independently, allowing you to manipulate the like object's references as needed.
Performance
In some cases, especially when working with a large number of likes, directly setting content_type and object_id can be more efficient because it avoids loading the related content object from the database just to create the reference.
Complex Relationships
Passing them separately can be beneficial when dealing with complex relationships or situations where you must perform additional logic based on the content type and object ID.
Example
from django.contrib.auth.models import User  # assumes Django's built-in User model
from django.contrib.contenttypes.models import ContentType

# Creating a like for a Post
post_content_type = ContentType.objects.get(
    app_label="cms",
    model="post"
)
user = User.objects.get(id=1)
like = Like(
    user=user,
    object_id=123,  # ID of the post
    content_type=post_content_type,
)
like.save()

# Creating a like for a Course
course_content_type = ContentType.objects.get(
    app_label="cms",
    model="course"
)
user = User.objects.get(id=1)
like = Like(
    user=user,
    object_id=456,  # ID of the course
    content_type=course_content_type,
)
like.save()

# Creating a like for a DigitalGood
# ContentType stores the lowercased class name, so DigitalGood is "digitalgood".
dg_content_type = ContentType.objects.get(
    app_label="cms",
    model="digitalgood"
)
user = User.objects.get(id=1)
like = Like(
    user=user,
    object_id=789,  # ID of the Digital Good
    content_type=dg_content_type,
)
like.save()
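The performance and complex-relationship points above come down to one thing: when content_type and object_id are plain values, many likes can be grouped by type before touching the referenced tables. This is a plain-Python sketch of that idea under made-up data. The `like_targets` pairs and the `group_ids_by_type` helper are hypothetical and not part of Django.

```python
from collections import defaultdict

# Hypothetical (content_type, object_id) pairs, as they might be read from Like rows.
like_targets = [
    ("post", 123),
    ("course", 456),
    ("post", 124),
    ("post", 123),
    ("digitalgood", 789),
]


def group_ids_by_type(targets):
    """Collect the distinct object IDs referenced for each content type."""
    grouped = defaultdict(set)
    for content_type, object_id in targets:
        grouped[content_type].add(object_id)
    # Sort the IDs so the result is stable and easy to read.
    return {content_type: sorted(ids) for content_type, ids in grouped.items()}


ids_by_type = group_ids_by_type(like_targets)
print(ids_by_type)
# {'post': [123, 124], 'course': [456], 'digitalgood': [789]}
```

Each list can then be fetched with a single query per model, for example with a `pk__in` filter, so five likes cost three lookups instead of five.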
Using content_object Directly
Convenience
The content_object approach is more convenient and concise in many cases. It allows you to work with content objects directly without explicitly setting content_type and object_id.
Readability
Code using content_object tends to be more readable and self-explanatory, especially for developers less familiar with the underlying database schema.
Django's Design Philosophy
Django's ORM is designed to make everyday tasks straightforward, and using content_object aligns with this philosophy.
In most cases, using content_object directly is recommended because it simplifies your code and improves readability. However, if you have specific use cases that require fine-grained control over the foreign key relationship, or you are working with many records and need to optimize performance, you might choose to pass content_type and object_id separately.
Example
# Creating a like for a Post
post = Post.objects.get(id=123)
user = User.objects.get(id=1)
like = Like(user=user, content_object=post)
like.save()
# Creating a like for a Course
course = Course.objects.get(id=456)
user = User.objects.get(id=2)
like = Like(user=user, content_object=course)
like.save()
# Creating a like for a DigitalGood
digital_good = DigitalGood.objects.get(id=789)
user = User.objects.get(id=3)
like = Like(user=user, content_object=digital_good)
like.save()
Conclusion
In conclusion, Generic Foreign Keys in Django provide a powerful way to handle references to multiple tables within your database schema dynamically. While traditional foreign keys bind to specific tables, Generic Foreign Keys can reference various tables based on a content type indicator. This flexibility lets developers build more adaptable and extensible data models. Whether you choose to pass "content_type" and "object_id" separately or use the convenient "content_object", understanding and utilizing Generic Foreign Keys can significantly improve your data modelling capabilities in Django applications.
[1] https://docs.djangoproject.com/en/4.2/ref/contrib/contenttypes/#module-django.contrib.contenttypes
[2] https://docs.djangoproject.com/en/4.2/ref/contrib/contenttypes/#django.contrib.contenttypes.fields.GenericForeignKey