Model Inheritance ...in Django This presentation is about a feature of the Django Object Relational Mapping layer called “Model Inheritance”.
Impedance Mismatch First some background. The root of the problem that Object Relational Mapping (ORM) systems attempt to solve is a fundamental “Impedance Mismatch” between the object-oriented world inhabited by OO programming languages, such as Python, and the relational database world as defined by RDBMSs such as PostgreSQL, Oracle and MySQL. These two worlds have very different ways of looking at how data should be organized.
Object Oriented Values One of the main tools of the OO world is inheritance. In OO inheritance, a descendant class inherits the characteristics of its ancestor. This allows common functionality to be programmed into the ancestor class, and then “specialized” sub-classes can be created, extending the functionality of the ancestor, or superclass. In this example, a User and a Customer are different kinds of specialized extensions of a Person. Additionally, an E-Commerce User inherits from both User and Customer, and therefore has all of the traits of its superclasses. It is common to organize data in OO in various hierarchies of superclasses and subclasses.
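The hierarchy described on this slide can be sketched in plain Python. The attribute names (username, account_number) and the greet() method are hypothetical, chosen only to give each class something of its own to contribute.

```python
class Person:
    """Common ancestor: traits shared by every kind of person."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return f"Hello, {self.name}"


class User(Person):
    """Specialization that adds login details."""

    def __init__(self, name, username=None, **kwargs):
        # Pass unknown keyword arguments along so multiple inheritance works.
        super().__init__(name, **kwargs)
        self.username = username


class Customer(Person):
    """Specialization that adds billing details."""

    def __init__(self, name, account_number=None, **kwargs):
        super().__init__(name, **kwargs)
        self.account_number = account_number


class ECommerceUser(User, Customer):
    """Inherits from both User and Customer, and through them from Person."""


fred = ECommerceUser("Fred", username="fred", account_number=42)
print(fred.greet(), fred.username, fred.account_number)
print([cls.__name__ for cls in ECommerceUser.__mro__])
```

The second print shows the order in which Python searches the superclasses, which is how one object ends up with the traits of all of them.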
Relational Values In the world of relational databases, database tables are defined with “foreign keys” that relate them to each other. In this example, we see that each row in the User table has a person_id, which is a foreign key that relates to a record in the Person table. Data organization is done through relating records together, and then composing queries that pull the necessary data from the database, following the relationships defined by these foreign keys.
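The same Person and User data, seen from the relational side, might look like the following sketch. It uses Python's built-in sqlite3 module with an in-memory database; the column names other than person_id are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE user (
        id INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL REFERENCES person (id),
        username TEXT NOT NULL
    )
    """
)

conn.execute("INSERT INTO person (id, name) VALUES (1, 'Fred')")
conn.execute("INSERT INTO user (person_id, username) VALUES (1, 'fred')")

# Follow the foreign key with a join to assemble the data we need.
rows = conn.execute(
    """
    SELECT person.name, user.username
    FROM user
    JOIN person ON person.id = user.person_id
    """
).fetchall()
print(rows)

# A user row that points at a missing person is rejected by the database.
try:
    conn.execute("INSERT INTO user (person_id, username) VALUES (99, 'ghost')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

Nothing here says that a user "is a" person. The only link between the two tables is the person_id value and the join that follows it.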
ORM Layers are both OO and Relational Because they are designed to bridge the two worlds’ different approaches to managing data, ORMs are inherently both Object Oriented AND Relational. It is exactly what is advertised by their name: Object Relational Mapping.
Building Inheritance into ORMs is difficult Trying to bring OO-style inheritance to ORMs is difficult, because the relational world doesn’t really support the concept. It raises a lot of questions about the best way to provide inheritance functionality to the programmer without resorting to a total hack on the relational side.
Django approach before now: Use Composition Up until now, Django hasn’t really supported model inheritance. When a Django application developer was presented with a situation that would best be solved with inheritance, they were advised to use a technique called composition instead. The most prominent example of this is the user profile. It is common for a programmer to want to extend the User class that comes in django.contrib.auth so that it can carry attributes specific to the application. Django historically has solved this with a “user profile” - a separate class that is identified in settings.py (through the AUTH_PROFILE_MODULE setting) and can be retrieved by the get_profile() method on User objects.
Unique Foreign Key To use composition, one defines a foreign key that is also unique on the composited class (the class that would be the subclass if we were solving this problem using OO inheritance). For example, if we hypothetically wanted to specialize a Person class and make a User, the User would have a ForeignKey field that pointed back at the Person model (the Django ORM equivalent of a relational foreign key). In addition, we would designate the field unique, ensuring that there could never be more than one User per Person. Traits on the composited “superclass” would have to be accessed explicitly - they are not truly inherited by the specialization class.
This Sucks There are a number of reasons this is a suboptimal situation. 1. It makes queries more complicated, meaning that the work of defining the relationship, which should be done once in data modeling, is now pushed all over the application code. 2. It fails to take advantage of Python’s OO features, and thus of their power. 3. It creates an object model that doesn’t really describe the real-world entities that it is trying to model.
Malcolm Tredinnick This man has come to the rescue, however. This is Malcolm Tredinnick, and he has for some time now been working on a branch of the Django code called QuerySet Refactor.
QuerySet Refactor The QuerySet Refactor branch brings a number of new features to the Django ORM. Most notably, it brings true Model Inheritance. Although the feature was difficult to program, the QuerySet Refactor programmers, led by Malcolm, have managed to implement it in a couple of great ways. Furthermore, on April 26th this year, QuerySet Refactor was merged into the Django trunk. That means that all of its features are now part of the mainstream development of Django.
Two Approaches to Model Inheritance With the merge complete, Django now offers two different approaches to Model Inheritance. An application might choose to use both approaches, since they offer advantages in different circumstances.
1. Abstract Base Classes The first approach is called Abstract Base Classes. This approach is best used when the parent superclass is never meant to be instantiated on its own. It is merely a source of common functionality that will be used by subclasses. Abstract Base Classes are specified by adding “abstract = True” in the Meta inner class. When syncdb is run, the Django ORM will not create a database table for these models. Model classes that extend the Abstract Base Class will automatically inherit fields defined in the superclass, however. The Django ORM will automatically generate corresponding columns and relations for the superclass’ fields in the table of the subclass.
Abstract Base Classes are a coding convenience only. In the case of Abstract Base Classes, the inheritance relationship is ignored at the relational level. They are essentially a kind of advanced syntactic sugar - providing a type of “include” in model definitions. Once the Django ORM has “compiled” the models into SQL, the Abstract Base Class essentially ceases to exist.
ABC Gotcha “related_name” Abstract Base Classes can carry a couple of gotchas. Let’s look at one related to the use of the “related_name” attribute in ForeignKey fields.
Specifying a ForeignKey in an ABC superclass If we put a ForeignKey field in an ABC, we might wish to specify a related_name on it. The related_name is the name by which this class (“Person”) is known to the target model of the ForeignKey (“Company”). So a Company object has “people”. The problem comes when we have more than one concrete subclass of the ABC. We have to remember that there is no database table that corresponds to Person. Instead, the Django ORM compiles the fields of the ABC into the table definitions of the subclasses. This means that both User and Customer try to give Company a reverse attribute named “people”, and the two names clash. Django will throw an error when syncdb is run.
Specifying a ForeignKey in an ABC superclass The suggested solution is to change the related name to something that is dynamically interpolated. By using the “%(class)s_related” notation, we will create two attributes in the Company model: “user_related” pointing to users, and “customer_related” pointing to customers.
2. Multiple Table Inheritance The second approach to Model Inheritance that Django now provides is called Multiple Table Inheritance, or MTI (the Django documentation calls it “multi-table inheritance”). In this approach, Django generates a separate relational table for each model in the inheritance hierarchy. This approach is useful for circumstances where you may wish to instantiate models from both the superclass and its subclasses. In this example, we may wish to define Persons that are not Users, in addition to Users that inherit Person attributes. MTI can be engaged simply by subclassing an existing model. In this case there is no “abstract” attribute being set in an inner Meta class.
MTI makes inheritance a relationship under the hood. As mentioned, when the Django ORM generates SQL from models using MTI, a separate table is created for each superclass and subclass. In addition, the inheritance between models is converted into a relation. The subclass table will have a special additional column called “<parent_model_name>_ptr_id” (for example “person_ptr_id”), a unique foreign key that points at the superclass table and also serves as the primary key of the subclass table. When an instance of the subclass model is pulled from the database, the ORM will pull the related row in the superclass table and build a model instance with data from both tables. Essentially it’s composition under the hood, but encapsulated inside the ORM model. The relational world sees it as a relation, and the OO world sees it as inheritance.
Querying the MTI superclass While it’s not possible to run a query directly against an ABC, it is possible to run one against an MTI superclass. But what if the superclass row is tied to a record in the subclass table? The Django ORM will still return an object of the superclass model. The associated subclass object is accessible as an attribute of the superclass object, named after the subclass in lower case. So in this case “fred the user” is accessible as the “user” attribute of “fred the person”. This is a bit like composition again, but “fred the user” will inherit all the attributes of “fred the person”, so it is better than where we were before. If there is no corresponding user object for “fred the person”, accessing the user attribute will raise an exception.