October 15, 2018 will

Adding type hints to the Django ORM

It occurred to me that Django's ORM could do with a bit of a revamp to make use of recent developments in the Python language.

The main area where I think Django's models are missing out is the lack of type hinting (hardly surprising since Django pre-dates type hints). Adding type hints allows Mypy to detect bugs before you even run your code. It may only save you minutes each time, but multiply that by the number of code + run iterations you do each day, and it can save hours of development time. Multiply that by the lifetime of your project, and it could save weeks or months. A clear win.

Typing Django Models

I'd love to be able to use type hints with the Django ORM, but it seems that the magic required to create Django models is just too dynamic and would defy any attempts to use typing. Fortunately that may not necessarily be the case. Type hints can be inspected at runtime, and we could use this information when building the model, while still allowing Mypy to analyze our code. Take the following trivial Django model:

class Foo(models.Model):
    count = models.IntegerField(default=0)

The same information could be encoded in type hints as follows:

class Foo(TypedModel):
    count: int = 0

The TypedModel class could inspect the type hints and create the integer field in the same way as models.Model uses IntegerField and friends. But this would also tell Mypy that instances of Foo have an integer attribute called count.

But what of nullable fields. How can we express those in type hints? The following would cover it:

class Foo(TypedModel):
    count: Optional[int] = 0

The Optional type hint tells Mypy that the attribute could be None, which could also be used to instruct TypedModel to create a nullable field.

So type hints contain enough information to set the type of the field, the default value, and wether the field is nullable--but there are other pieces of information associated with fields in Django models; a CharField has a max_length attribute for instance:

class Bar(models.Model):
    name = models.CharField(max_length=30)

There's nowhere in the type hinting to express the maximum length of a string, so we would have to use a custom object in addition to the type hints. Here's how that might be implemented:

class Bar(TypedModel):
    name: str = String(max_length=30)

The String class contains the maximum length information and additional meta information for the field. This class would have to be a subclass of the type specified in the hint, i.e. str, or Mypy would complain. Here's an example implementation:

class String(str):
    def __new__(cls, max_length=None):
        obj = super().__new__(cls)
        obj.max_length = max_length
        return obj

The above class creates an object that acts like a str, but has properties that could be inspected by the TypedModel class.

The entire model could be built using these techniques. Here's an larger example of what the proposed changes might look like:

class Student(TypedModel):
    name: str = String(max_length=30)  # CharField
    notes: str = ""  # TextField with empty default 
    birthday: datetime  # DateTimeField
    teacher: Optional[Staff] = None  # Nullable ForeignKey to Staff table
    classes: List[Subject]   # ManyToMany 

Its more terse than a typical Django model, which is a nice benefit, but the main advantage is that Mypy can detect errors (VSCode will even highlight such errors right in the editor).

For instance there is a bug in this line of code:

return {"teacher_name": student.teacher.name}

If the teacher field is ever null, that line with throw something like NoneType has no attribute "name". A silly error which may go un-noticed, even after a code review and 100% unit test coverage. No doubt only occurring in production at the weekend when your boss/client is giving a demo. But with typing, Mypy would catch that.

Specifying Meta

Another area were I think modern Python could improve Django models, is specifying the models meta information.

This may be subjective, but I've never been a huge fan of the way Django uses a inner class (a class defined in a class) to store additional information about the model. Python3 gives us another option, we can add keyword args to the class statement (where you would specify the metaclass). This feels like a more better place to add addtional information about the Model. Let's compare...

Hare's an example taking from the docs:

class Ox(models.Model):
    horn_length = models.IntegerField()

    class Meta:
        ordering = ["horn_length"]
        verbose_name_plural = "oxen"

Here's the equivalent, using class keyword args:

class Ox(TypedModel, ordering=["horn_length"], verbose_name_plural="oxen"):
    horn_length : int

The extra keywords args may result in a large line, but these could be formatted differently (in the style preferred by black):

class Ox(
    TypedModel,
    ordering=["horn_length"],
    verbose_name_plural="oxen"
):
    horn_length : int

I think the class keyword args are neater, but YMMV.

Code?

I'm sorry to say that none of this exists in code form (unless somebody else has come up with the same idea). I do think it could be written in such a way that the TypedModel and traditional models.Model definitions would be interchangeable, since all I'm proposing is a little syntactical sugar and no changes in functionality.

It did occur to me to start work on this, but then I remembered I have plenty projects and other commitments to keep me busy for the near future. I'm hoping that this will be picked up by somebody strong on typing who understands metaclasses enough to take this on.

Use Markdown for formatting
*Italic* **Bold** `inline code` Links to [Google](http://www.google.com) > This is a quote > ```python import this ```
your comment will be previewed here
gravatar
Anton Linevych

Model fields are not only a representation of the data type but the set of logic in which formats Django should read and write data.

For example, you will store comma separated string in a database but you field will have customto_python method that will automatically split it to a list of values.

Maybe it will make sense to add type hinting to the to_python method.

gravatar
Evgeny Denisov

Thanks for interesting food for thought.

There is a typo in the Student model: you wrote TypeModel instead of TypedModel.

gravatar
Will McGugan

Fixed, thanks.

gravatar
Thomas Weholt

And we could finally get ORM/framework agnostic models :-) Great post!