October 15, 2018 will

Adding type hints to the Django ORM

It occurred to me that Django's ORM could do with a bit of a revamp to make use of recent developments in the Python language.

The main area where I think Django's models are missing out is the lack of type hinting (hardly surprising since Django pre-dates type hints). Adding type hints allows Mypy to detect bugs before you even run your code. It may only save you minutes each time, but multiply that by the number of code + run iterations you do each day, and it can save hours of development time. Multiply that by the lifetime of your project, and it could save weeks or months. A clear win.

Typing Django Models

I'd love to be able to use type hints with the Django ORM, but it seems that the magic required to create Django models is just too dynamic and would defy any attempts to use typing. Fortunately that may not necessarily be the case. Type hints can be inspected at runtime, and we could use this information when building the model, while still allowing Mypy to analyze our code. Take the following trivial Django model:

class Foo(models.Model):
    count = models.IntegerField(default=0)

The same information could be encoded in type hints as follows:

class Foo(TypedModel):
    count: int = 0

The TypedModel class could inspect the type hints and create the integer field in the same way as models.Model uses IntegerField and friends. But this would also tell Mypy that instances of Foo have an integer attribute called count.
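To make the inspection step concrete, here is a minimal sketch using only the standard library. The names FIELD_FOR_TYPE and describe_fields are hypothetical, and the field names are plain strings so the example runs without Django; a real TypedModel would presumably do this inside a metaclass and create actual field objects.

```python
from typing import get_type_hints

# Hypothetical mapping from Python types to Django field class names.
FIELD_FOR_TYPE = {int: "IntegerField", str: "TextField", bool: "BooleanField"}

# Sentinel meaning "no default was given on the class".
_MISSING = object()


def describe_fields(cls):
    """Return {attribute name: (field class name, default)} from type hints."""
    fields = {}
    for name, hint in get_type_hints(cls).items():
        # The default, if any, is simply the class attribute of the same name.
        default = cls.__dict__.get(name, _MISSING)
        fields[name] = (FIELD_FOR_TYPE[hint], default)
    return fields


class Foo:
    count: int = 0


print(describe_fields(Foo))  # {'count': ('IntegerField', 0)}
```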

But what of nullable fields? How can we express those in type hints? The following would cover it:

class Foo(TypedModel):
    count: Optional[int] = 0

The Optional type hint tells Mypy that the attribute could be None, which could also be used to instruct TypedModel to create a nullable field.
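Detecting Optional at runtime is possible with typing.get_origin and typing.get_args (Python 3.8+). The helper below, split_optional, is a hypothetical name and only a sketch of how TypedModel might decide on null=True.

```python
from typing import Optional, Union, get_args, get_origin


def split_optional(hint):
    """Return (inner_type, nullable) for a type hint.

    Optional[X] is shorthand for Union[X, None], so a two-member Union
    containing NoneType is treated as a nullable X.
    """
    args = get_args(hint)
    if get_origin(hint) is Union and len(args) == 2 and type(None) in args:
        inner = args[0] if args[1] is type(None) else args[1]
        return inner, True
    return hint, False


print(split_optional(Optional[int]))  # (<class 'int'>, True)
print(split_optional(int))            # (<class 'int'>, False)
```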

So type hints contain enough information to set the type of the field, the default value, and whether the field is nullable. But there are other pieces of information associated with fields in Django models; a CharField has a max_length attribute, for instance:

class Bar(models.Model):
    name = models.CharField(max_length=30)

There's nowhere in the type hinting to express the maximum length of a string, so we would have to use a custom object in addition to the type hints. Here's how that might be implemented:

class Bar(TypedModel):
    name: str = String(max_length=30)

The String class contains the maximum length information and additional meta information for the field. This class would have to be a subclass of the type specified in the hint, i.e. str, or Mypy would complain. Here's an example implementation:

class String(str):
    def __new__(cls, max_length=None):
        obj = super().__new__(cls)
        obj.max_length = max_length
        return obj

The above class creates an object that acts like a str, but has properties that could be inspected by the TypedModel class.
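Here is a quick sketch of how that object behaves and how a model base class might pick it out of the class body. The helper char_field_options is hypothetical, and Bar is a plain class here rather than a real model.

```python
class String(str):
    def __new__(cls, max_length=None):
        obj = super().__new__(cls)
        obj.max_length = max_length
        return obj


def char_field_options(namespace):
    """Collect max_length for every String marker found in a class namespace."""
    return {
        name: value.max_length
        for name, value in namespace.items()
        if isinstance(value, String)
    }


class Bar:
    name: str = String(max_length=30)


marker = String(max_length=30)
print(isinstance(marker, str))        # True
print(marker == "")                   # True: the marker is an empty string
print(char_field_options(vars(Bar)))  # {'name': 30}
```

Note that the marker compares equal to the empty string, so a real implementation would need to decide whether that counts as the field's default.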

The entire model could be built using these techniques. Here's a larger example of what the proposed changes might look like:

from datetime import datetime
from typing import List, Optional

class Student(TypedModel):
    name: str = String(max_length=30)  # CharField
    notes: str = ""  # TextField with empty default
    birthday: datetime  # DateTimeField
    teacher: Optional[Staff] = None  # Nullable ForeignKey to Staff table
    classes: List[Subject]  # ManyToMany

It's more terse than a typical Django model, which is a nice benefit, but the main advantage is that Mypy can detect errors (VS Code will even highlight such errors right in the editor).

For instance there is a bug in this line of code:

return {"teacher_name": student.teacher.name}

If the teacher field is ever null, that line will throw something like NoneType has no attribute "name". A silly error which may go unnoticed, even after a code review and 100% unit test coverage. No doubt only occurring in production at the weekend when your boss/client is giving a demo. But with typing, Mypy would catch that.
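To illustrate, here is a small stand-alone sketch that uses dataclasses in place of the hypothetical TypedModel. With teacher declared as Optional[Staff], Mypy should flag the unguarded student.teacher.name with an error along the lines of: Item "None" of "Optional[Staff]" has no attribute "name". The guarded version below type-checks and runs.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Staff:
    name: str


@dataclass
class Student:
    name: str
    teacher: Optional[Staff] = None


def teacher_summary(student: Student) -> Dict[str, Optional[str]]:
    # Accessing student.teacher.name directly is what Mypy rejects,
    # because teacher may be None. Checking first narrows the type.
    teacher = student.teacher
    return {"teacher_name": teacher.name if teacher is not None else None}


print(teacher_summary(Student("Ann")))                # {'teacher_name': None}
print(teacher_summary(Student("Ann", Staff("Bob"))))  # {'teacher_name': 'Bob'}
```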

Specifying Meta

Another area where I think modern Python could improve Django models is specifying the model's meta information.

This may be subjective, but I've never been a huge fan of the way Django uses an inner class (a class defined in a class) to store additional information about the model. Python 3 gives us another option: we can add keyword args to the class statement (where you would specify the metaclass). This feels like a better place to add additional information about the model. Let's compare...

Here's an example taken from the docs:

class Ox(models.Model):
    horn_length = models.IntegerField()

    class Meta:
        ordering = ["horn_length"]
        verbose_name_plural = "oxen"

Here's the equivalent, using class keyword args:

class Ox(TypedModel, ordering=["horn_length"], verbose_name_plural="oxen"):
    horn_length: int

The extra keyword args may result in a long line, but these could be formatted differently (in the style preferred by black):

class Ox(
    TypedModel,
    ordering=["horn_length"],
    verbose_name_plural="oxen"
):
    horn_length: int

I think the class keyword args are neater, but YMMV.
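For what it's worth, plain Python can already collect class keyword args through __init_subclass__. The sketch below uses a stand-in TypedModel and a hypothetical _meta_options attribute; a real implementation would have to cooperate with Django's model metaclass and translate the options into the usual Meta handling.

```python
class TypedModel:
    """Stand-in base class for illustration; not a real Django model."""

    _meta_options: dict = {}

    def __init_subclass__(cls, **options):
        # Keyword args from the class statement arrive here.
        super().__init_subclass__()
        cls._meta_options = options


class Ox(TypedModel, ordering=["horn_length"], verbose_name_plural="oxen"):
    horn_length: int


print(Ox._meta_options)
# {'ordering': ['horn_length'], 'verbose_name_plural': 'oxen'}
```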

Code?

I'm sorry to say that none of this exists in code form (unless somebody else has come up with the same idea). I do think it could be written in such a way that the TypedModel and traditional models.Model definitions would be interchangeable, since all I'm proposing is a little syntactical sugar and no changes in functionality.

It did occur to me to start work on this, but then I remembered I have plenty of projects and other commitments to keep me busy for the near future. I'm hoping that this will be picked up by somebody strong on typing who understands metaclasses enough to take this on.

Anton Linevych

Model fields are not only a representation of the data type but also the logic that defines the formats in which Django reads and writes data.

For example, you might store a comma-separated string in the database, but your field will have a custom to_python method that automatically splits it into a list of values.

Maybe it will make sense to add type hinting to the to_python method.

Evgeny Denisov

Thanks for the interesting food for thought.

There is a typo in the Student model: you wrote TypeModel instead of TypedModel.

Will McGugan

Fixed, thanks.

Thomas Weholt

And we could finally get ORM/framework agnostic models :-) Great post!

Richard

I wrote a tiny proof of concept: https://gist.github.com/Naddiseo/d611bbd50388f267720e280de5643b90

Will McGugan

Nice work. Anything tricky in the implementation?

Richard

Nothing super tricky. The only thing that bit me initially was Python's rule that the metaclass of a derived class must be a subclass of the metaclasses of all its bases; it was easily fixed by making the metaclass extend from the base model metaclass. Two things are left. The first is to extend the mapping from Python types to Django model fields, which I haven't looked too far into for things like foreign keys and optional. The second is that Django currently issues a warning about redefining the model, which I haven't looked too far into either.

The idea could be extended to use annotated types, which would make it easier to provide options to the Django model fields:

class MyModel(TypedModel):
    field: Annotated[int, Max[20]] = 0  # models.IntegerField(validators=[MaxValueValidator(20)], default=0)
    fk: Annotated[OtherModel, OneToOne] = None  # models.OneToOneField(OtherModel, null=True, default=None)

Aaron

I think this would definitely help speed up development if errors can be spotted before the code is run.

It might be worth taking this idea to the development community of the main Django project for feedback.

Paul Everitt

Nice article and nice comments. A couple of points:

  • The pydantic library has a very mature, extensible system for validation based on type hinting. It could probably be the inspiration for a lot of the extras you mention, such as constraints.

  • Consider using dataclasses to do the modeling. You get the type hint information, but you also get a "field" where you can do inner-Config-class things, such as shipping along "metadata". In my system I have certain frequent metadata as custom fields, making it easy to import and say what you mean. The __post_init__ hook could also be useful for doing Django-specific things.

Nikita Sobolev

Consider using django-stubs: it provides type specs and custom mypy plugins to do typechecking with Django correctly.

I wrote an article about it: https://sobolevn.me/2019/08/typechecking-django-and-drf

Nikita Sobolev

Wow, there's no "edit". But I've made several typos and my link is not active. Here you go: https://sobolevn.me/2019/08/typechecking-django-and-drf

Marcelo Bello

These ideas are great and definitely the way to go. Unfortunately, old projects like Django and SQLAlchemy will not move fast enough. Too much legacy, too many users used to the old ways. It is necessary for a new project to be born and kick the butts of the old guys so they pay attention and start moving (and they probably won't ever catch up).

See how FastAPI makes use of type hints, how it makes total sense, and how it leads to a superior framework.