28 Mar 2016

Optimizing Django ORM queries

I’m have been working on a project using Django and Django-REST-Framework. A particular endpoint of the application executing number of database queries happening . Once you know its happening , then need you need to figure out why is it happening.

Findings:

  • select_related and prefetch_related
    Most of the time application models have reference to some other model referred by ForeignKey or ManyToManyor OnetoOne.
    When a query is made for base relation and when you use referred relation it will make extra query, this extra query can be avoided.
    using select_related and prefetch_related removed lot of unnecessary queries. A big boost.

  • Structure queries well

for person in Person.objects.all():
    if person.create_date >=today_date and person.status not in [X, Y]:
    # do something with person

Above loop can be avoided by constructing the simple query.

for person in Person.objects.filter(created_date__gte=today_date).exclude(status__in=[X, Y]):
    # do something with person
  • Beware of the response data (django-rest-framework)
    A serializer may have all the models fields defined, but for specific endpoint you may not need all the fields to be serialized especially the related fields or the model properties which results in database queries. You can write some custom serializers.

Above three gave quite a boost to database performance, to one API end-point i managed to get the number of queries from 469 to 91.

Apart from this stuff one go for database level optimization. In my case it is PostgreSQL database and once can consider to set the following database settings:

shared_buffers          25% of RAM
effective_cache_size    75% of RAM
work_mem                5MB
wal_buffers             16MB
checkpoint_segments     10
maintenance_work_mem    50MB for each GB of RAM

Hope you find this article helpful.

No comments:

Post a Comment