ValuesQuerySet changes POST-qs-rf merge

Update: As Malcolm has commented below, the issue described here was in fact a bug which has now been fixed.

Just a quick note regarding of one area of change since the merge of the queryset-refactor branch into the trunk which has broken a specific use-case of the ValuesQuerySet.

To reduce the number of queries which are executed when accessing related items from a model I have one particular view which uses values() to return ValuesQuerySet representing the associated instance items. I then use a list comprehension to build a list of associated items and bind them to the parent object instance. This may sound bizzar, and if there’s another way which gives a ~3x reduction in the query count (from around 43 down to 15 in a specific case), I’m all ears - and no, using select_related() would not help in this case. :)

This mocked-up sample code gives you an idea of what’s going on:

post_ids = [post.id for post in posts]
message_items = Messages.objects.filter(post__in=post_ids).values()
for post in posts:
    post.message_items = [message for message in message_items if message['post_id'] == post.id]

However, since the merge, message_items will be re-evaluated (causing the query to be re-executed) for each iteration of the list comprehension.

The fix: simply cast the ValuesQuerySet returned by values() to a standard list.

messages_items = list(Messages.objects.filter(post__in=post_ids).values())

Note of caution: You probably wouldn’t want to do this if your objects are potentially large (they contain a large amount of text) or you’re returning a lot at once as it will eat up memory.

2008-04-28

Comments

  1. Malcolm Tredinnick §

    This is fixed now (in [7497]). You might want to consider filing a bug report next time, since this kind of regression is probably going to be unintended.

    2008-04-28

  2. Doug Napoleone §

    I see someone has noted the bug fix.
    So I guess I will give some advice on optimizing your queries instead!

    Three things:

    In these examples you have the following models:

    class Child(Model):
    name = TextField()
    desc = TextField()
    # lots of other fields

    class Parent(Model):
    child = ForeignKey(Child)
    # other fields

    1. You can limit the attributes returned by values to just those which are of interest to you

    res = Child.objects.values('name')

    res, when evaluated, will be a list of dicts with just a 'name' key.

    2. Look at the documentation on select_related to pull in related objects

    res = Parent.objects.all().select_related()

    without select_related 'res[0].child would result in a query. with it al that data is sucked in as part of the res evaluation (could be a good or bad thing depending).

    3. Read up on extra() and the select argument in particular. Many times you just want to pull in one or two pieces of related data from a relation. Select is the way to go here.

    res = Parent.objects.extra(select={'child_name': 'SELECT “name” FROM “myapp”.”child” WHERE “myapp”.”child”.”id” = “myapp”.”parent”.”child_id”'})

    now each 'Parent' object has a 'child_name' attribute from the child relation. This can be quite useful when all you want is one small piece of data from a relation. It is even more powerful as you can do COUNT and other operations. WARNING: there are issues as the quoting done here works for sqlite3 and postgresql, but not for MySQL. There is a better way, but it hides a bit of what is going on and I felt showing the raw SQL was better.

    2008-04-28

  3. Andrew §

    Hi Doug,

    Thanks for the feedback but your examples are not applicable in this case. If the models were related in the way you describe, I could just use select_related, but unfortunately they're not.

    The relationship is on the other side of the query. Using your example above the relationship would be:

    class Parent(models.Model):
    otherfield = models.CharField()

    class Child(models.Model):
    parent = models.ForeignKey(Parent, related_name='child_items')
    name = models. TextField()
    desc = models. TextField()

    The optimization described above was for a specific case, and was done purely to reduce the number of queries being executed, and in this case, I needed all fields (but wasn't bothered if they were instances of the model class themselves).

    2008-04-28