Python Django: join a single table

I am trying to use the ltree extension in PostgreSQL to create a full-text search engine.

My model looks like this (it’s a bit simplified):

    from django.db import models

    class Addresses(models.Model):
        name = models.CharField(max_length=255)
        path = models.CharField(max_length=255)

So, the data in this table will look like this:

     id | name        | path
    ----+-------------+-------
      1 | USA         | 1
      2 | California  | 1.2
      3 | Los Angeles | 1.2.3

I want to perform a full-text search on the aggregated name of each object. Basically, I need to turn the rows into a table of the following format in order to search it:

     id | full_name                  | path
    ----+----------------------------+-------
      1 | USA                        | 1
      2 | California USA             | 1.2
      3 | Los Angeles California USA | 1.2.3

I do it this way, so the user can execute queries like "los ang cali" or the like. I have no problem doing this with a raw PostgreSQL query:

    SELECT *, ts_rank_cd(to_tsvector('english', full_address), query) AS rank
    FROM (
        SELECT s.id, s.path,
               array_to_string(array_agg(a.name ORDER BY a.path DESC), ' ') AS full_address
        FROM "Addresses" AS s
        INNER JOIN "Addresses" AS a ON (a.path @> s.path)
        GROUP BY s.id, s.path, s.name
    ) AS subquery,
    to_tsquery('english', %s) AS query
    WHERE to_tsvector('english', full_address) @@ query
    ORDER BY rank DESC;
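A side note on the %s parameter: to_tsquery expects tsquery syntax, so a raw string such as "los ang cali" needs operators between the words, and partial words only match with the :* prefix marker. A minimal sketch of how the input could be prepared, assuming every word should be treated as a prefix and all words must match (the helper name to_prefix_tsquery is made up for this example):

```python
import re


def to_prefix_tsquery(user_input: str) -> str:
    """Turn free text into a tsquery string that ANDs prefix matches.

    Hypothetical helper: only letters and digits are kept, so tsquery
    operators typed by the user cannot leak into the query.
    """
    words = re.findall(r"[^\W_]+", user_input.lower())
    # 'word:*' asks PostgreSQL for a prefix match; '&' requires every word.
    return " & ".join(f"{word}:*" for word in words)


tsquery_text = to_prefix_tsquery("los ang cali")
print(tsquery_text)
```

The resulting string is what would be passed as the %s parameter. An empty result (no usable words) is best handled by the caller before the query is run.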

This works fine, but with a RawQuerySet I cannot use things like .filter(), .group_by(), pagination, etc.

The main obstacle to reproducing this in Django is the JOIN:

 JOIN "Addresses" AS a ON (a.path @> s.path) 

It is used to join each element with all of its ancestors, which are then aggregated with array_agg() and array_to_string(), so that the output of these functions can be used in the full-text search.
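To make the effect of this JOIN concrete, here is a small pure-Python sketch that reproduces it on the sample rows above. It assumes paths are dot-separated labels and approximates ltree's @> with a prefix check; the names is_ancestor_or_self and build_full_names are invented for the illustration.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str, str]  # (id, name, path)


def is_ancestor_or_self(ancestor_path: str, path: str) -> bool:
    """Rough stand-in for ltree's `ancestor_path @> path`."""
    return path == ancestor_path or path.startswith(ancestor_path + ".")


def build_full_names(rows: List[Row]) -> Dict[int, str]:
    full_names = {}
    for row_id, _name, path in rows:
        # Self-join: every row whose path is an ancestor of (or equal to) this path.
        chain = [
            (a_path, a_name)
            for _a_id, a_name, a_path in rows
            if is_ancestor_or_self(a_path, path)
        ]
        # Deepest label first, mirroring ORDER BY a.path DESC for one ancestor chain.
        chain.sort(key=lambda item: item[0].count("."), reverse=True)
        full_names[row_id] = " ".join(a_name for _a_path, a_name in chain)
    return full_names


rows = [(1, "USA", "1"), (2, "California", "1.2"), (3, "Los Angeles", "1.2.3")]
for row_id, full_name in build_full_names(rows).items():
    print(row_id, full_name)
```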

If anyone has ideas on how to implement this with the Django ORM, please let me know.

+6
2 answers

Summary

You need an unmanaged model backed by a database view.

Unmanaged model.

An unmanaged model is created by setting the model's managed Meta option to False. From the Django documentation:

If False, no operations to create or delete database tables will be performed for this model. This is useful if the model represents an existing table or a database view that was created by some other means. This is the only difference when managed=False. All other aspects of model handling are exactly the same as normal. This includes [...]

Emphasis is mine.

Thus, if you create an unmanaged model, it can be backed by a view in the database, and you again have access to .filter() and the rest of the QuerySet API (grouping is done through .values() and .annotate()).

View.

The view is your query.

    CREATE OR REPLACE VIEW full_address_tree AS
    SELECT s.id, s.path,
           array_to_string(array_agg(a.name ORDER BY a.path DESC), ' ') AS full_address
    FROM "Addresses" AS s
    INNER JOIN "Addresses" AS a ON (a.path @> s.path)
    GROUP BY s.id, s.path, s.name

Model creation

    from django.db import models

    class FullAddressTree(models.Model):
        # Declare the columns returned by the view; its id column is
        # picked up by Django's implicit primary key field.
        path = models.CharField(max_length=255)
        full_address = models.TextField()

        class Meta:
            # this is the most important part
            managed = False
            db_table = 'full_address_tree'  # the name of the view

You now have a model that can be used for full-text search without raw queries, so all the features of the Django ORM are at your disposal.

Migrations.

If you run migrations, you will find that ./manage.py makemigrations produces a dummy migration, and ./manage.py sqlmigrate will show that no SQL queries are executed for it.

To fix this and have the view created automatically, add a RunSQL call to the operations list in this migration.

 migrations.RunSQL(''' COPY PASTE SQL QUERY FROM ABOVE ''') 

Warnings

The unmanaged model you created is read-only, because the view behind it cannot be written to. Any attempt to create, update, or delete rows through it will fail. If you need this functionality, you will need an INSTEAD OF trigger on the view.
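The read-only behaviour is easy to see in isolation. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made purely so the example runs anywhere): ltree's @> is approximated with LIKE, and array_agg with group_concat, whose ordering is not guaranteed. Reading from the view works like reading from a table, while writing to it raises an error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addresses (id INTEGER PRIMARY KEY, name TEXT, path TEXT);
    INSERT INTO addresses VALUES
        (1, 'USA', '1'), (2, 'California', '1.2'), (3, 'Los Angeles', '1.2.3');

    -- SQLite approximation of the PostgreSQL view shown earlier.
    CREATE VIEW full_address_tree AS
    SELECT s.id, s.path, group_concat(a.name, ' ') AS full_address
    FROM addresses AS s
    JOIN addresses AS a ON s.path = a.path OR s.path LIKE a.path || '.%'
    GROUP BY s.id, s.path;
""")

# Reading: the view can be filtered like any table.
matching_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM full_address_tree WHERE full_address LIKE ? ORDER BY id",
        ("%California%",),
    )
]
print(matching_ids)

# Writing: the database rejects the statement because the target is a view.
try:
    conn.execute(
        "INSERT INTO full_address_tree (id, path, full_address) "
        "VALUES (4, '1.4', 'Texas USA')"
    )
    write_error = None
except sqlite3.OperationalError as exc:
    write_error = str(exc)
print(write_error)
```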

+2

So, a big +1 to @shan-wang for the suggestion of django-mptt. It is relevant to your problem because all tree operations in MPTT work as regular QuerySets and can therefore be combined with annotate and aggregate. The only thing I am not sure about is whether your workload is insert-heavy. If you plan to load a lot of data into the table in one go, there is nothing to worry about. If you are going to change the tree often, this can be a problem. For a good description of what MPTT is and how it works, see http://www.sitepoint.com/hierarchical-data-database-2/

Anyway, your initial problem of getting all the ancestors of a node then becomes la_node.get_ancestors(). This addresses the JOIN limitation you mentioned, which should allow you to reformulate the rest of the query.
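For readers who have not met MPTT before, the idea behind get_ancestors() can be sketched in plain Python. This illustrates the nested-set numbering in general, not django-mptt's actual implementation; the function names and the sample tree (including the extra Texas node) are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def assign_bounds(children: Dict[str, List[str]], root: str) -> Dict[str, Tuple[int, int]]:
    """Number every node with (left, right) values in one depth-first walk."""
    bounds: Dict[str, Tuple[int, int]] = {}
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        counter += 1
        left = counter
        for child in children.get(node, []):
            visit(child)
        counter += 1
        bounds[node] = (left, counter)

    visit(root)
    return bounds


def ancestors_of(bounds: Dict[str, Tuple[int, int]], node: str) -> List[str]:
    """Ancestors are exactly the nodes whose interval encloses the node's interval."""
    left, right = bounds[node]
    found = [name for name, (lft, rght) in bounds.items() if lft < left and rght > right]
    return sorted(found, key=lambda name: bounds[name][0])  # root first


tree = {"USA": ["California", "Texas"], "California": ["Los Angeles"]}
bounds = assign_bounds(tree, "USA")
print(ancestors_of(bounds, "Los Angeles"))
```

Looking up ancestors is a simple range comparison with no self-join, which is why reads are cheap. Inserting a node, on the other hand, shifts the numbers of every node to its right, which is where the concern about frequently changing trees comes from.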

+1