Django Haystack - Filter on a substring of a field using SearchQuerySet()

I have a Django project that uses Solr for indexing.

I'm trying to search for a substring using the Haystack SearchQuerySet class.

For example, when a user searches for the term "ear", the search should return a record that has a field with the value "Search". As you can see, "ear" is a substring of "Search" (obviously :)).

In other words, in an ideal Django world, I would like something like:

SearchQuerySet().all().filter(some_field__contains_substring='ear') 

The Haystack documentation for SearchQuerySet ( https://django-haystack.readthedocs.org/en/latest/searchqueryset_api.html#field-lookups ) says that only the following field lookup types are supported:

  • contains
  • exact
  • gt, gte, lt, lte
  • in
  • startswith
  • range

I tried using __contains, but it behaves exactly like __exact: it matches the exact word (whole word) in a sentence, not a substring of a word.

I'm confused because this functionality seems pretty basic, and I'm not sure whether I missed something or whether there is another way to approach this problem (using a regex or something like that).

Thanks.

2 answers

This can be done using the EdgeNgramField field:

 some_field = indexes.EdgeNgramField() # also prepare value for this field or use model_attr 

Then, for partial matching:

 SearchQuerySet().all().filter(some_field='ear') 
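One caveat worth checking before relying on this: edge n-grams are anchored at the start of each word, so they cover prefixes such as "sea" but not an inner substring such as "ear". Haystack also ships indexes.NgramField, which indexes every window of characters and may be the better fit for the example in the question. The sketch below is plain Python, not Haystack or Solr code; the function names and the gram sizes are assumptions chosen only to illustrate the difference.

```python
# Illustrative only: mimics what n-gram and edge n-gram filters produce.
# The min/max sizes are assumed defaults, not read from any backend config.

def edge_ngrams(word, min_size=2, max_size=15):
    """Return the prefixes of `word` (grams anchored at the word start)."""
    word = word.lower()
    longest = min(max_size, len(word))
    return {word[:size] for size in range(min_size, longest + 1)}


def ngrams(word, min_size=3, max_size=15):
    """Return every contiguous window of `word` within the size bounds."""
    word = word.lower()
    longest = min(max_size, len(word))
    grams = set()
    for size in range(min_size, longest + 1):
        for start in range(len(word) - size + 1):
            grams.add(word[start:start + size])
    return grams


print("sea" in edge_ngrams("Search"))  # True: "sea" is a prefix
print("ear" in edge_ngrams("Search"))  # False: "ear" is not a prefix
print("ear" in ngrams("Search"))       # True: "ear" is an inner window
```

If inner substrings matter, declaring the field as indexes.NgramField() instead and rebuilding the index is probably the thing to try first.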

This is a bug in Haystack.

As you said, __contains is implemented just like __exact, so this functionality does not exist out of the box in Haystack.

The fix is awaiting merge here: https://github.com/django-haystack/django-haystack/issues/1041

While waiting for the fixed version, you can work around it as follows:

 from haystack.inputs import BaseInput, Clean


 class CustomContain(BaseInput):
     """
     An input type for making wildcard matches.
     """
     input_type_name = 'custom_contain'

     def prepare(self, query_obj):
         query_string = super(CustomContain, self).prepare(query_obj)
         query_string = query_obj.clean(query_string)

         exact_bits = [Clean(bit).prepare(query_obj) for bit in query_string.split(' ') if bit]
         query_string = u' '.join(exact_bits)

         return u'*{}*'.format(query_string)


 # Usage:
 SearchQuerySet().filter(content=CustomContain('searchcontentgoeshere'))
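To see what string the input type above ends up handing to the backend, here is a minimal standalone sketch of the same split, join and wrap steps. The name build_wildcard_query is hypothetical, and the sketch deliberately skips the cleaning and escaping that query_obj.clean and Clean perform, so it only illustrates the shape of the result.

```python
# Standalone illustration of the wildcard wrapping; not Haystack code.
# build_wildcard_query is a made-up name, and no escaping is done here.

def build_wildcard_query(user_input):
    """Collapse extra spaces, then wrap the whole string in '*' wildcards."""
    bits = [bit for bit in user_input.split(' ') if bit]
    return '*{}*'.format(' '.join(bits))


print(build_wildcard_query('ear'))
print(build_wildcard_query('  foo   bar '))
```

Note that the asterisks wrap the whole string, not each word, and that queries with a leading wildcard can be slow on large Solr indexes, so this is best treated as a stopgap.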