The "FileIndex" mentioned in the documentation is a hypothetical subclass of haystack.indexes.SearchIndex. Here is an example:
    from haystack import indexes
    from myapp.models import MyFile


    class FileIndex(indexes.SearchIndex, indexes.Indexable):
        text = indexes.CharField(document=True, use_template=True)
        title = indexes.CharField(model_attr='title')
        owner = indexes.CharField(model_attr='owner__name')

        def get_model(self):
            return MyFile

        def index_queryset(self, using=None):
            return self.get_model().objects.all()

        def prepare(self, obj):
            data = super(FileIndex, self).prepare(obj)
            # Extract the file content here (extracted_data) and add it to
            # data before returning it.
            return data
Here, extracted_data stands for the output of whatever process you come up with to extract the PDF / DOCX content. You then update your search template to include this data.
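As one possible illustration of such an extraction process, here is a minimal sketch that pulls plain text out of a DOCX file using only the standard library (a .docx file is a zip archive whose main text lives in word/document.xml). The helper names extract_docx_text and add_extracted_text are hypothetical, not part of Haystack, and the sketch ignores headers, footnotes and PDF input entirely.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx archive.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(file_obj):
    """Return the text of a .docx file, one line per non-empty paragraph.

    Hypothetical helper: file_obj is a path or a seekable binary file object.
    """
    with zipfile.ZipFile(file_obj) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each paragraph is split into runs; w:t elements carry the text.
        runs = [node.text for node in para.iter(W_NS + "t") if node.text]
        if runs:
            paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


def add_extracted_text(data, extracted_data, field="text"):
    """Append extracted content to the document field of a prepared-data dict.

    Hypothetical helper: assumes the document field is stored under `field`.
    """
    existing = data.get(field) or ""
    data[field] = (existing + "\n" + extracted_data).strip()
    return data
```

Inside prepare(), this could be used roughly as extracted_data = extract_docx_text(obj.file) followed by add_extracted_text(data, extracted_data), assuming MyFile has a field named file (an assumption, it is not shown above) that yields a seekable binary file object. If you prefer to keep using the template, pass extracted_data into the template context instead of appending it directly.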