Sorting Strings With Django Haystack And Solr
I've been working on the grants database of the Mellon Foundation's new website, a large database with information about all of the grants distributed by the Mellon Foundation since 1969.
The site is built in Django, and I'm using Django Haystack with Solr for easy indexing and searching of our grant records. Solr does a few things for us that a lot of other search servers don't: it is flexible, it returns results quickly and with fairly sophisticated logic, and it has a lower barrier to entry than other search applications. Also, with the grant data, we needed to be able to search by latitude and longitude points and quickly sum large dollar amounts, and Solr gives us both of those things.
The biggest problem I have faced so far was actually something simple that I didn't expect. On our advanced search view, I needed to sort by program title, grantee name, and location, all of which we already store in our Solr index, so I figured it would be simple enough. However, when I added the sorting headers and methods to our templates and views, sorting was happening, but with no discernible pattern. I looked at the Solr schema for our application and found that the schema Haystack generates automatically defaults to text_en for every CharField. text_en is an analyzed (tokenized) field type, which Solr doesn't sort as expected. To fix this, I updated the Solr schema to use the string type for the fields I want to sort by. A type change like this only applies to documents indexed afterward, so the index has to be rebuilt before the new ordering shows up.
Updated schema.xml
<field name="grantee_name" type="string" indexed="true" stored="true" multiValued="false" sortMissingLast="true" />
<field name="location_display" type="string" indexed="true" stored="true" multiValued="false" sortMissingLast="true" />
<field name="program_title" type="string" indexed="true" stored="true" multiValued="false" sortMissingLast="true" />
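To see why the field type matters for sorting, here is a toy illustration in plain Python. It is not how Solr is implemented, and the grantee names are made up. It only assumes that an analyzed field is lowercased and split into tokens, and that the sort ends up keyed on a single token instead of the whole value, whereas a string field keeps the value as one term.

```python
# Toy illustration only: this does NOT reproduce Solr's internals.
# The grantee names below are invented sample data.
names = ["Alpha Zoological Trust", "Beta Arts Fund", "Gamma Book Club"]


def whole_value_key(name):
    # A string field keeps the whole value as a single, untokenized term.
    return name


def single_token_key(name):
    # Assumption for illustration: the analyzed value is lowercased and split
    # into tokens, and the sort is keyed on just one of them
    # (here, the alphabetically last token).
    return max(name.lower().split())


by_string = sorted(names, key=whole_value_key)
by_token = sorted(names, key=single_token_key)

print(by_string)  # ordered by the full name
print(by_token)   # ordered by one token per name, which looks arbitrary
```

Keyed on the whole value, the names come back in the order a reader expects. Keyed on one token per name, the result is still technically sorted, just not by anything visible in the table, which matches the "no discernible pattern" symptom.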
search_indexes.py
from haystack import indexes

from .models import Grant


class GrantIndex(indexes.SearchIndex, indexes.Indexable):
    """
    This defines what fields should get indexed by the search engine.
    """
    grantee_name = indexes.CharField(indexed=True)
    program_title = indexes.CharField(indexed=True)
    location_display = indexes.CharField(indexed=True)

    def get_model(self):
        return Grant

    def prepare_program_title(self, obj):
        return obj.program.title
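With the fields stored as strings, the view side of the sorting can stay small. The following is a minimal sketch, not the code from the site: the SORTABLE_FIELDS mapping, the build_order_by helper, and the URL sort keys are all hypothetical names chosen for illustration. It turns a sort key from the query string into an argument for Haystack's SearchQuerySet.order_by(), which accepts a leading "-" for descending order.

```python
# Hypothetical whitelist: sort keys from the URL mapped to the
# string-typed index fields defined in the schema.
SORTABLE_FIELDS = {
    "grantee": "grantee_name",
    "program": "program_title",
    "location": "location_display",
}


def build_order_by(sort_key, descending=False, default="grantee_name"):
    """Translate a sort key from the query string into an order_by argument.

    Unknown or missing keys fall back to the default field, so arbitrary
    user input never reaches the search backend as a field name.
    """
    field = SORTABLE_FIELDS.get(sort_key, default)
    return f"-{field}" if descending else field


# In a view this could be used roughly as follows (assumes a configured
# Haystack project, so it is left as a comment here):
#   results = SearchQuerySet().models(Grant).order_by(
#       build_order_by(request.GET.get("sort"))
#   )
```

Whitelisting the sortable fields also keeps the sort limited to fields that were actually switched to the string type, so a header click can't accidentally sort on one that is still text_en.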