First, I needed to figure out how to get data from Drupal into Lunr. Existing search modules in Drupal like Search API provide a lot of customization for indexing, but also rely on there being a Drupal backend at runtime. It felt awkward to throw users into the Search API ecosystem with the caveat that most of the functionality they get using core or Solr search would not be available. I eventually landed on using Views, with a custom display, row, and style plugin that output JSON in a very similar way to the “REST export” plugin. Views has so much built in for data transformation that it seemed like a natural fit for this use case, and it was already accessible to site builders.
The Views plugins I wrote output pages of arrays of objects, where every object property is either a field to index in Lunr or a field to be displayed to end users. A typical setup may index the body fields, then have an html field that contains the rendered HTML of an entity view mode. A ref field is automatically added in the format page:index, so that when Lunr returns search results there’s a unique key to use to retrieve the display field.
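To make the ref format concrete, here is a minimal TypeScript sketch of how one page of view rows might be tagged. The row shape and the helper names addRefs and parseRef are hypothetical and are not the module's actual code; only the page:index format comes from the description above.

```typescript
// Hypothetical shape of one row in the Views JSON output:
// every property is either indexed by Lunr or shown to end users.
type ViewRow = Record<string, string>;

// Tag every row on a page with a "page:index" ref (assumed 0-based here).
function addRefs(page: number, rows: ViewRow[]): (ViewRow & { ref: string })[] {
  return rows.map((row, index) => ({ ...row, ref: `${page}:${index}` }));
}

// Split a ref back into the page number and the row position on that page.
function parseRef(ref: string): { page: number; index: number } {
  const [page, index] = ref.split(":").map(Number);
  return { page, index };
}
```

Because the ref encodes both the page and the position, a search result can be traced back to its display data without any server-side lookup.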
With the data source figured out, it’s time to index our content. In a simple Lunr setup, the client will load all the source documents, create an index, and then perform a search against that index every time a search session is started. For a simple implementation with few documents this could be acceptable, but since this is all happening on the client it doesn’t scale far. Luckily, Lunr allows indexes to be pre-built and exported to JSON, then loaded in the client without needing the client to load any source documents.
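The difference between the two approaches can be illustrated without Lunr at all. The sketch below uses a deliberately tiny inverted index as a stand-in; it is not Lunr's index format or API, and every name in it is hypothetical. The index is built once with access to the source documents, serialized to JSON, and then searched using only that JSON.

```typescript
// Toy stand-in for a search index (NOT Lunr): term -> refs of matching docs.
interface SourceDoc {
  ref: string;
  body: string;
}

// Build step: runs once, ahead of time, with access to every source document.
function buildSerializedIndex(docs: SourceDoc[]): string {
  const index = new Map<string, string[]>();
  docs.forEach((doc) => {
    const terms = doc.body.toLowerCase().split(/\W+/).filter(Boolean);
    terms.forEach((term) => {
      const refs = index.get(term) ?? [];
      // Record each document at most once per term.
      if (refs.indexOf(doc.ref) === -1) refs.push(doc.ref);
      index.set(term, refs);
    });
  });
  // Export as JSON so it can be stored as a static file.
  return JSON.stringify(Array.from(index.entries()));
}

// Client step: only the serialized index is needed, not the documents.
function searchSerialized(serialized: string, term: string): string[] {
  const entries = JSON.parse(serialized) as [string, string[]][];
  const index = new Map<string, string[]>(entries);
  return index.get(term.toLowerCase()) ?? [];
}
```

The client never sees the document bodies, which is the property that makes a pre-built index attractive as the number of documents grows.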
The settings required to configure the indexing behavior needed to be stored somewhere, so I created a custom entity type that is a combination of what would normally be separate server, index, and view entities. This custom entity, “Lunr search”, has a form that looks like this:
Now that the pieces were in place, it was time to write the real indexing behavior. Indexing currently works like this:
- Indexing is initiated by an administrator in the UI.
- A Lunr Builder instance is created with the settings from the Lunr search entity.
- Every view row is indexed as a Lunr document.
- The configured “Display field” is stored in a separate array of objects, which stores the ref field and the display field.
- When a page is finished, the display document is uploaded to the server.
- Additional requests are made to page through the view.
- When there are no other pages to index, the builder exports JSON that is uploaded to the server.
- If there are any other installed languages, the entire process is repeated for each language. This allows for language specific indexes to be loaded by the client.
If it sounds complicated, it’s because it really is! But having Drupal do this work beforehand means clients only have to load the compiled index to start searching.
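The paging part of the steps above can be sketched roughly as follows. This is a simplified, synchronous illustration rather than the module's code: getPage stands in for the HTTP request for one page of the view, indexView and the result shape are hypothetical, and removing the display field from the indexed document is an assumption.

```typescript
// A view row as returned by the (hypothetical) JSON endpoint.
type Row = Record<string, string>;

interface DisplayEntry {
  ref: string;
  display: string;
}

interface IndexingResult {
  // Documents that would be handed to the index builder.
  indexDocs: Row[];
  // One display document per page, holding only the ref and display field.
  displayDocs: DisplayEntry[][];
}

// getPage returns an empty array once there are no more pages.
function indexView(getPage: (page: number) => Row[], displayField: string): IndexingResult {
  const result: IndexingResult = { indexDocs: [], displayDocs: [] };
  for (let page = 0; ; page++) {
    const rows = getPage(page);
    if (rows.length === 0) break;
    const displayDoc: DisplayEntry[] = [];
    rows.forEach((row, index) => {
      const ref = `${page}:${index}`;
      const indexed: Row = { ...row, ref };
      // Assumption: the display field is kept out of the indexed document.
      delete indexed[displayField];
      result.indexDocs.push(indexed);
      displayDoc.push({ ref, display: row[displayField] ?? "" });
    });
    // In the real flow, the finished page's display document is uploaded here.
    result.displayDocs.push(displayDoc);
  }
  return result;
}
```

In the real flow the loop would run once per installed language, and the exported index would be uploaded after the last page.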
But what about big indexes? A few people had asked me about performance and scale, and all my tests in the 100-1000 document range had worked great, but I wanted to see what practical limits Lunr had. I spun up a site and created 10,000 nodes, which were able to index fine, but after the client downloaded the huge 35 MB index the main thread would noticeably lock up for a few seconds. Once loaded, searches were surprisingly fast, but the main performance issue was that first interaction.
One last piece of the usability puzzle was facets, which are popular in many Drupal search implementations. Lunr does provide the ability to search by field, so I could have left users to figure it out on their own, but I ended up proving out the integration myself and fixing many bugs in the process. This was also when I found the first, and so far only, part of Lunr that was lacking - the ability to use complex conditions and condition groups. When you search with Lunr, you might enter a query like “black cat”, which would translate to something like “contains black OR cat” when parsed. To do a field search, you write something like “black cat type:shorthair”, which will also end up in an OR. This doesn’t quite work because users expect facets to reduce search results, and if I checked “shorthair” in a facet I would expect only shorthair results. You can use a “+” symbol to make a term required, but “black cat +type:shorthair” could end up with shorthair animals that aren’t black or cats. This was a big deal to me and I wanted to fix it, so I ended up making two queries, one for the search terms (“black cat”) and another for all the fields (such as “type:shorthair”), then finding the intersection of both search results. Enabling users to add facets with a bit of custom code was worthwhile, even with the last-minute hair pulling.
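Here is a rough TypeScript sketch of that two-query idea, with the search engine itself left out. The helper names splitQuery and intersectResults are hypothetical, the result shape only mirrors the ref and score that Lunr results carry, and the module's real implementation may differ.

```typescript
// Minimal result shape: a document ref plus a relevance score.
interface Result {
  ref: string;
  score: number;
}

// Separate free-text terms from field terms such as "type:shorthair".
function splitQuery(query: string): { text: string; fields: string } {
  const tokens = query.trim().split(/\s+/).filter(Boolean);
  const fields = tokens.filter((token) => token.indexOf(":") > 0);
  const text = tokens.filter((token) => token.indexOf(":") <= 0);
  return { text: text.join(" "), fields: fields.join(" ") };
}

// Keep only the text results that also matched the field query,
// preserving the ranking of the text results.
function intersectResults(textResults: Result[], fieldResults: Result[]): Result[] {
  const allowed = new Set(fieldResults.map((result) => result.ref));
  return textResults.filter((result) => allowed.has(result.ref));
}
```

Each half of the split query would be run as its own search, and only documents present in both result sets are shown, so checking a facet can only narrow the results.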
With all the deep details out of the way, here’s how searching currently works:
- The user loads the Lunr search page.
- A web worker is created.
- The web worker loads the Lunr library, and makes an AJAX request for the index file and loads it.
- The user enters a search query, or a query string is provided.
- The window history is updated as well as the URL.
- A message is sent to the web worker.
- The worker uses Lunr to perform the query.
- If any field searches (aka facets) are present, another query is performed and the results are merged.
- The worker sends a message back to the client with the search results, which are just an array of reference (ref) keys.
- The client makes additional AJAX calls based on the results to get the user-facing HTML from the display document JSON. Typically this is less than one call per result as many results may be in the same document.
- The results are displayed to the client, along with pagers if needed.
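The reason there is typically less than one request per result can be shown with a small sketch. It assumes, based on the page:index ref format, that each display document corresponds to one page of the view; groupRefsByPage is a hypothetical helper, not part of the module.

```typescript
// Group result refs ("page:index") by page so that each display document
// is requested once, no matter how many results it contains.
function groupRefsByPage(refs: string[]): Map<number, number[]> {
  const byPage = new Map<number, number[]>();
  refs.forEach((ref) => {
    const [page, index] = ref.split(":").map(Number);
    const indexes = byPage.get(page) ?? [];
    indexes.push(index);
    byPage.set(page, indexes);
  });
  return byPage;
}
```

With refs such as 0:3, 2:1, and 0:7, only two display documents need to be fetched, one for page 0 and one for page 2.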
Phew! I feel like writing all this out makes the Drupal integration seem really complex, but I think that’s just because I’ve tried to provide something that’s actually viable as a replacement for core search. Running a Drupal static site means losing the backend, and it’s easy to take out of the box features like search for granted. I’m just glad that this project worked out and is usable - now there’s an actual answer for “How do you do static search?” that doesn’t involve me hand-waving!
Here’s a demo of what the current search experience is like:
In conclusion, the Drupal Lunr integration is ready for testing and use - the project is called “Lunr search” and can be downloaded here. There are no composer dependencies and a default search is provided out of the box, so if you enable it and kick off indexing at /admin/config/lunr_search/default/index, you should be able to do a test search at /search. I know I’ve talked a bit about static sites in this post, but I think the Lunr module is useful for all sorts of sites. Having your search be static means fewer uncacheable hits to your backend, and potentially a faster experience for users, so please try it out if you’re interested!