Making Ghost and Algolia work better together

(Post in progress! Writing in public!)

Searching for a Ghost search integration that doesn't overwrite all your hard work setting Algolia up?

Prologue

The built-in Ghost search ("sodo-search") is certainly convenient, but it only searches post titles, tags, excerpts, and author names. It doesn't search the full text of a post, and it doesn't search pages at all. It's not readily customizable, because it's a separate package loaded in {{ghost_head}}. You could fork the package, make your changes, and load your custom version. Assuming you've got the coding skills, that's fairly easy on a self-hosted site; on Ghost Pro, it requires replacing all of {{ghost_head}}.

Ghost's built-in search actually runs in the browser. When the page loads, the browser runs some JavaScript that grabs the post content using the Content API and generates an index. To keep the amount of content to be indexed small, the built-in search doesn't index the full post content. For sites with a huge number of posts, even that's too slow.
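To make the "index built in the browser" idea concrete, here's a rough sketch. It is not sodo-search's actual code: the function names are made up, the site URL and key are placeholders, the substring "index" is a toy stand-in for a real search library, and tags, authors, and pagination are left out for brevity. The only part taken from Ghost itself is the Content API posts endpoint and its key, limit, page, and fields parameters.

```javascript
// Build a Content API URL that asks for a trimmed-down list of posts.
// siteUrl and contentKey are placeholders for your own site and key.
function buildPostsUrl(siteUrl, contentKey, page = 1) {
    const params = new URLSearchParams({
        key: contentKey,
        limit: '100',
        page: String(page),
        // No `html` field: leaving out the body is what keeps the index small.
        fields: 'id,title,excerpt,url'
    });
    return `${siteUrl.replace(/\/$/, '')}/ghost/api/content/posts/?${params}`;
}

// Toy index: one lowercase blob of searchable text per post.
function buildIndex(posts) {
    return posts.map(post => ({
        url: post.url,
        title: post.title,
        text: [post.title, post.excerpt].filter(Boolean).join(' ').toLowerCase()
    }));
}

// Toy search: plain substring match against the blob.
function search(index, query) {
    const q = query.trim().toLowerCase();
    if (!q) {
        return [];
    }
    return index.filter(entry => entry.text.includes(q));
}
```

In a browser you'd fetch(buildPostsUrl(...)), read the posts array from the JSON response, and pass it to buildIndex. Every visitor who opens search pays for that download, which is why the approach stops scaling at some point.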

If you don't have 10,000 posts and you are carefully optimizing each post's title, excerpt, and tags to reflect all reasonable keywords a user might use, then the built-in search may work fine for you. (In a bad case of "the cobbler's children go barefoot", that's currently the case for my own blog.) If you want to write lengthy posts that meander from topic to topic, then you may want to integrate a search provider. Happily, Ghost has an Algolia integration.

How it works:

Setting up Algolia for searching Ghost content has three major parts:

1) Set up the front-end - the page / popup that actually does the Algolia search. Minimally, that's a search box, some code to ask the Algolia server for matches to the search, and some code to put those matches onto the page. InstantSearch.js is a nominally quick way to do that, although I sometimes find that I'm struggling with conflicts between the CSS it provides and the CSS provided by some themes.

2) Import current post content using the @tryghost/algolia package. (Can be skipped if you're starting from zero posts.)

3) Set up Netlify using the algolia-netlify package. You're going to set up a couple of cloud functions whose job is to detect (via webhook) whenever a post gets created/edited/deleted, and to update Algolia accordingly.
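The decision those cloud functions make in step 3 can be sketched like this. This is not the algolia-netlify code. planIndexUpdate is a hypothetical helper, and it assumes the webhook body has the shape { post: { current, previous } }, with the kind of event determined by which function URL the webhook was pointed at rather than by anything in the body.

```javascript
// Hypothetical helper: decide what one webhook call should do to the index.
// `kind` is 'published' or 'unpublished', chosen by the receiving function.
function planIndexUpdate(kind, payload) {
    const post = (payload && payload.post) || {};
    const current = post.current || {};
    const previous = post.previous || {};

    // Assumption: a deleted post arrives with an empty `current`,
    // so fall back to `previous` to find the slug.
    const slug = current.slug || previous.slug;
    if (!slug) {
        return { action: 'ignore', reason: 'no slug in payload' };
    }

    if (kind === 'unpublished') {
        // Remove every fragment stored for this slug.
        return { action: 'delete', slug };
    }
    if (kind === 'published' && current.status === 'published') {
        // Re-fragment the post and save the fragments.
        return { action: 'upsert', slug, post: current };
    }
    return { action: 'ignore', reason: 'post is not published' };
}
```

Keeping the decision separate from the Algolia calls makes it easy to test with hand-written payloads before pointing real webhooks at it.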

There are some quirks in these packages. Namely, they hard-code how the index is going to be configured, and the problem is buried in dependencies of the main package.

In algolia-indexer/lib/IndexFactory.js:

// Any defined settings will override those in the algolia UI
// TODO: make this a custom setting
const REQUIRED_SETTINGS = {
    // We chunk our pages into small algolia entries, and mark them as distinct by slug
    // This ensures we get one result per page, whichever is ranked highest
    distinct: true,
    attributeForDistinct: `slug`,
    // This ensures that chunks higher up on a page rank higher
    customRanking: [`desc(customRanking.heading)`, `asc(customRanking.position)`],
    // Defines the order algolia ranks various attributes in
    searchableAttributes: [`title`, `headings`, `html`, `url`, `tags.name`, `tags`, `authors.name`, `authors`],
    // Add slug to attributes we can filter by in order to find fragments to remove/delete
    attributesForFaceting: [`filterOnly(slug)`]
};

These hard-coded settings will overwrite any configuration work you've done in the Algolia dashboard. So if you set up your index and then load your posts, you're going to have to readjust your index... again.

In addition, these hard-coded settings completely destroy any custom settings for facets. If you want faceted search, you need to update this code, or else remove the code from cli.js that causes the index to get its settings redone.
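One way to "update this code" is to merge instead of overwrite. The sketch below is my own suggestion, not something the packages provide. mergeIndexSettings is a hypothetical function, and the split between settings the fragment approach depends on (distinct, attributeForDistinct, a filterable slug) and settings that are merely defaults (ranking, searchable attributes) is my reading of the comments in REQUIRED_SETTINGS.

```javascript
// Settings the chunk-per-slug approach depends on. These always win.
const STRUCTURAL_SETTINGS = {
    distinct: true,
    attributeForDistinct: 'slug'
};

// Settings that are only a starting point. Dashboard values win over these.
const DEFAULT_SETTINGS = {
    customRanking: ['desc(customRanking.heading)', 'asc(customRanking.position)'],
    searchableAttributes: ['title', 'headings', 'html', 'url', 'tags.name', 'tags', 'authors.name', 'authors']
};

// Merge the index's existing settings with the ones above, keeping any
// facets that were configured by hand.
function mergeIndexSettings(existing = {}) {
    const facets = existing.attributesForFaceting || [];
    // slug must stay filterable so old fragments can be found and deleted.
    const slugIsFacet = facets.some(f => /^(?:filterOnly\(slug\)|searchable\(slug\)|slug)$/.test(f));
    return {
        ...DEFAULT_SETTINGS,
        ...existing,
        ...STRUCTURAL_SETTINGS,
        attributesForFaceting: slugIsFacet ? facets : [...facets, 'filterOnly(slug)']
    };
}
```

The idea would be to read the current settings from the index first, run them through mergeIndexSettings, and write the result back, rather than sending REQUIRED_SETTINGS unconditionally. How you read and write settings depends on which version of the Algolia client you have installed, so that part is left out.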

So you can fix it on your local machine for importing posts, but when you deploy to Netlify, then what? Netlify installs fresh copies of the packages listed in package.json, so edits made inside node_modules on your machine never get deployed. To deploy an edited file to Netlify, I needed to (1) copy the file out of node_modules and make my edits to the copy, and (2) edit the post-published.js function in the algolia-netlify package to import the modified copy instead.

If you don't want to load authors into Algolia, or you want to be able to search by post date, or anything else that requires differences in loading content, you'll also need to edit algolia-fragmenter/lib/transformer.js to load the desired content. And as above, you'll need to move transformer.js out of node_modules and import it directly.
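As an illustration of the kind of change I mean, here's a sketch of a transform that drops authors and adds a numeric publish date. It is not the contents of transformer.js. toSearchRecord and the published_at_timestamp field name are made up for this example, and the input is assumed to look like a post object from the Ghost API (slug, url, title, html, tags, published_at).

```javascript
// Hypothetical transform: Ghost post in, search record out.
function toSearchRecord(post) {
    const record = {
        slug: post.slug,
        url: post.url,
        title: post.title,
        html: post.html,
        // Keep only the tag fields worth searching or faceting on.
        tags: (post.tags || []).map(tag => ({ name: tag.name, slug: tag.slug }))
        // No authors: they are deliberately left out of the record.
    };

    // Store the date as a Unix timestamp in seconds, because numeric
    // range filters need a number, not an ISO date string.
    const publishedMs = post.published_at ? Date.parse(post.published_at) : NaN;
    if (!Number.isNaN(publishedMs)) {
        record.published_at_timestamp = Math.floor(publishedMs / 1000);
    }
    return record;
}
```

If you drop authors from the records, remember that authors.name and authors are still listed in the hard-coded searchableAttributes, so you'll probably want to tidy those up as well.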

Got content that's paywalled?

This article discusses modifications to use the Admin API instead of the Content API for loading Ghost content into Algolia.