Steps to Configure Typesense for Inverted Indexing


Typesense has inverted indexing built into the core of its search engine, so the index is built automatically from your collection schema as documents are added. You can still tune how fields are indexed and searched to suit your specific use case.

Let’s walk through a realistic example: configuring Typesense for an e-commerce application that handles a large inventory of products. The goal is to ensure that users can quickly and accurately search for products by name, description, and category tags.

Scenario: An E-commerce Store

Imagine you’re building a search feature for an e-commerce store called “ShopFast.” The store has thousands of products, each with attributes like name, description, category tags, and price. You want users to be able to search for products by typing in keywords related to the product name, its description, or specific tags like “electronics,” “home appliances,” etc.

Step 1: Defining the Schema

You first need to define a schema in Typesense to determine how your product data will be indexed. The schema specifies which fields are searchable and how Typesense should handle the text in these fields.

json

{
  "name": "products",
  "fields": [
    {
      "name": "name",
      "type": "string",
      "facet": false,
      "index": true
    },
    {
      "name": "description",
      "type": "string",
      "facet": false,
      "index": true
    },
    {
      "name": "tags",
      "type": "string[]",
      "facet": true,
      "index": true
    },
    {
      "name": "price",
      "type": "float",
      "facet": true,
      "index": true
    },
    {
      "name": "created_at",
      "type": "int64",
      "facet": false,
      "index": false,
      "optional": true
    }
  ],
  "default_sorting_field": "price"
}

Notes on the schema:

  • "name": "products" names the collection that holds the product documents, and "fields" lists the fields that make up each document.
  • name and description have "index": true, so both are included in the inverted index and can be searched. Since true is the default for index, the setting is spelled out here only for clarity.
  • tags has the type string[], an array of strings such as “electronics” or “home appliances”. It is indexed, so it is searchable, and "facet": true lets Typesense return result counts per tag for faceted navigation. Filtering itself works on any indexed field, faceted or not.
  • price is indexed and faceted. As a numeric field it cannot be listed in query_by, but it has to be indexed to be filtered, faceted, or sorted on, and to serve as the default sorting field.
  • created_at has "index": false, so it is stored with the document and returned in results, but it cannot be searched, filtered, faceted, or sorted on. It is also marked "optional": true, which Typesense expects for non-indexed fields.
  • "default_sorting_field": "price" uses price to order results of equal text relevance when a query gives no sort_by. Any query can override this with its own sort_by.
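As a rough sketch of how such a schema could be submitted, the Python snippet below builds the HTTP request for Typesense's POST /collections endpoint using only the standard library. The server URL and API key are placeholder assumptions, the helper name build_create_collection_request is made up for this example, and the schema is a version in which price is indexed, since the default sorting field has to be an indexed numeric field. The request is built but not sent.

```python
import json
import urllib.request

# Assumptions for illustration: a local server and a placeholder API key.
TYPESENSE_URL = "http://localhost:8108"
API_KEY = "xyz"

schema = {
    "name": "products",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "description", "type": "string"},
        {"name": "tags", "type": "string[]", "facet": True},
        {"name": "price", "type": "float", "facet": True},
        # Stored but not indexed, so it is declared optional as well.
        {"name": "created_at", "type": "int64", "index": False, "optional": True},
    ],
    "default_sorting_field": "price",
}


def build_create_collection_request(schema: dict) -> urllib.request.Request:
    """Build (but do not send) the request that creates a collection."""
    return urllib.request.Request(
        url=f"{TYPESENSE_URL}/collections",
        data=json.dumps(schema).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-TYPESENSE-API-KEY": API_KEY,
        },
        method="POST",
    )


request = build_create_collection_request(schema)

# To send it to a running server, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

In a real project, an official Typesense client library would normally take the place of this hand-built request.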

Step 2: Ingesting Data

Now, let’s say you have a product catalog that includes items like smartphones, refrigerators, and laptops. Here’s an example of how you might ingest a product into Typesense:

json

{
  "name": "Samsung Galaxy S21",
  "description": "Latest Samsung smartphone with 128GB storage, 5G connectivity, and a stunning display.",
  "tags": ["electronics", "smartphones", "samsung"],
  "price": 799.99,
  "created_at": 1622548800
}
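For a whole catalog, documents are usually sent in bulk rather than one at a time. Typesense's import endpoint (POST /collections/products/documents/import) accepts a JSON Lines body, one document per line. The sketch below only prepares that body; the helper name to_jsonl and the second product are invented for illustration.

```python
import json


def to_jsonl(documents):
    """Serialize documents as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(doc, ensure_ascii=False) for doc in documents)


products = [
    {
        "name": "Samsung Galaxy S21",
        "description": "Latest Samsung smartphone with 128GB storage, "
                       "5G connectivity, and a stunning display.",
        "tags": ["electronics", "smartphones", "samsung"],
        "price": 799.99,
        "created_at": 1622548800,
    },
    {
        # Hypothetical second product, made up for this example.
        "name": "Frost-Free Double Door Refrigerator",
        "description": "Energy-efficient refrigerator with a 260 litre capacity.",
        "tags": ["home appliances", "refrigerators"],
        "price": 549.0,
        "created_at": 1622635200,
    },
]

# This string would be the request body of the bulk import call.
import_body = to_jsonl(products)
```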

When this document is ingested, Typesense splits the name, description, and tags fields into tokens and adds each token to the inverted index. Tokens are lowercased, so matching is case-insensitive. For example:

  • Index entries created for “Samsung Galaxy S21”:
    • “samsung” → Points to this product document, and to any other document that contains the token.
    • “galaxy” → Points to this product document.
    • “smartphones” (from the tags field) → Points to this product document, and potentially other smartphone products.
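The idea behind these index entries can be shown with a toy inverted index in Python. This is a simplified illustration of the data structure, not Typesense's actual implementation: the tokenizer is a plain regular expression, and the second product and the document ids are invented for the example.

```python
import re
from collections import defaultdict

INDEXED_FIELDS = ("name", "description", "tags")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each token to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for field in INDEXED_FIELDS:
            value = doc[field]
            # Treat plain strings and string arrays uniformly.
            texts = value if isinstance(value, list) else [value]
            for text in texts:
                for token in tokenize(text):
                    index[token].add(doc_id)
    return index


documents = {
    "1": {
        "name": "Samsung Galaxy S21",
        "description": "Latest Samsung smartphone with 128GB storage.",
        "tags": ["electronics", "smartphones", "samsung"],
    },
    "2": {
        # Hypothetical second product, made up for this example.
        "name": "Samsung 4K Smart TV",
        "description": "55 inch television with a stunning display.",
        "tags": ["electronics", "televisions", "samsung"],
    },
}

index = build_inverted_index(documents)
print(sorted(index["samsung"]))  # ['1', '2']
print(sorted(index["galaxy"]))   # ['1']
```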

Step 3: Handling Search Queries

When a user searches for “Samsung smartphone,” Typesense will look up both “Samsung” and “smartphone” in the inverted index and return all relevant products, with “Samsung Galaxy S21” being one of them.
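A minimal sketch of that lookup, assuming a small hand-written set of postings with invented document ids: each query token is looked up separately and the resulting id sets are intersected. Real Typesense queries also involve typo tolerance and relevance ranking, which this toy version leaves out.

```python
# Toy postings: token -> ids of the documents containing it (invented ids).
postings = {
    "samsung": {"galaxy-s21", "samsung-tv"},
    "smartphone": {"galaxy-s21", "pixel-8"},
    "galaxy": {"galaxy-s21"},
}


def lookup_all(query, postings):
    """Return the ids of documents that contain every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    # Start from the first token's postings, then narrow with each further token.
    result = set(postings.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= postings.get(token, set())
    return result


matches = lookup_all("Samsung smartphone", postings)
print(matches)  # {'galaxy-s21'}
```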

You might send a search query to Typesense like this:

json

{
  "q": "Samsung smartphone",
  "query_by": "name,description,tags",
  "sort_by": "price:asc"
}

  • q is the query text entered by the user.
  • query_by lists the fields to search, in order of priority: name, description, and tags. These are the fields that have "index": true in the schema.
  • sort_by orders the results by price, lowest first.
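In practice these parameters are typically passed as URL query parameters to the collection's search endpoint, GET /collections/products/documents/search. The sketch below builds such a URL with Python's standard library; the server address is a placeholder assumption and build_search_url is a helper invented for this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

TYPESENSE_URL = "http://localhost:8108"  # assumed local server


def build_search_url(collection, params):
    """Return the search URL for a collection with URL-encoded parameters."""
    query_string = urlencode(params)
    return f"{TYPESENSE_URL}/collections/{collection}/documents/search?{query_string}"


search_params = {
    "q": "Samsung smartphone",
    "query_by": "name,description,tags",
    "sort_by": "price:asc",
}

search_url = build_search_url("products", search_params)

# Decoding the query string again recovers the original parameters.
decoded = {key: values[0] for key, values in parse_qs(urlsplit(search_url).query).items()}

# The actual request also needs the X-TYPESENSE-API-KEY header.
```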

Step 4: Optimizing for Search Performance

You can further improve the search experience by tuning how queries are matched against the tokens of the name and description fields. For autocomplete, the relevant setting is prefix matching, which Typesense controls with the prefix search parameter. It is enabled by default and treats the last word of the query as a prefix rather than a complete word. Tokenization itself can be adjusted in the schema with the token_separators and symbols_to_index settings.

json

{
  "q": "Sam",
  "query_by": "name",
  "prefix": true
}

With this query, a user who has typed only “Sam” immediately sees suggestions for “Samsung Galaxy S21” and other Samsung products.
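To illustrate why typing “Sam” can surface “Samsung” products, here is a toy prefix lookup over a sorted list of index tokens. It uses binary search from Python's bisect module and is only a sketch of the idea, not a description of Typesense's internal data structures; the vocabulary is invented for the example.

```python
import bisect


def prefix_matches(sorted_tokens, prefix):
    """Return all tokens that start with the prefix (case-insensitive)."""
    prefix = prefix.lower()
    # In a sorted list, tokens sharing a prefix sit next to each other,
    # starting at the position where the prefix itself would be inserted.
    start = bisect.bisect_left(sorted_tokens, prefix)
    matches = []
    for token in sorted_tokens[start:]:
        if not token.startswith(prefix):
            break
        matches.append(token)
    return matches


# Hypothetical vocabulary of lowercased tokens from the name field.
vocabulary = sorted(["galaxy", "s21", "samsung", "sandisk", "smartphone", "sony"])

print(prefix_matches(vocabulary, "Sam"))  # ['samsung']
print(prefix_matches(vocabulary, "sa"))   # ['samsung', 'sandisk']
```

Each matching token then leads to its documents through the inverted index, which is how a partial word can return complete product names.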

Outcome

With this setup, ShopFast’s search can answer text queries from the inverted index instead of scanning every product, so users get fast and relevant results across product names, descriptions, and tags.

Summary

In this real-world example, you configured Typesense to index key product fields using its inverted indexing capabilities. By carefully defining the schema, ingesting data, and optimizing search queries, you enabled a responsive and effective search experience for users in your e-commerce store.