Semantic Search with ELSER

This blog post explains how to use ELSER, a tool that makes Elasticsearch searches smarter by understanding what users actually mean. We'll walk you through setting up ELSER, adding it to your search system, and combining it with regular keyword searches for the best results.


Elasticsearch is a powerful search engine that has been used for over a decade to deliver fast, scalable search solutions. Until recent years, this was done primarily with lexical, or keyword-based, search. Lexical search delivers excellent results at blazing speeds, so long as the keywords used in the search actually appear in the document set. The problem with lexical search is that human language is not one size fits all: there are multiple ways to say something with the same underlying meaning, and therefore multiple ways to search for things that mean the same or similar things.

Imagine searching for "exercise clothes" and getting results for all clothes, not just those for exercising. Or imagine more generic searches for clothing. One could search for “t-shirt”, “shirt”, “red shirt”, “short-sleeved shirt”, “white tee” or many other variations. All of these searches have the same or similar meaning. With lexical search alone, a search for “red shirt” could easily bring up red pants as a top result because of a match on the term ‘red’. The search for “short-sleeved shirt” could easily bring up long-sleeved shirts because of a match on ‘shirt’ and the fact that your documents never mention the term ‘short-sleeved’. The search for “white tee” could bring up a golf tee due to the match on the term ‘tee’. This is where semantic search, and by extension ELSER, comes into play. By capturing not only the text but the intent and meaning behind it, semantic search models like ELSER can provide more relevant search results.

ELSER, or Elastic Learned Sparse EncodeR, is a Natural Language Processing (NLP) model that elevates your Elasticsearch searches by understanding the intent and meaning behind your queries. Instead of relying on exact keyword matches, ELSER uses semantic search to deliver highly relevant results based on the context of your search terms.

Using ELSER

Installation and Deployment

The use of ELSER and other ML models requires specialized nodes in Elasticsearch called ML nodes. Generating embeddings and other ML tasks require specialized processing, so these tasks are delegated to separate nodes to ensure your data nodes are not burdened by the additional work. The ML features are considered premium and require an Elasticsearch Platinum license or better, so be sure to check the Elasticsearch subscriptions page to confirm you have the appropriate level to use them.

If you do not already have machine learning nodes configured, check out this guide to first set them up. You will need ML nodes configured in order to take advantage of ELSER and other semantic search models in Elasticsearch. For advanced configuration of these nodes, check out the available options here. Once the ML nodes are up and running we’re ready to begin with the installation of ELSER!
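For a self-managed cluster, a node can be dedicated to ML via its role configuration. A minimal sketch of what this might look like in elasticsearch.yml, following Elastic's recommendation for dedicated ML nodes (adjust for your deployment and version):

```
# elasticsearch.yml on a dedicated ML node
node.roles: [ ml, remote_cluster_client ]
```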

ELSER first needs to be installed in your cluster before it can be used. Elastic has posted an installation guide here with steps on how this can be done. When configuring, don’t worry too much about the number of allocations or threads per allocation. These settings can be configured later and will need to be properly tuned before being used in Production, but the defaults are typically fine to start out. Once installed you’re ready to move on to the next step, data ingestion!
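Once the model artifacts are downloaded, a deployment can be started (and later restarted with tuned settings) through the trained models API. A sketch, assuming the ELSER v2 model id from Elastic's documentation:

```
POST _ml/trained_models/.elser_model_2/deployment/_start?wait_for=started
```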

Data Ingestion

You can find full instructions complete with examples on configuring ELSER for data ingestion here, but I’ll give a short overview below.

The first step is setting up an ingest pipeline. ELSER is invoked from the ingest pipeline at index time to generate the embeddings. If you prefer using the Kibana interface to set up the pipeline, try this guide instead. It outlines how you can add an inference pipeline to a trained model from within the machine learning interface used during installation.
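As a sketch, a minimal pipeline with an inference processor might look like the following. The pipeline name and field names here are illustrative, and it assumes the ELSER v2 model id:

```
PUT _ingest/pipeline/elser-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": ".elser_model_2",
        "input_output": [
          {
            "input_field": "content",
            "output_field": "content_embedding"
          }
        ]
      }
    }
  ]
}
```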

The next step is to create an index and set up the mappings. A new field type, sparse_vector, was created to support this, and the guide above gives a great example of how to set it up. If you want your index to use this ingest pipeline by default, be sure to configure it as the index's default pipeline. Once your index is set up, you're finally ready to begin ingesting data and generating embeddings using ELSER.
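Putting these pieces together, a minimal index might be mapped as follows. The index, field, and pipeline names are illustrative; the default pipeline is assumed to be an ELSER inference pipeline you created beforehand:

```
PUT my-index
{
  "settings": {
    "index": {
      "default_pipeline": "elser-pipeline"
    }
  },
  "mappings": {
    "properties": {
      "content": { "type": "text" },
      "content_embedding": { "type": "sparse_vector" }
    }
  }
}
```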

One thing to note about ELSER is that it has a limit of 512 extracted tokens. This means if the field that you’ve configured for ELSER contains more text than this limit, the trailing text that exceeds the limit will be cut off and embeddings will not be generated for it. Because you will be missing chunks of data, your relevancy will suffer. For an initial test, I suggest sticking to content that does not surpass this limit. Just note you may need to develop a chunking strategy at some point to overcome this limitation. Elastic has posted some great material on how this can be done here and here if you would like to research the topic further.

Alternative Approach
As noted above, there is a lot to consider when ingesting data for semantic search. Elasticsearch is simplifying this process by abstracting away concepts such as the ELSER token limit and chunking strategies. With the new semantic_text field type, Elasticsearch automatically chunks your data and stores the chunks in a nested object structure, letting you surpass ELSER's token limit without devising a chunking method yourself. At query time you simply query your semantic_text field and Elasticsearch handles the rest. If you prefer a simpler approach and your use case does not require fine-tuned control, I recommend trying this field type instead.
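As a sketch of how much simpler this is, here is a hypothetical semantic_text mapping and query. Field and index names are illustrative, and depending on your Elasticsearch version you may also need to specify an inference_id on the field:

```
PUT my-semantic-index
{
  "mappings": {
    "properties": {
      "content": { "type": "semantic_text" }
    }
  }
}

GET my-semantic-index/_search
{
  "query": {
    "semantic": {
      "field": "content",
      "query": "short-sleeved shirt"
    }
  }
}
```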

At this point, you can begin indexing data into your index. You can start fresh with new data or reindex data from existing indices. If reusing existing indices, you may need to update the index mappings with the copy_to parameter to copy content from existing fields into your new sparse vector field. This is also where the number of allocations and threads per allocation come into play: if you see ingestion errors indicating your ML nodes cannot keep up with the load, try adjusting these settings. A useful guide for choosing appropriate values can be found here.
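Reindexing from an existing index can be sketched as follows, assuming hypothetical index names and that the destination index has an ELSER inference pipeline set as its default. A smaller batch size helps keep the inference load on your ML nodes manageable:

```
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "my-old-index",
    "size": 100
  },
  "dest": {
    "index": "my-index"
  }
}
```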

Now that you have your data loaded, it’s time to move on to search! Before that, however, take a minute to review what you’ve just accomplished. If you inspect the data in your index, you can see that your newly configured sparse vector field contains a list of key-value pairs. It may look something like the following:

"tokens": {
              "shirt": 0.901667,
              "size": 0.08836714,
              "inches": 0.32947156,
              "white": 0.053833894,
              "length": 0.22414298,
              "short": 1.2218319,
              ...
}

ELSER has taken the text and expanded it into a set of weighted tokens, numeric values that represent the original content. When performing semantic search, it is these numeric values, rather than the text itself, that are evaluated for similarity.

When searching with ELSER, your input text needs to be converted to the same vector format that was indexed during your ingestion phase. To do this, Elasticsearch provides the text_expansion query, which takes a ‘model_id’ and ‘model_text’. The ‘model_id’ tells Elasticsearch which model (in our case ELSER) to use to generate the embeddings, and the ‘model_text’ is simply your query string. To test your semantic search, use this query syntax against your new sparse vector field and search as you normally would.
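A sketch of what this looks like, assuming an index with a sparse vector field named content_embedding and the ELSER v2 model id (both illustrative):

```
GET my-index/_search
{
  "query": {
    "text_expansion": {
      "content_embedding": {
        "model_id": ".elser_model_2",
        "model_text": "short-sleeved shirt"
      }
    }
  }
}
```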

Upon evaluation, you may find that semantic search using ELSER performs well in most cases, but that there are some cases where your more traditional searches provide more relevant results. This is where hybrid search comes into play. Hybrid search in Elasticsearch is the combination of two or more search methods, such as lexical search and sparse vector search. By combining the various search methods into a single query you can leverage the strengths of each to provide more accurate and relevant results. Elastic recommends combining sparse (e.g. ELSER) and dense vector searches with lexical search, optionally using RRF (Reciprocal Rank Fusion).

Hybrid search is too large a topic to cover here, but if you are interested, check out these blog posts covering it: post1, post2. Additionally, I recommend checking out the new retriever top-level search API in Elasticsearch. It provides a simple-to-use API for combining the different search types into a single, powerful query.
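As a sketch, a hybrid query using the retriever API with RRF might look like the following, combining a lexical match with an ELSER text_expansion search (index, field, and model names are illustrative):

```
GET my-index/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "match": { "content": "short-sleeved shirt" }
            }
          }
        },
        {
          "standard": {
            "query": {
              "text_expansion": {
                "content_embedding": {
                  "model_id": ".elser_model_2",
                  "model_text": "short-sleeved shirt"
                }
              }
            }
          }
        }
      ]
    }
  }
}
```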

Conclusion

I hope you enjoyed this quick guide on ELSER, Elastic’s out-of-domain sparse vector model built to power semantic search in Elasticsearch. Using ELSER can improve the relevancy of your results and provide your customers with a more engaging search experience.

If you need assistance setting up and configuring ELSER to provide more relevant search results, don’t hesitate to reach out to us here at Gigasearch. Our engineers specialize in Elasticsearch and have a passion for tuning relevancy to improve results.