Benchmarking Search Performance: Elasticsearch vs competitors

We benchmarked several Elasticsearch competitors to find out how they stack up against Elasticsearch, both in feature set and in performance.

Arkadii Chumachenko

11 Aug 2021

There are some exciting new players in the open-source search engine field. We decided to look closely at some of them to find out how they stack up against Elasticsearch, both in feature set and in performance.

Candidates:

Elasticsearch - mature text search engine, based on Lucene
RediSearch - full-text solution on top of Redis, built by RedisLabs
Postgres FTS - full-text indices for Postgres
TypeSense - open-source Algolia alternative
MeiliSearch - open-source Algolia alternative

Features

Feature	Elasticsearch	RediSearch	PostgreSQL	TypeSense	MeiliSearch
Storage	Disk	RAM + snapshots	Disk	RAM + snapshots	RAM + snapshots
Distributed	Primary/replica	RAFT-based	Primary/replica	RAFT-based	NO
Replicated	+	NO	+	NO	NO
Languages	latin + cjk + cyrillic + arabic, armenian, basque, bengali, brazilian, greek, hindi, indonesian, persian, sorani, thai	latin + arabic, russian, chinese	latin + arabic	all whitespace-separated	all whitespace-separated + kanji
Typo Tolerance	yes, might get slow	+	NO	+	+
Boosting	+	+	+	+	NO
Exact Search	+	+	+	NO	NO
Synonyms	+	+	+	+	+

Known limitations

Elasticsearch

Becomes unstable above ~1000 indices (or 20k shards) per cluster

TypeSense

Storage size limited by available RAM

Source: https://typesense.org/typesense-vs-algolia-vs-elasticsearch-vs-meilisearch/

Meilisearch

The maximum number of terms taken into account for each search query is 10
Maximum database size is 100GiB (can be changed per instance)
Up to 200 indexes
Maximum of 1000 words per field

Source: https://docs.meilisearch.com/reference/features/known_limitations.html#design-limitations

Benchmark

Dataset

Name: enwiki-20210720-abstract.xml
Description and source: English Wikipedia abstracts dump, dated July 20, 2021
Docs: 6.3M
XML size: 6.0 GB

Query words are chosen randomly from the 1000 most popular English words dataset.
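A minimal sketch of how such random queries could be generated is shown here. The post does not describe the actual generator, so the short word list, the way the prefix is cut, and the adjacent-character swap used for the typo query are all assumptions made for illustration.

```python
import random

# Hypothetical stand-in for the 1000-most-popular-English-words list.
WORDS = ["time", "people", "water", "world", "school", "music", "house", "river"]

def make_typo(word, rng):
    """Swap two adjacent characters (a no-op if the two characters are equal)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def make_queries(words, rng):
    """Build one query of each benchmarked kind from randomly chosen words."""
    prefix_source = rng.choice(words)
    return {
        "one_word": rng.choice(words),
        "three_word_and": rng.sample(words, 3),   # three distinct words
        "or": rng.sample(words, 2),
        "exact_phrase": " ".join(rng.sample(words, 2)),
        # Assumed rule: keep the first half of the word as the prefix.
        "prefix": prefix_source[: max(1, len(prefix_source) // 2)],
        "typo": make_typo(rng.choice(words), rng),
    }

rng = random.Random(42)  # fixed seed so every backend sees the same queries
queries = make_queries(WORDS, rng)
for kind, query in queries.items():
    print(kind, query)
```

Seeding the generator is a design choice rather than something the post states: it lets the same query sequence be replayed against each backend.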

Environment

2x General Purpose / 32 GB / 8 vCPUs DigitalOcean droplets (one for load generation + one for storage).

Results

Indexing time

For indexing, we only counted the time our indexer spent in requests to the search backend. Elasticsearch, PostgreSQL and Typesense show very similar performance here, while RediSearch is ~2x slower. This result strangely contradicts the RedisLabs benchmark results, so our setup might be suboptimal. Meilisearch, on the other hand, really shines here, finishing roughly 6 to 7 times faster than the others.
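Counting only the time spent in backend requests can be sketched as follows. This is an illustration under assumptions, not the indexer used for the benchmark: `BackendTimer` and `fake_send` are hypothetical names, and `fake_send` stands in for a real bulk-index request.

```python
import time

class BackendTimer:
    """Accumulates wall-clock time spent inside backend calls only."""

    def __init__(self):
        self.seconds = 0.0
        self.docs = 0

    def index_batch(self, send, batch):
        # Time only the call into the search backend, not batch preparation.
        started = time.perf_counter()
        send(batch)
        self.seconds += time.perf_counter() - started
        self.docs += len(batch)

    def throughput(self):
        # Documents per second of backend time.
        return self.docs / self.seconds if self.seconds > 0 else 0.0

def fake_send(batch):
    # Hypothetical stand-in for a bulk-index request to a real backend.
    time.sleep(0.01)

timer = BackendTimer()
for offset in range(0, 50, 10):
    # Building the batch (parsing XML in a real run) happens outside the timer.
    batch = [{"id": i} for i in range(offset, offset + 10)]
    timer.index_batch(fake_send, batch)

print(f"{timer.docs} docs in {timer.seconds:.3f}s, {timer.throughput():.0f} docs/s")
```

Note that a timer like this only measures how long the request takes to return. If the backend accepts documents asynchronously, the measured time will not include the actual indexing work.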

Query latency

Again, RediSearch is a slower outlier here for all queries, and again RedisLabs got different results. Another surprising outlier is the "three-word" query on Typesense, which takes an enormous amount of time on average for some reason. Meilisearch displayed pretty solid performance, especially for prefix and typo queries.

We also used zeroes for unsupported types of queries, but RediSearch's near-zero timings are real: it got under 1 ms (!) for the "exact phrase" and "three word AND" queries.

Raw numbers

Benchmark	Elasticsearch	RediSearch	PostgreSQL	TypeSense	MeiliSearch
Indexing
- time (s)	268	516	290	272	42 (async)
- throughput (docs/s)	23644	12267	21827	23258	150284
Query latency (ms)
- 1 Word Query	16.14	16.81	69.89	16.04	6.73
- 3 Word Query	4.07	0.95	2.61	224.36	11.57
- OR Query	20.69	45.86	2.48	N/A	N/A
- Exact Phrase Query	3.16	0.64	9.85	N/A	N/A
- 1 Word Prefix Query	7.76	36.98	9.22	6.75	6.18
- Typo Query	19.81	58.17	N/A	14.61	5.84
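As a quick sanity check on the indexing rows, the sketch below multiplies time by throughput for each backend and compares indexing times against Elasticsearch. It reads the indexing time as seconds and the throughput as documents per second, which is an interpretation of the table rather than something the post states explicitly.

```python
# Indexing time and throughput copied from the table above,
# read as (seconds, documents per second).
indexing = {
    "Elasticsearch": (268, 23644),
    "RediSearch": (516, 12267),
    "PostgreSQL": (290, 21827),
    "TypeSense": (272, 23258),
    "MeiliSearch": (42, 150284),
}

baseline_time = indexing["Elasticsearch"][0]
for name, (seconds, docs_per_s) in indexing.items():
    implied_docs = seconds * docs_per_s   # should be close to the 6.3M docs in the dataset
    ratio = seconds / baseline_time       # above 1 means slower than Elasticsearch
    print(f"{name:14s} implied docs: {implied_docs / 1e6:.2f}M  "
          f"time vs Elasticsearch: {ratio:.2f}x")
```

Under that reading, every row implies about 6.3M documents, which matches the dataset size given earlier.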

Takeaways

Elasticsearch is still the king, offering solid performance for indexing and all types of queries.
RediSearch has so-so indexing performance, and RedisLabs tries hard to upsell its cloud solution, so the documentation is subpar too, but it can give sub-millisecond latency for some types of queries.
PostgreSQL has a weird latency spike on simple one-word queries and its interface is quite complex, though it might be a decent solution if you already have a Postgres database.
TypeSense generally has a good feature set and performance, but with a strange spike on multi-word queries.
MeiliSearch's seemingly great performance was caused by a test error, and we weren't able to complete the test with a proper setup.

Update: Meilisearch and Typesense results

Jason Bosco from Typesense reached out to us about the weirdly slow 3-word query results and recommended re-running that test with the parameter drop_tokens_threshold=1, but the results were similar (200+ ms). We also tried drop_tokens_threshold=0, which disables token dropping, and performance was much better.

So the slowdown is probably caused by the fact that we pick 3 random English words for each query and there are no documents containing all three, so Typesense keeps dropping words until it finds something, and this process is not very fast.
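For reference, here is a small sketch of where drop_tokens_threshold goes in a Typesense search request. It only builds the URL; the collection name `enwiki`, the `abstract` field, and the local address are assumptions for illustration, and a real request would also need the X-TYPESENSE-API-KEY header.

```python
from urllib.parse import urlencode

def typesense_search_url(base_url, collection, words, drop_tokens_threshold):
    """Build a Typesense search URL for a multi-word query."""
    params = {
        "q": " ".join(words),
        "query_by": "abstract",  # hypothetical field name
        "drop_tokens_threshold": drop_tokens_threshold,
    }
    return f"{base_url}/collections/{collection}/documents/search?{urlencode(params)}"

# Hypothetical local instance and collection name.
url = typesense_search_url(
    "http://localhost:8108", "enwiki", ["river", "music", "school"], 0
)
print(url)
```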

Jason also noted that the seemingly fast Meilisearch indexing was actually caused by the index requests being asynchronous. We've updated the test to wait for all indexing tasks to complete, but they take an extremely long time, so we'll need to look more closely at how Meilisearch works under the hood.
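Waiting for asynchronous indexing to finish usually means polling each task's status until it reaches a terminal state. The sketch below shows that pattern in a backend-agnostic way. It is not the code used in this benchmark: `fetch_status` is a hypothetical callable that would wrap the backend's status endpoint, and the status names are assumptions that vary between Meilisearch versions.

```python
import time

def wait_for_tasks(task_ids, fetch_status, poll_interval=0.5, timeout=3600.0):
    """Block until every task reaches a terminal status; return elapsed seconds."""
    done_states = {"processed", "failed"}  # assumed terminal status names
    pending = set(task_ids)
    started = time.monotonic()
    while pending:
        # Keep only the tasks that have not finished yet.
        pending = {t for t in pending if fetch_status(t) not in done_states}
        if not pending:
            break
        if time.monotonic() - started > timeout:
            raise TimeoutError(f"{len(pending)} indexing tasks still pending")
        time.sleep(poll_interval)
    return time.monotonic() - started

# Simulated backend: each task reports "processed" on its third poll.
polls = {}

def fake_status(task_id):
    polls[task_id] = polls.get(task_id, 0) + 1
    return "processed" if polls[task_id] >= 3 else "enqueued"

elapsed = wait_for_tasks([1, 2, 3], fake_status, poll_interval=0.01)
print(f"all tasks finished after {elapsed:.3f}s")
```

The returned elapsed time is what a fair indexing measurement would count, since it covers the work done after the request itself has returned.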
