So you’ve decided that Elasticsearch is the right technology for you. Congrats, you’re in good company! Now you’ll need to decide where you want to deploy Elasticsearch. You have a number of options here, including hosting it yourself or choosing a hosted service such as Elastic Cloud, AWS Elasticsearch Service, or another third-party provider. In this post, we’ll explain the core differences between these options so that you can make an informed decision.
In case you’re in a hurry, here’s a matrix that summarizes the key differences:
Elastic Cloud is the cloud-hosted Elasticsearch (and the rest of the ELK stack) offered by Elastic, the company behind Elasticsearch. Elastic Cloud provides the simplest solution for getting your Elasticsearch cluster up and running with only a few clicks. When creating a new deployment, you can choose between AWS, GCP and Azure in your region of choice, making this a cloud-agnostic option.
After selecting the cloud provider, you can either create your deployment using the default settings or customize it. When customizing your deployment, you really only have two levers: "Size per zone" and "Availability zones". The size per zone is a dropdown of predefined combinations of storage size, RAM, and CPU. This is probably meant to make configuration foolproof, since the ratios between these three parameters can be hard to optimize. The issue though is that you have no direct control over the number or type of data nodes used. These options are available for hot, warm, cold and frozen data tiers (if you’re unfamiliar with what these data tiers provide, check out our blog on the Elasticsearch hot-warm architecture here).
Additionally, you can configure dedicated Master/Coordinating nodes, the number and size of your Kibana nodes, APM/Fleet nodes, Enterprise Search nodes, and machine learning nodes. A single node is provided free of charge for Kibana, APM, ML and Enterprise Search, though the size is fairly small for each. The interface provides a summary including the cost of your selections throughout the whole process, so it’s simple to play around with the sizing and number of nodes to see how it will affect the price.
Elastic Cloud does not currently support spreading nodes across regions within the same deployment (e.g. node1 in AWS us-east-1 and node2 in AWS us-west-2), but this is a feature that has been discussed for their roadmap. If high availability across regions is critically important, then this may be a deal breaker.
After creating the deployment, you simply wait a few minutes for the nodes to be configured and Elastic handles the rest. You are given URLs for your new Elasticsearch and Kibana endpoints, which are now ready for you to configure and use as needed. Note that you do NOT have the ability to directly SSH onto the instances themselves; Kibana, the Elasticsearch API, and the Elastic Cloud interface are all you have to interact with your instances.
There are four service tiers available in Elastic Cloud: Standard, Gold, Platinum and Enterprise. Choosing the right tier will generally depend on your use case and desired level of support. As a rule of thumb, you would want to choose the lowest (cheapest) tier that provides all of the functionality you require. For example, if you require 24/7 support with fast SLAs to reduce downtime, you would only want to consider the Platinum and Enterprise tiers as the other two do not provide the same level of support. Elastic provides a great breakdown of the services each tier provides here.
Elastic Cloud provides the most turn-key solution. Most operational concerns, like setting up snapshots, upgrading Elasticsearch versions, and configuring index lifecycle (via ILM), are automated. This makes Elastic Cloud very attractive if you're just starting out. If you know what you're doing though, you might find that Elastic Cloud doesn't provide enough flexibility in configuring your cluster. For example, there is no way to set up 3 "hot" nodes with low CPU and RAM, but a large disk. Elastic Cloud also does NOT do the hard work of configuring your index mappings and optimizing your search queries and indexing workload. To be fair, none of these cloud providers can do that for you. Those are things that you'll need to configure yourself. Gigasearch can help with that; contact us to learn more!
Elastic Cloud is the most expensive option. The price can be over 3 times more expensive than the self-hosted option. Therefore, Elastic Cloud may only make sense if you are at a relatively low scale (less than 10 data nodes).
AWS Elasticsearch Service
AWS Elasticsearch Service (ESS) is the cloud-hosted version of Elasticsearch offered by AWS, the cloud arm of the tech giant Amazon. When creating an Elasticsearch domain, you are given a myriad of options. You can select the Elasticsearch version you wish to use, the number of nodes, the type and amount of storage, etc. You have access to all the AWS security features you may be familiar with, such as VPC access for your cluster, IAM rules for security/access to Elasticsearch and Kibana (alternatively, SAML, Cognito and basic auth are available), access policies, and encryption settings.
After waiting a few minutes for the domain creation, your cluster is ready to be used. You are redirected to a dashboard where you can adjust your domain configurations. Within the dashboard is a link to Kibana. By opening the side panel in Kibana, you can immediately see that ESS offers only a subset of what is available in Kibana on Elastic Cloud. If you'd like a more detailed guide on setting up an AWS Elasticsearch Service cluster, check out our guide here.
One major factor to consider when evaluating AWS ESS to host your Elasticsearch cluster(s) is that it will no longer use the latest versions of Elasticsearch to power its services. In January 2021, Elastic announced that it would change its licensing strategy. This in turn led AWS to create a fork of Elasticsearch/Kibana 7.10.2, named OpenSearch. Though this will have little-to-no impact in the short term, as Elasticsearch and OpenSearch will be very closely aligned, it could have a huge impact in the future as development of the two products progresses. Amazon has promised continued updates for its existing Elasticsearch services and says that it is prepared to maintain its new open source product itself if necessary. Amazon OpenSearch Service is not yet available at the time of writing, but it will be interesting to see how Amazon’s new product diverges from Elastic’s path. You can read more about OpenSearch here.
AWS ESS automates some of the operational tasks related to managing Elasticsearch, such as creating automated snapshots. It also has a counterpart to Elastic's Index Lifecycle Management, called Index State Management (ISM). This allows you to create policies for managing indices, such as automatically moving them to warm storage. At the time of writing, there is no UI component to ISM; policies must be attached via an API call.
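As a rough sketch of what attaching a policy via the API involves, the snippet below builds an ISM policy body that moves an index to warm storage after a given age, plus the two requests needed to create the policy and attach it to an index. The `_opendistro/_ism` paths and the `warm_migration` action reflect our understanding of the Open Distro based ISM API on AWS ESS; the policy name, index pattern, and age threshold are made up for illustration, so verify everything against the AWS documentation for your version.

```python
import json


def build_warm_policy(min_index_age="7d"):
    """Build an ISM policy body: start in 'hot', move to 'warm' after min_index_age."""
    return {
        "policy": {
            "description": "Move indices to warm storage once they reach a given age",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [],
                    "transitions": [
                        {
                            "state_name": "warm",
                            "conditions": {"min_index_age": min_index_age},
                        }
                    ],
                },
                {
                    "name": "warm",
                    # Assumed action name for migrating an index to warm storage.
                    "actions": [{"warm_migration": {}}],
                    "transitions": [],
                },
            ],
        }
    }


def ism_requests(policy_id, index_pattern, min_index_age="7d"):
    """Return (method, path, body) tuples to send with the HTTP client of your choice."""
    return [
        # 1. Create (or replace) the policy.
        ("PUT", f"_opendistro/_ism/policies/{policy_id}", build_warm_policy(min_index_age)),
        # 2. Attach the policy to the matching indices.
        ("POST", f"_opendistro/_ism/add/{index_pattern}", {"policy_id": policy_id}),
    ]


if __name__ == "__main__":
    # "hot_to_warm" and "logs-*" are hypothetical names.
    for method, path, body in ism_requests("hot_to_warm", "logs-*"):
        print(method, path)
        print(json.dumps(body, indent=2))
```

The functions only build the requests, so you can sign and send them however your domain's access policy requires.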
One big miss with AWS ESS is monitoring. It does not have access to the Kibana monitoring feature because that is an "x-pack" feature requiring an Elastic license. Therefore, you are stuck with the metrics available from the Elasticsearch API and CloudWatch, which in our opinion is a terrible monitoring solution. CloudWatch monitoring for Elasticsearch gives you just the basic high-level metrics, with no out-of-the-box monitoring for lower-level metrics such as GC times, cache sizes, and distribution of shards. These metrics can be critical to debugging poor performance of Elasticsearch clusters. Gigasearch offers a monitoring solution that works for AWS ESS deployments; you can learn about it here.
AWS ESS is about 50% more expensive than self-hosting, which makes it very competitively priced against Elastic Cloud. It provides most of the functionality of Elasticsearch, with the caveat that the features of OpenSearch will become increasingly divergent from the Elasticsearch project. Customers already on AWS and at a medium level of scale (over 10 nodes) might want to choose AWS ESS.
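If you end up pulling the lower-level metrics from the Elasticsearch API yourself, a minimal sketch might look like the following. It assumes you have already fetched and JSON-decoded the responses of `GET _nodes/stats/jvm` and `GET _cat/shards?format=json`; the function names are our own, and the field names follow the response shapes as we understand them, so check them against your cluster's version.

```python
from collections import Counter


def gc_time_by_node(nodes_stats):
    """Map node name -> total GC time in ms, from a parsed `_nodes/stats/jvm` response."""
    result = {}
    for node in nodes_stats["nodes"].values():
        collectors = node["jvm"]["gc"]["collectors"]
        # Sum time across all collectors (typically "young" and "old").
        result[node["name"]] = sum(
            c["collection_time_in_millis"] for c in collectors.values()
        )
    return result


def shards_per_node(cat_shards):
    """Count shards per node from a parsed `_cat/shards?format=json` response."""
    # Unassigned shards have no node, so they are skipped.
    return Counter(row["node"] for row in cat_shards if row.get("node"))
```

Polling these on a schedule and shipping the results to your metrics system of choice covers part of what the built-in Kibana monitoring would otherwise show.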
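To make the pricing ratios concrete, here is a back-of-the-envelope sketch. The multipliers are simply the rough figures quoted in this post (AWS ESS at about 1.5x self-hosted, Elastic Cloud at 3x or more), not published prices, and the $10,000 baseline is an invented example.

```python
# Rough cost multipliers relative to self-hosting, taken from the estimates
# in this post. Elastic Cloud is "over 3 times", so 3.0 is a lower bound.
COST_MULTIPLIER = {
    "self_hosted": 1.0,
    "aws_ess": 1.5,
    "elastic_cloud": 3.0,
}


def estimate_monthly_costs(self_hosted_monthly):
    """Scale a self-hosted monthly infrastructure cost by each option's multiplier."""
    return {
        option: round(self_hosted_monthly * multiplier, 2)
        for option, multiplier in COST_MULTIPLIER.items()
    }


if __name__ == "__main__":
    # Hypothetical baseline: $10,000/month of self-hosted infrastructure.
    for option, cost in estimate_monthly_costs(10_000).items():
        print(f"{option}: ${cost:,.2f}")
```

Keep in mind that the self-hosted figure excludes the engineering time needed to run the cluster, which is the main thing the managed options are charging for.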
AWS vs Elastic Cloud Feature Comparison
Here's a more direct comparison between AWS Elasticsearch Service and Elastic Cloud.
For the brave (or cheap) among you, there is another option: self-hosting Elasticsearch. This can be a daunting undertaking, but can pay off massively in cost savings if you’re able to pull it off. Your team will need a solid background in infrastructure automation and devops to self-host without risking data loss or severe outages.
First, you’ll need to provision the required infrastructure, either on the cloud of your choice, or in your data center. We recommend using a technology like Terraform to provision the nodes instead of doing it manually, if you’re on a public cloud. Then you’ll need to install Elasticsearch on each of your nodes. Careful consideration must be given to the type of hardware given to each node type. For example, master nodes can get away with much less RAM than data nodes, and “hot” nodes should have SSDs while “warm” nodes can be on spinning disk. You’ll want to use a configuration management system like Puppet or Saltstack to create and install the configuration files for each node type, since doing this manually at scale can be error prone and tedious.
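As an illustration of per-node-type configuration, the sketch below renders a minimal `elasticsearch.yml` for master, hot, and warm nodes. In practice this logic would live in your configuration management templates; the cluster name, node names, and data paths are hypothetical, and the `node.roles` values assume a recent 7.x release that supports data tier roles.

```python
# Roles per node type. Hot nodes also get data_content so that regular
# (non data stream) indices have somewhere to live.
ROLES_BY_NODE_TYPE = {
    "master": ["master"],
    "hot": ["data_hot", "data_content"],
    "warm": ["data_warm"],
}


def render_elasticsearch_yml(cluster_name, node_name, node_type, data_path):
    """Render a minimal elasticsearch.yml for one node of the given type."""
    roles = ", ".join(ROLES_BY_NODE_TYPE[node_type])
    lines = [
        f"cluster.name: {cluster_name}",
        f"node.name: {node_name}",
        f"node.roles: [{roles}]",
        f"path.data: {data_path}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical nodes: SSD-backed path for hot, spinning disk for warm.
    print(render_elasticsearch_yml("search-prod", "master-1", "master", "/var/lib/elasticsearch"))
    print(render_elasticsearch_yml("search-prod", "hot-1", "hot", "/mnt/ssd/elasticsearch"))
    print(render_elasticsearch_yml("search-prod", "warm-1", "warm", "/mnt/hdd/elasticsearch"))
```

A real configuration would also need discovery, network, and security settings, which are omitted here.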
At this point, you should have a working Elasticsearch cluster! But… you also need to set up things like automated snapshots, hot/warm rollover, monitoring, and autoscaling yourself. Upgrades can be tedious since you’ll need to do a rolling upgrade of all nodes to avoid downtime. At high node counts (100+ nodes), you might also have nodes frequently going offline due to various issues like hardware degradation. Most of these concerns are abstracted away with a managed service. If you happen to have a Kubernetes cluster already, you could potentially use the powerful ECK, which makes operating Elasticsearch at scale much easier.
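To give a feel for why rolling upgrades are tedious, here is a simplified sketch of the per-node sequence, expressed as a list of API calls and manual steps. It follows the general procedure as we understand it (restrict shard allocation, flush, restart the node, re-enable allocation, wait for green); the node names are hypothetical, and a real upgrade should follow the official upgrade guide for your versions.

```python
def rolling_upgrade_plan(node_names):
    """Return an ordered list of (method, target, body) steps, one node at a time."""
    steps = []
    for node in node_names:
        # Stop replicas from being reallocated while the node is down.
        steps.append((
            "PUT", "_cluster/settings",
            {"persistent": {"cluster.routing.allocation.enable": "primaries"}},
        ))
        # Flush so that shard recovery after the restart is faster.
        steps.append(("POST", "_flush", None))
        # The actual package upgrade happens outside the Elasticsearch API.
        steps.append(("MANUAL", f"stop, upgrade and restart {node}", None))
        # Setting the value to null restores the default allocation behaviour.
        steps.append((
            "PUT", "_cluster/settings",
            {"persistent": {"cluster.routing.allocation.enable": None}},
        ))
        # Do not move on to the next node until the cluster has recovered.
        steps.append(("GET", "_cluster/health?wait_for_status=green&timeout=10m", None))
    return steps


if __name__ == "__main__":
    for method, target, body in rolling_upgrade_plan(["data-1", "data-2"]):
        print(method, target, body if body is not None else "")
```

With 100+ nodes this loop runs 100+ times, which is why most teams automate it or lean on an operator such as ECK.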
If you decide to go with self-hosted, your reward is the most cost effective Elasticsearch cluster possible. It is over 3 times cheaper than Elastic Cloud and roughly half the cost of AWS ESS. You could also potentially save even more if you purchase the equivalent of reserved instances on AWS for your nodes. At very high scales (we’ve seen deployments of over 1000 nodes), the savings can be in the millions of dollars.
One very big caveat is that certain features of Elasticsearch are limited to a paid license. For example, the machine learning and anomaly detection feature is only available with a paid subscription such as Platinum or Enterprise. Pricing for these licenses is opaque, but could add significantly to your costs even when self-hosting. You can see the features that require a paid license here.
The choice between Elastic Cloud, AWS ESS, and self-hosted will ultimately come down to the idiosyncrasies of your organization. With Elastic Cloud, you get a simple, all-in-one solution with built-in support for ELK stack tools. It is easy to scale, allows for customization, always has the latest version available and has support included. For most customers just starting out with Elasticsearch, this is likely the best option.
With AWS ESS, you get a more cost-effective solution that allows you to add additional tools on an as-needed basis. If you need a ML solution, spin up an Amazon Lex solution in AWS and connect it with your cluster. Granted, you can do this with Elastic Cloud as well, but it becomes a much simpler solution to wire everything together if your applications are already hosted on AWS. Organizations that are using Elasticsearch at a decent level of scale and are already on AWS may opt for AWS ESS.
Finally, the self-hosted option is the cheapest, but at the cost of a big commitment in devops resources. Organizations using Elasticsearch at scale and with a dedicated devops team will be most attracted to this option due to the huge cost savings.
Hopefully this guide has helped you create a better mental model for the various flavors of Elasticsearch deployments, so that you can make an informed decision before pulling the trigger on any one provider. At Gigasearch, we are a team of Elasticsearch consultants helping companies scale Elasticsearch on any provider. Contact us to learn more!