VuePlanner Successfully Re-architects for Scale

“This Elasticsearch re-architecture with Gigasearch is ultimately what fixed all of our performance issues. We haven’t had a problem since then.”

VuePlanner provides a platform that helps advertisers do contextual targeting on YouTube videos. It lets marketers quickly find videos relevant to a search term and see statistics on those videos (number of views, etc.). Large agencies and brands use it to find safe, suitable, and high-leverage YouTube videos to advertise on.

The following is an interview with Jeremy Stewart, SVP and Co-founder of VuePlanner.

VuePlanner can forecast the number of views on collections of YouTube videos

What is your Elasticsearch use case?

We index data from the YouTube API and manipulate it in various ways within our platform to create the most curated advertiser experience on YouTube. This includes running proprietary ML-based solutions to forecast the number of views a video or collection of videos will get over the next month. Our Elasticsearch pipeline consists of ingesting data from the YouTube API daily, scrubbing data, removing inappropriate content, and running forecasting models. Elasticsearch then serves that data to our front-end application.

What were your team’s challenges with Elasticsearch prior to working with Gigasearch?

We began to run into issues at scale, with over 1.5 billion documents in our Elasticsearch cluster. Various parts of our platform, from the ingestor to the forecaster, started breaking. Nodes on the cluster were under increased memory pressure, with prolonged GC times. Critical queries from the forecaster were timing out. It didn’t help that the engineer who had set up the Elasticsearch cluster had left the company. We knew we needed to make drastic changes to reach the next level of scale, but didn’t want to make the changes blindly. We didn’t understand the tradeoffs among the various solutions we were considering. Until we stabilized our system, we really couldn’t move forward in terms of new development.

“Until we stabilized our system, we really couldn’t move forward in terms of new development.”

What was the business impact of these issues?

Advertisers can use our platform to build a collection of videos matching their advertising criteria and see how many views they could get if they advertised against them. During our sales calls, we would demo building and forecasting a collection live. When the forecaster started timing out due to the Elasticsearch issues, our ability to sell our platform to new clients was impacted. We’d have to take down the forecaster and modify our demos. The workaround we got from AWS involved using a blue/green deployment of Elasticsearch, which would fix the issue temporarily, but the performance issues would resurface after a while.

What made you choose Gigasearch?

When we started searching for help with this issue, we were really looking for experienced specialists who had seen this type of problem before. We wanted experts who would understand the problem at a deep level by reviewing our cluster, queries, and code base, rather than offering generic solutions. It was also important that the experts were located in the US and offered competitive pricing without an indefinite commitment.

We didn’t know where to begin in terms of fixing our issues and were prepared for feedback ranging from moving out of AWS Elasticsearch Service, to rewriting our queries, to fully re-architecting our platform. We had considered multiple options before starting the engagement with Gigasearch. I appreciated that Gigasearch provided actionable insights right from the free consultation. The free consultation made it clear that they would look at the problem from a bunch of different angles. We could tell right off the bat that Gigasearch was going to approach this problem as if it were theirs. It wasn’t just going to be a one-size-fits-all solution.

“We could tell right off the bat that Gigasearch was going to approach this problem as if it were theirs. It wasn’t just going to be a one-size-fits-all solution.”

AWS was essentially telling us to keep doing what we were doing. We knew we needed a change, but wanted an unbiased expert opinion.

How was your experience with Gigasearch?

From the beginning, it felt like Gigasearch was giving us more than we signed up for. We weren’t expecting the number of experts that got involved in the project. Various specialists were pulled in with specific skill sets where needed. It felt like we were getting a lot of value.

The communication was fantastic. There was a lot of transparent communication over shared Slack channels and Google docs. The project plan that we received was extremely thorough. Gigasearch delivered everything they said they would deliver. The experts that worked with us were available in minutes via Slack anytime we needed them.

“The communication was fantastic. There was a lot of transparent communication over shared Slack channels and shared Google docs.”

Gigasearch also set up monitoring for our Elasticsearch cluster, which was really helpful during the migration to ensure that key metrics were trending in the right direction. Our developer still uses this when monitoring Elasticsearch.

A nice side benefit of the project was that it made us document our platform more systematically than we had before, in order to onboard Gigasearch and answer questions as they came up. We still refer to this documentation we created.

What were the key results?

We were able to successfully upgrade from Elasticsearch 6.3 to 6.8 with guidance from Gigasearch. We were unsure what compatibility issues the upgrade would raise and what the benefits would be. Gigasearch checked the breaking changes in the new version and assessed compatibility with our software. They also provided a guide to upgrading with no downtime. The upgrade was quick and easy after the guidance from Gigasearch.

Gigasearch told us that the way we had set up our indices was inefficient. We had 600 to 700 different indices partitioned by month and language. Gigasearch quickly identified this as an issue and helped us consolidate these indices in a way that is scalable. Though we were considering doing this already, they provided the confirmation we needed to go ahead with the changes. We didn’t want to re-architect only to create a bigger problem and incur additional costs.
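To make the consolidation idea concrete, here is a minimal sketch of how many month-and-language indices could be folded into a single index. The interview does not describe the layout VuePlanner actually adopted, so everything specific here is an assumption for illustration: the `videos-YYYY-MM-lang` naming scheme, the `videos` destination index, and the choice to carry the language over as a document field. The generated bodies follow the shape accepted by the Elasticsearch `_reindex` API.

```python
from collections import defaultdict


def parse_index_name(name):
    """Split a hypothetical index name such as 'videos-2019-03-en'."""
    prefix, year, month, lang = name.split("-")
    return prefix, int(year), int(month), lang


def plan_reindex(index_names, dest="videos"):
    """Build one _reindex request body per language.

    Each body copies every monthly index for that language into the single
    destination index and stamps the language onto each document, so the
    information that used to live in the index name is not lost.
    """
    by_lang = defaultdict(list)
    for name in index_names:
        _, _, _, lang = parse_index_name(name)
        by_lang[lang].append(name)

    bodies = []
    for lang in sorted(by_lang):
        bodies.append({
            "source": {"index": sorted(by_lang[lang])},
            "dest": {"index": dest},
            "script": {
                "lang": "painless",
                "source": "ctx._source.language = params.language",
                "params": {"language": lang},
            },
        })
    return bodies


# Example with three made-up index names.
names = ["videos-2019-01-en", "videos-2019-02-en", "videos-2019-01-es"]
plan = plan_reindex(names)
```

Each dictionary in `plan` could be sent as the body of a `POST _reindex` request. Queries that previously picked indices by name would then filter on the `language` field and a date range instead.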

Gigasearch let us know what the new index template and mappings should look like, including several optimizations such as using doc_values, choosing appropriate field types, and applying appropriate analyzers. This Elasticsearch re-architecture with Gigasearch is ultimately what fixed all of our performance issues. We haven’t had a problem since then. And we were able to do all of this within our existing cluster.
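As an illustration of the kinds of optimizations mentioned here, the sketch below builds an index template in the Elasticsearch 6.x format. It is not VuePlanner's actual template: the template pattern, field names, shard count, and analyzer choice are all assumptions. It shows exact-match fields typed as `keyword` (backed by doc_values for sorting and aggregations), a numeric type for counts, a built-in language analyzer on full-text fields, and doc_values switched off for a field assumed never to be sorted or aggregated on.

```python
import json


def build_template():
    """Return a hypothetical 6.x index template body as a plain dict."""
    return {
        "index_patterns": ["videos*"],  # assumed index naming
        "settings": {"number_of_shards": 5, "number_of_replicas": 1},  # assumed
        "mappings": {
            "_doc": {  # 6.x mappings are nested under a type name
                "properties": {
                    # Exact-match IDs and codes: keyword, not analyzed text.
                    "video_id": {"type": "keyword"},
                    "language": {"type": "keyword"},
                    # Numeric and date types support range queries and sorting.
                    "view_count": {"type": "long"},
                    "published_at": {"type": "date"},
                    # Full-text fields get an explicit built-in analyzer.
                    "title": {"type": "text", "analyzer": "english"},
                    "description": {"type": "text", "analyzer": "english"},
                    # Assumed to be filter-only, so doc_values are disabled
                    # to save disk space.
                    "thumbnail_url": {"type": "keyword", "doc_values": False},
                }
            }
        },
    }


template = build_template()
template_json = json.dumps(template, indent=2)
```

The resulting JSON could be submitted with `PUT _template/<name>`, so that any new index matching the pattern picks up the same mappings.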

“This Elasticsearch re-architecture with Gigasearch is ultimately what fixed all of our performance issues. We haven’t had a problem since then.”

Our engineers used to spend 75% of their time troubleshooting Elasticsearch instead of working on the core product. They can now focus on product features. We don’t have to worry about our system’s stability or scalability as much anymore. We’re back to full speed ahead in terms of development, which is moving our business forward.

Gigasearch is a team of Elasticsearch consultants and engineers with experience deploying petabyte-scale clusters. Contact us today!