The Pinterest recommendation system – lessons learned


The Pinterest case study at a glance

In 2021 Pinterest grew its revenues by over 125% YoY, reached more than 450 million monthly active users (MAUs) and achieved 89% YoY growth of average revenue per user (ARPU).

Pinterest implies that these metrics are driven by the “Pinner experience”. According to Pinterest, user engagement (clicking, saving or linking pins) contributes to the “Pinner experience” the most.

Pinterest scaled and optimized its state-of-the-art vector-based recommendation system. We introduce some of its most important aspects.

Based on A/B experiments, the Pinterest vector-based recommendation system accounts for more than 80% of all engagement on Pinterest.

Why does the Pinterest case study matter?

The purpose of this case study is to give decision makers an insider perspective on the benefits of a vector-based recommendation system. Using Pinterest, a leading digital content discovery and sharing platform, as the use case, we argue that vector-based systems are the state-of-the-art solution for recommendations, whether you set out to build a generic product recommendation engine, a book recommendation engine, or an e-commerce recommendation engine.

Such a system has profound implications for the top-line metrics of a business. Today, this technology is available to any organisation, regardless of its size and whether it has internal machine learning capabilities.

Once Pinterest deployed its recommendation system called Pixie, it reached up to 50% higher “per pin user engagement” than with the previously used Hadoop-based production system, by recommending more relevant content to users. The newest iteration of the recommendation system, called PinnerSage, increased the volume of Pinterest’s Homefeed and Shopping products by up to 20%.

Pinterest, the biggest platform for generating ideas

Pinterest’s goal is to serve Pinners (the Pinterest users) relevant ideas in real time. Users view and save “pins” (images, GIFs, and short videos) to “boards”. They name their boards creatively based on the collection of pins they save together, which in effect amounts to image clustering.

In the following, we use data to argue that these “basic Pinterest functionalities”, although important, are not the main reason why users use the platform.

Then what is?

The number one, most “needle-moving”, Pinterest functionality is the vector-based recommendation engine. We show that the main reason why Pinterest reached more than 450 million monthly active users in 2021 is that people primarily use Pinterest to get recommendations for things they love but may not have even known existed.

The Pinterest secret sauce: The vector-based recommendation system

Pinterest has grown its revenues by over 125% from 2020 to 2021, to $613 million. Pinterest’s top-line revenue growth can be attributed to the 9% YoY increase in the number of monthly active users (MAUs) as well as the 89% YoY growth in average revenue per user (ARPU).

As implied in the Pinterest Q2 2021 Letter to Shareholders, these factors were driven by the single most important Pinterest metric, user engagement, which we define as “clicking, saving or linking pins”. According to Pinterest, it is user engagement that defines the overall “Pinner experience” in the first place.

According to research by Pinterest, its recommendation system accounts for more than 80% of all engagement on Pinterest. This means that the recommendation system is the single most important core functionality of Pinterest, and it essentially determines how successful the platform is for its more than 450 million users.

Users save the same pins to tens of thousands of different boards. This manual classification and labeling by hundreds of millions of users is essentially image clustering, and it provides valuable input to the Pinterest recommendation system.
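To make the idea of boards as a clustering signal concrete, here is a minimal sketch in Python. It is an illustration only and not Pinterest’s actual method: the board names, the pin names and the `co_save_counts` helper are all hypothetical. It counts how often two pins are saved to the same board, so that pins that are frequently saved together can be treated as related.

```python
from collections import Counter
from itertools import combinations

# Hypothetical boards and pins, invented purely for illustration.
boards = {
    "Weeknight Thai": ["shrimp soup", "tofu curry", "pad thai"],
    "Asian dinners": ["shrimp soup", "tofu curry"],
    "Living room ideas": ["armchair", "floor lamp"],
}


def co_save_counts(boards):
    """Count how many boards each pair of pins shares."""
    counts = Counter()
    for pins in boards.values():
        # Sort so that every pair is stored under one canonical key.
        for a, b in combinations(sorted(set(pins)), 2):
            counts[(a, b)] += 1
    return counts


counts = co_save_counts(boards)
# Pins that share many boards are likely related to each other.
print(counts.most_common(3))
```

In this toy data the soup and the curry share two boards, while the armchair never shares a board with any recipe, which is the kind of grouping a recommendation system can learn from.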

Challenges that Pinterest set out to solve

How to narrow down the number of relevant pins and surface them to the right person at the right time?

This is the most common challenge of any recommendation system. This is why recommendation systems exist.

Their main job is to provide highly relevant recommendations, personalised to each user based on that user’s profile and real-time engagement.

As of today, the state-of-the-art solution for these challenges is the deployment of a vector-based recommendation system. Vectors effectively map the meaning of pins into a geometric space. Pin vectors that are close to each other in the geometric space are relevant to each other because their meaning is actually similar.

For example, “spicy tofu with coconut sauce” is recommended to me when I click on “easy thai shrimp soup”. Tofu, coconut sauce and spicy flavours are closely associated with Thai cuisine, so this makes a great recommendation. Chances are that I’ll be interested in the Thai tofu recipe if I am interested in the recipe for Thai shrimp soup.
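The recipe example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Pinterest’s implementation: the three-dimensional vectors are made up (real systems learn vectors with many more dimensions), the `recommend` helper is hypothetical, and cosine similarity is just one common way to measure how close two vectors are.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical pin vectors, invented for illustration only.
pin_vectors = {
    "easy thai shrimp soup": [0.9, 0.8, 0.1],
    "spicy tofu with coconut sauce": [0.8, 0.9, 0.2],
    "mid-century armchair": [0.1, 0.0, 0.9],
}


def recommend(clicked, vectors, k=1):
    """Return the k pins whose vectors are closest to the clicked pin."""
    query = vectors[clicked]
    scored = [
        (cosine_similarity(query, vec), name)
        for name, vec in vectors.items()
        if name != clicked
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


print(recommend("easy thai shrimp soup", pin_vectors))
```

With these made-up vectors the two recipes point in almost the same direction, so the tofu recipe is ranked ahead of the armchair.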

We wrote about the in-depth technical details of the Pinterest recommendation engine in our blog post on the Relevance AI website. Check it out if you want to have a deeper understanding of a vector-based system at scale.

How to scale the recommendation system architecture to hundreds of millions of users and billions of pins?

Once we have solved the previous challenge and created a working vector-based recommendation engine, the growing number of users and the growing volume of data on the platform force us to scale and optimize our solution. This challenge arises whenever we set out to create and scale a recommendation architecture, be it a music recommendation system, an e-commerce recommendation system or a podcast recommendation engine.

If we stick to the approach used for smaller recommendation systems, the required computational power and hence our costs associated with it would increase dramatically. Operating the system would not be economically feasible anymore.

Therefore we need to apply optimization methods to our solution. Pixie is a great example of a recommendation system that is highly optimized for scale. We read all relevant research papers to create our custom solution replicating this system. We published some of our findings in our blog post on the Relevance AI website.

How to make personalized recommendations in milliseconds on such scale?

When hundreds of millions of users perform individual searches simultaneously, these searches will result in billions of vectors that we need to store and operate on in order to provide recommendations to users.

Even if a single search takes less than a millisecond, millions of searches processed one after another add up to a delay that users would not tolerate, and they would churn. For this reason, Pinterest’s previous Hadoop-based system pre-computed recommendations one day before they were surfaced to users. Recommendations were fast, but they were not based on the real-time engagement and preferences of the users.

To address this, Pinterest had to develop its own customized solution for vectors. Pinterest is now able to provide recommendations in real time, which results in up to 50% higher per pin engagement.
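A rough back-of-the-envelope calculation shows why this adds up. The sketch below is a simplification with hypothetical numbers (one million searches, one millisecond each), not measurements from Pinterest; both helper functions are introduced here for illustration only.

```python
import math


def sequential_seconds(num_searches, ms_per_search):
    """Total time if the searches are processed one after another."""
    return num_searches * ms_per_search / 1000.0


def workers_needed(searches_per_second, ms_per_search):
    """Parallel workers needed to keep up with the incoming load,
    assuming each worker handles one search at a time."""
    return math.ceil(searches_per_second * ms_per_search / 1000.0)


# Hypothetical load: one million searches at 1 ms each.
print(sequential_seconds(1_000_000, 1.0))  # seconds of sequential work
print(workers_needed(1_000_000, 1.0))      # workers to serve them within one second
```

Under these assumptions, a million one-millisecond searches amount to 1,000 seconds of sequential work, or roughly 1,000 workers running in parallel to answer them within a second.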

Quantified benefits of the recommendation system

Pinterest conducted A/B experiments across different Pinterest user surfaces to measure the increase in engagement with the vector-based recommendations versus the previous recommendation engine. The results in “per pin engagement” are:

  • Homefeed: +48%
  • Related pins: +13%
  • Board recommendations: +26%
  • Explore tab: +20%
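For readers who want to see how such a percentage is derived, here is a minimal sketch of a relative uplift calculation. The engagement counts are hypothetical and chosen only so that the result matches the Homefeed figure above; they are not Pinterest’s data, and the `relative_uplift` helper is our own illustration.

```python
def relative_uplift(control, treatment):
    """Relative change of the treatment group versus the control group."""
    return (treatment - control) / control


# Hypothetical engagements per 1,000 pins shown (not Pinterest's data).
control_engagements = 100    # previous recommendation engine
treatment_engagements = 148  # vector-based recommendations

uplift = relative_uplift(control_engagements, treatment_engagements)
print(f"{uplift:+.0%}")
```

An uplift of +48% therefore means that, for the same number of pins shown, the treatment group engaged 1.48 times as often as the control group.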
Benedek Zajkas
October 26, 2021