June 21, 2024

Bella Huang | Software Engineer, Home Candidate Generation; Raymond Hsu | Engineering Manager, Home Candidate Generation; Dylan Wang | Engineering Manager, Home Relevance


In Homefeed, ~30% of recommended pins come from pin-to-pin retrieval. This means that during the retrieval stage, we use a batch of query pins to call our retrieval system and generate pin recommendations. We typically use a user’s previously engaged pins, and a user may have hundreds (or thousands!) of engaged pins, so a key problem for us is: how do we select the right query pins from the user’s profile?

At Pinterest, we use PinnerSAGE as the main source of a user’s pin profile. PinnerSAGE generates clusters of the user’s engaged pins based on the pin embedding by grouping nearby pins together. Each cluster represents a certain use case of the user, and selecting query pins from different clusters allows for diversity. We sample the PinnerSAGE clusters as the source of the queries.

Previously, we sampled the clusters based on raw counts of actions in the cluster. However, there are several drawbacks to this basic sampling approach:

  • The query selection is relatively static if no new engagements happen. The main reason is that we only consider the action volume when we sample the clusters. Unless the user takes a significant number of new actions, the sampling distribution remains roughly the same.
  • No feedback is used for future query selection. During each cluster sampling, we don’t consider the downstream engagements from the last request’s sampling results. A user may have had positive or negative engagement on the previous request, but we don’t take that into account for their next request.
  • It cannot differentiate between actions of the same type except by their timestamp. For example, if the actions within the same cluster all occurred around the same time, the weight of each action will be the same.
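The count-based baseline described above can be sketched as follows. This is a minimal illustration, not Pinterest's implementation: the function names, the cluster names, and the action counts are all hypothetical, and sampling with replacement is an assumption.

```python
import random

def count_based_probabilities(action_counts):
    """Sampling probability of each cluster, proportional to its raw action count."""
    total = sum(action_counts.values())
    return {cluster: count / total for cluster, count in action_counts.items()}

def sample_clusters(action_counts, k, seed=None):
    """Draw k clusters (with replacement) from the count-based distribution."""
    rng = random.Random(seed)
    clusters = list(action_counts)
    weights = [action_counts[c] for c in clusters]
    return rng.choices(clusters, weights=weights, k=k)

# Hypothetical profile: cluster name -> number of engaged actions.
action_counts = {"recipes": 90, "furniture": 10}
probs = count_based_probabilities(action_counts)
picked = sample_clusters(action_counts, k=5, seed=0)
```

Because the distribution depends only on the counts, it stays the same from request to request until the user takes new actions, which is the first drawback listed above.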
Graphic: Events arrow to Cluster Sampling (three clusters) arrow to Query Selection.
Figure 1. Previous query selection flow
Graphic: Events arrow to Cluster Sampling (three clusters). Arrow above from Query Reward to Cluster Sampling. Arrow from Cluster Sampling to Query Selection.
Figure 2. Current query selection flow with Query Reward

To address the shortcomings of the previous approach, we added a new component to the Query Selection layer called Query Reward. Query Reward consists of a workflow that computes the engagement rate of each query, which we store and retrieve for use in future query selection. This lets us build a feedback loop that rewards queries with downstream engagement.
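As a rough sketch of what such a workflow might compute, the snippet below aggregates impressions and engagements per query and divides one by the other. The record layout `(query_id, impressions, engagements)` and the function name are assumptions made for illustration; the article does not describe the actual log schema or storage.

```python
from collections import defaultdict

def compute_query_rewards(log_records):
    """Aggregate (query_id, impressions, engagements) rows into an engagement rate per query."""
    impressions = defaultdict(int)
    engagements = defaultdict(int)
    for query_id, n_impressions, n_engagements in log_records:
        impressions[query_id] += n_impressions
        engagements[query_id] += n_engagements
    # Queries with zero impressions are skipped: they have no reward yet.
    return {q: engagements[q] / impressions[q] for q in impressions if impressions[q] > 0}

# Hypothetical offline log rows for one user.
log_records = [
    ("recipes", 100, 0),
    ("furniture", 20, 4),
    ("furniture", 20, 6),
]
rewards = compute_query_rewards(log_records)
```

The resulting dictionary is what a later query selection step would look up by query.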

Here’s an example of how Query Reward works. Suppose a user has two PinnerSAGE clusters: one large cluster related to Recipes, and one small cluster related to Furniture. We initially show the user a lot of recipe pins, but the user doesn’t engage with them. Query Reward can capture that the Recipes cluster has many impressions but no future engagement. Therefore, the future reward, which is calculated from the engagement rate of the cluster, will gradually drop, and we will have a greater chance of selecting the small Furniture cluster. If we show the user a few Furniture pins and they engage with them, Query Reward will increase the likelihood that we select the Furniture cluster in the future. With the help of Query Reward, we’re able to build a feedback loop based on users’ engagement rates and better select the queries for candidate generation.
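The two-cluster example above can be put into numbers. The article does not say how the reward is combined with the cluster size, so the sketch below assumes a sampling weight of action count times engagement rate, with a small hypothetical `floor` so that a cluster with zero engagement is down-weighted rather than excluded. All values are made up.

```python
def selection_probabilities(action_counts, rewards, floor=0.01):
    """Cluster selection probabilities under the assumed weight: count * max(reward, floor)."""
    weights = {c: action_counts[c] * max(rewards[c], floor) for c in action_counts}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

# Hypothetical profile: one large Recipes cluster, one small Furniture cluster.
action_counts = {"recipes": 90, "furniture": 10}

# Before any feedback: both clusters share the same assumed engagement rate.
before = selection_probabilities(action_counts, {"recipes": 0.10, "furniture": 0.10})

# After feedback: Recipes got impressions but no engagement, Furniture was engaged.
after = selection_probabilities(action_counts, {"recipes": 0.0, "furniture": 0.25})
```

With these numbers, the Furniture cluster's selection probability rises from 0.10 to roughly 0.74, even though it holds far fewer actions than Recipes.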

Some clusters may not have any engagement (i.e., an empty Query Reward). This could be because:

  • The cluster was engaged with a long time ago, so it hasn’t had a chance to be selected recently
  • The cluster is a new use case for the user, so we don’t have much of a record in the reward

When clusters do not have any engagement, we give them an average weight so that there is still a chance for them to be exposed to users. After the next run of the Query Reward workflow, we will have more information about the previously unexposed clusters and can decide whether to select them next time.
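One way to read "average weight" is the mean of the rewards that are already known; that interpretation is an assumption, as are the function name and the `1.0` default used when no cluster has a reward at all. A minimal sketch:

```python
def fill_missing_rewards(cluster_ids, rewards):
    """Give clusters without a stored reward the mean of the known rewards."""
    known = [rewards[c] for c in cluster_ids if c in rewards]
    # Assumed default: if nothing is known yet, every cluster gets the same weight.
    default = sum(known) / len(known) if known else 1.0
    return {c: rewards.get(c, default) for c in cluster_ids}

# Hypothetical case: "travel" is a new use case with no reward record yet.
filled = fill_missing_rewards(
    ["recipes", "furniture", "travel"],
    {"recipes": 0.05, "furniture": 0.25},
)
```

Here the new cluster lands between the weakest and strongest known clusters, so it can still be sampled and pick up a real reward on the next workflow run.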

Graphic: Reward the new engagement to its query in offline workflow to Query Pins (repins, clicks, closeups) to Homefeed Recommendations to User (New Recommendations are generated from queries) to Future engagements (future repins, clicks, closeups) with Feedback Loop arrow in the center of the flow map.
Figure 3. Building a feedback loop based on Query Reward

Looking ahead, there are several directions we want to explore:

  • Pinterest, as a platform for bringing inspiration, wants to give Pinners recommendations that are as personalized as possible. Taking users’ downstream feedback into account, both positive and negative engagements, is what we want to prioritize. In future iterations, we will consider more engagement types beyond repins when building a user profile.
  • To maximize Pinterest usage efficiency, instead of relying on the offline Query Reward, we want to move to a realtime version that enriches the profiling signal across online requests. This would make the feedback loop more responsive and immediate, potentially responding to a user within the same Homefeed session as they browse.
  • Beyond pin-based retrieval, we can easily adopt a similar approach for any token-based retrieval method.

Thanks to our collaborators who contributed through discussions, reviews, and suggestions: Bowen Deng, Xinyuan Gui, Yitong Zhou, Neng Gu, Minzhe Zhou, Dafang He, Zhaohui Wu, Zhongxian Chen

To learn more about engineering at Pinterest, check out the rest of our Engineering Blog, and visit our Pinterest Labs site. To explore life at Pinterest, visit our Careers page.