April 15, 2024
Alex Deng
The Airbnb Tech Blog

10 min read

Dec 22, 2023

KDD (Knowledge Discovery and Data Mining) is a flagship conference in data science research. Hosted annually by a special interest group of the Association for Computing Machinery (ACM), it’s where you’ll learn about some of the most ground-breaking developments in data mining, knowledge discovery, and large-scale data analytics.

Airbnb had a significant presence at KDD 2023, with two papers accepted into the main conference proceedings and 11 talks and presentations. In this blog post, we’ll summarize our team’s contributions and share highlights from an exciting week of research talks, workshops, panel discussions, and more.

Although search ranking is a problem that researchers have been working on for decades, there are still many nuances to explore. For example, at Airbnb, guests typically search over a period of days or weeks, not minutes. And because Airbnb is a two-sided marketplace, there are factors, like the potential for hosts to cancel a booking, that we’d like to account for in ranking.

Optimizing Airbnb Search Journey with Multi-task Learning, our paper accepted at KDD 2023, presents Journey Ranker, a new multi-task deep learning model. The core insight here is that for this kind of long-term search task, we want to optimize for intermediate steps in the user journey.

The Journey Ranker base module assists guests in reaching positive milestones. There is also a Twiddler module that assists guests in avoiding negative milestones. The modules work off a shared feature representation of listing and guest context, and their output scores are combined.

Because of its modular design, Journey Ranker can be used whenever there are positive or negative milestones to consider. We’ve implemented it in various Airbnb search and other products to drive improvements in business metrics.
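To make the two-module idea concrete, here is a minimal sketch, not the production model. It assumes a plain feature vector in place of the learned shared representation, one linear head per milestone, and a combination rule (summing log-probabilities) chosen only for illustration; the function names, milestone names, and weights are all hypothetical.

```python
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(weights: List[float], features: List[float]) -> float:
    """Linear head: weighted sum of the shared features."""
    return sum(w * f for w, f in zip(weights, features))


def journey_score(
    shared: List[float],
    positive_heads: Dict[str, List[float]],
    negative_heads: Dict[str, List[float]],
) -> float:
    """Combine per-milestone probabilities into one ranking score.

    Assumed combination rule (illustrative only): add log p for every
    positive milestone and log (1 - p) for every negative milestone.
    """
    # "Base" part: reward listings likely to reach positive milestones.
    base = sum(math.log(sigmoid(dot(w, shared))) for w in positive_heads.values())
    # "Twiddler" part: penalize listings likely to hit negative milestones.
    twiddler = sum(
        math.log(1.0 - sigmoid(dot(w, shared))) for w in negative_heads.values()
    )
    return base + twiddler


# Hypothetical heads: the second feature only drives cancellation risk.
positive_heads = {"click": [1.0, 0.0], "booking": [0.5, 0.0]}
negative_heads = {"host_cancellation": [0.0, 2.0]}

reliable = journey_score([1.0, -1.0], positive_heads, negative_heads)
risky = journey_score([1.0, 1.0], positive_heads, negative_heads)
```

Both listings look identical to the positive heads, so the only difference in score comes from the negative-milestone head, and `reliable` ranks above `risky`.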

We also co-presented a tutorial on Data-Centric AI (DCAI). DCAI is a fast-growing field in deep learning because, as model design matures, innovation is increasingly driven by data. We shared DCAI best practices and trends for developing training data, developing inference data, maintaining data, and building benchmarks, with many examples from working with LLMs.

Online experimentation (e.g., A/B testing) is a common way for organizations like Airbnb to make data-driven decisions. But high variance is frequently a challenge. For example, it’s hard to prove that a change in our search UX will drive value when bookings are infrequent and depend on a large number of interactions over a long period of time.

Our paper Variance Reduction Using In-Experiment Data: Efficient and Targeted Online Measurement for Sparse and Delayed Outcomes presents two new methods for variance reduction that rely solely on in-experiment data:

  1. A framework for a model-based leading indicator metric that continually estimates progress toward a delayed binary outcome.
  2. A counterfactual treatment exposure index that quantifies how much a user is impacted by the treatment.

In testing, both methods achieved a variance reduction of 50% or more. These techniques have greatly improved our experimentation efficiency and impact.
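To see why a 50% variance reduction matters in practice, recall the standard sample-size formula for a two-sample z-test with equal arms: n = 2 (z_{1-α/2} + z_{power})² σ² / δ². The required sample size is linear in the metric variance σ², so halving the variance halves the traffic needed to detect the same effect δ. The sketch below only illustrates that textbook relationship; the variance and effect values are made up, and it does not implement either method from the paper.

```python
from statistics import NormalDist


def samples_per_arm(
    variance: float,
    min_effect: float,
    alpha: float = 0.05,
    power: float = 0.8,
) -> float:
    """Approximate sample size per arm for a two-sided two-sample z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    return 2 * (z_alpha + z_beta) ** 2 * variance / min_effect ** 2


# Illustrative numbers only: the same minimum detectable effect,
# measured with the original metric and with a metric that has half the variance.
baseline = samples_per_arm(variance=1.0, min_effect=0.01)
reduced = samples_per_arm(variance=0.5, min_effect=0.01)
ratio = reduced / baseline  # fraction of the original traffic still needed
```

Equivalently, for a fixed amount of traffic, the experiment can be read out sooner or can detect a smaller effect.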

With more than 50% variance reduction, the new model-based leading indicator metric (listing-view utility, on the right) aligns with the target uncancelled booking metric much better than other indicators such as listing-view with dates (on the left).

Another interesting challenge in online experimentation is avoiding interference bias, which can happen when your A/B test subjects compete with one another. Airbnb presented a keynote talk on this topic at KDD’s 2nd Workshop on Decision Intelligence and Analytics for Online Marketplaces. For example, if you ran an A/B test where group B saw lower booking prices, they might “cannibalize” the bookings from group A. There are two imperfect solutions: clustering (isolating the choices for participants) and switchbacks (grouping participants by time intervals).
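The two designs differ mainly in what the unit of randomization is. The following sketch shows one generic way to assign arms in each design using a salted hash; it is an illustration under stated assumptions, not Airbnb’s experimentation system. The cluster IDs, the 24-hour window, and the salt are hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def _bucket(key: str, salt: str) -> str:
    """Deterministically map a randomization unit to an arm."""
    digest = hashlib.sha256(f"{salt}:{key}".encode("utf-8")).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"


def cluster_assignment(cluster_id: str, salt: str = "exp-1") -> str:
    """Cluster design: every user in the same cluster (for example, a
    market) gets the same arm, so the two arms do not compete for the
    same inventory."""
    return _bucket(cluster_id, salt)


def switchback_assignment(
    ts: datetime, window_hours: int = 24, salt: str = "exp-1"
) -> str:
    """Switchback design: everyone active in the same time window gets
    the same arm, and the arm is re-drawn for each new window."""
    window_index = int(ts.timestamp()) // (window_hours * 3600)
    return _bucket(str(window_index), salt)


morning = datetime(2023, 8, 7, 1, 0, tzinfo=timezone.utc)
evening = datetime(2023, 8, 7, 23, 0, tzinfo=timezone.utc)
same_window = switchback_assignment(morning) == switchback_assignment(evening)
same_cluster = cluster_assignment("market-42") == cluster_assignment("market-42")
```

Both designs trade statistical power for less interference: there are far fewer clusters or time windows than users, which is one reason the solutions are described as imperfect.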

Also at the workshop, we presented the paper The Price is Right: Removing A/B Test Bias in a Marketplace of Expirable Goods. It discusses the problem of lead-day bias, where items like concert tickets, air travel, and Airbnb bookings vary in price based on the distance from their expiration date. This can wreak havoc on A/B tests, and in the paper we present several mitigation strategies, such as limited rollout, smart overlapping of experiments, and a Heterogeneous Treatment Effect (HTE) remixed estimator, to correct for bias and accelerate the R&D process.

Along with limited rollout and smart overlapping of experiments, the HTE-remixed estimator can provide a sufficiently robust estimate of the long-term experiment impact from the short-term result and significantly shorten the experiment run-time.

In marketing, the million-dollar question is: how much should you spend per channel? This can be reframed as a causal inference problem: how many incremental conversions does each channel drive?

When we look at marketing activities across Nielsen’s Designated Market Areas (DMAs), we find moderate to strong correlation across channels. This makes it hard to isolate the impact of one channel from another. In fact, when we include the correlated channels in the same regression, the coefficients flip signs for most channels, a clear sign of multicollinearity.

Existing solutions to multicollinearity, such as shrinkage estimators, principal component analysis, and partial linear regression, are particularly helpful for prediction problems but work less well for our use case, where we need to maintain business interpretability while isolating causality. Our approach, described in the paper Hierarchical Clustering as a Novel Solution to Multicollinearity, is to hierarchically cluster DMAs based on their similarity in marketing impressions over time. With such clustering, cross-channel correlation dropped by as much as 43% and the channel coefficients no longer flip signs.

Not only does our method provide an intuitive and effective solution to multicollinearity, it also circumvents the need for complex transformations and preserves the interpretability of the data and the results throughout, enabling broad application to causal inference problems.
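The clustering step itself can be sketched in a few lines. The example below is a generic average-linkage agglomerative clustering over correlation distance, written in plain Python for illustration; the choice of distance and linkage, the DMA names, and the impression numbers are all assumptions, and the way clusters feed into the downstream regression in the paper is not covered here.

```python
import math
from typing import Dict, List


def pearson(a: List[float], b: List[float]) -> float:
    """Pearson correlation between two equal-length series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def cluster_dmas(series: Dict[str, List[float]], n_clusters: int) -> List[List[str]]:
    """Merge the two closest clusters until n_clusters remain.

    Distance between DMAs is 1 - correlation of their impression series;
    distance between clusters is the average over all member pairs.
    """
    clusters = [[name] for name in series]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pair_total = sum(
                    1.0 - pearson(series[a], series[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = pair_total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Made-up weekly impressions for four hypothetical DMAs.
impressions = {
    "dma_a": [10.0, 20.0, 30.0, 40.0],
    "dma_b": [11.0, 19.0, 33.0, 41.0],
    "dma_c": [40.0, 30.0, 20.0, 10.0],
    "dma_d": [42.0, 29.0, 18.0, 12.0],
}
clusters = cluster_dmas(impressions, n_clusters=2)
```

DMAs whose impressions rise and fall together end up in the same cluster, which is the grouping the approach relies on.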

We presented this paper at the new KDD workshop, Causal Inference and Machine Learning in Practice: Use cases for Product, Brand, Policy, and beyond. Airbnb’s Totte Harinen co-organized this workshop, which strongly resonated with KDD’s audience: it had 12 papers and 4 invited talks from 37 authors at 14 institutions.

In addition, we were invited to present two talks and one poster at KDD’s 2nd Workshop on End-End Customer Journey Optimization, and joined the workshop’s panel discussion. One of these talks covered CLV (customer lifetime value) modeling. At Airbnb, we want to grow our brand and community by growing all users. Our CLV ecosystem applies two frameworks:

  1. The value of Airbnb customers. We use traditional ML approaches along with research into more customer-lifecycle-focused architectures (i.e., HMMs). We augment this with demand-supply incrementality modeling to properly account for guest and host contributions to value.
  2. The value growth that Airbnb delivers to customers. By accounting for long-term incremental effects of booking on Airbnb along with incremental contributions from marketing and attribution systems, we can measure incremental changes in CLV and optimize towards them.

Causal inference can also be applied to search. At the CJ workshop, we presented our paper Low Inventory State: Identifying Under-Served Queries for Airbnb Search, which explored the problem of searches that return a low number of results. Whether or not that number is “too low” and will deter a guest from booking depends on search parameters and intent to book. For a given search query, we can use causal inference to determine the incremental effect of an additional result on the probability of booking. Our model outperforms non-causal methods and can help with supply management as well.

Finally, our poster discussed how we measure the effects of national TV advertising campaigns. We analyzed TV exposure data and demographic data together with data on Airbnb onsite behavior using a third-party identity graph. We were able to resolve disparate datasets to a unique identifier and model individual households.

We use propensity score matching to estimate TV effects, and then scale these estimates to a nationally representative population. We leverage this data to provide tactical insights for marketing and to understand how long TV effects take to decay.

The plot above (from simulated studies for illustration) shows the results of an analysis for a TV campaign from August to October. We can see that the TV campaign was effective at increasing bookings for households that saw an Airbnb TV ad and was more effective for one subgroup (red line) than the other subgroup.
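For readers unfamiliar with propensity score matching, the sketch below shows the matching step in its simplest form: each exposed household is paired with the unexposed household whose propensity score is closest, and the outcome differences are averaged. It is an illustration only. It assumes the propensity scores have already been estimated, uses one-to-one nearest-neighbor matching with replacement and a caliper, and omits the scaling to a national population; the `Household` fields and all numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Household:
    propensity: float  # estimated P(exposed | covariates), assumed precomputed
    exposed: bool      # whether the household saw the TV ad
    booked: float      # outcome, e.g. 1.0 if the household booked


def matched_effect(households: List[Household], caliper: float = 0.05) -> float:
    """Average effect on exposed households via nearest-neighbor matching."""
    exposed = [h for h in households if h.exposed]
    controls = [h for h in households if not h.exposed]
    if not controls:
        raise ValueError("need at least one unexposed household to match against")
    diffs = []
    for h in exposed:
        # Closest unexposed household by propensity score (with replacement).
        match = min(controls, key=lambda c: abs(c.propensity - h.propensity))
        # Skip exposed households with no sufficiently similar control.
        if abs(match.propensity - h.propensity) <= caliper:
            diffs.append(h.booked - match.booked)
    if not diffs:
        raise ValueError("no exposed household has a control within the caliper")
    return sum(diffs) / len(diffs)


# Made-up data: the last exposed household has no comparable control.
households = [
    Household(0.30, True, 1.0),
    Household(0.31, False, 0.0),
    Household(0.70, True, 1.0),
    Household(0.69, False, 1.0),
    Household(0.95, True, 0.0),
]
effect = matched_effect(households)
```

The caliper drops exposed households that have no similar unexposed counterpart, which keeps poor matches from distorting the estimate at the cost of a smaller matched sample.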

How can you achieve science at scale in a medium-to-large engineering organization? At KDD’s 2nd Workshop on Applied Machine Learning Management, we shared Airbnb’s solution for data science reproducibility and reuse, Onebrain. The core of Onebrain is a coding standard for configuring data science projects entirely in YAML. Onebrain’s backend abstracts away CI/CD, configuration/dependency management, and command-line parsing. Since it’s “just code,” Onebrain projects can be checked into a version-controlled repo, and any repo can be a Onebrain repo.

User interaction with Onebrain happens through a CLI. With a single command, anyone can use an existing project as a template for their own work, or generate a one-click URL to spin up a server and run the project. Usage is growing fast, with over 200 distinct projects and over 500 users at Airbnb within just a year.

While most of our research focuses on high-order data use cases like models, data capture is essential since it’s the starting point for any analysis. Event logging libraries typically capture actions on and impressions of app components (buttons, sections, pages). But at this level of granularity, it can be difficult to abstract out user behavior, measure the total time spent on a surface, or understand the context surrounding an action.

At the 2nd Workshop on End-End Customer Journey Optimization, we spoke about a new type of client-side event called Sessions. Part of Airbnb’s client-side logging solution, Sessions provide a way to track user context and behaviors within the Airbnb product. Unlike traditional time-based sessions used in web analytics, these Sessions can be tied to various aspects of the Airbnb user experience. For example, they can be tied to specific surfaces like the checkout page, API calls used for observability, or even internal states of the app that abstract away complex UI components. The flexibility of Sessions allows us to capture a wide range of user interactions and better understand their journey throughout our platform.
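As a rough illustration of why surface-scoped sessions simplify analysis, the sketch below models a session as a pair of start and end events and derives the total time spent on a surface from them. The event schema, field names, and session types are hypothetical and are not Airbnb’s logging format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SessionEvent:
    session_type: str  # what the session is tied to, e.g. "checkout_page" (hypothetical)
    session_id: str    # identifies one start/end pair
    phase: str         # "start" or "end"
    timestamp_ms: int  # client timestamp in milliseconds


def time_on_surface_ms(events: List[SessionEvent], session_type: str) -> int:
    """Sum the durations of all completed sessions of one type."""
    open_sessions: Dict[str, int] = {}  # session_id -> start timestamp
    total = 0
    for event in sorted(events, key=lambda e: e.timestamp_ms):
        if event.session_type != session_type:
            continue
        if event.phase == "start":
            open_sessions[event.session_id] = event.timestamp_ms
        elif event.phase == "end" and event.session_id in open_sessions:
            total += event.timestamp_ms - open_sessions.pop(event.session_id)
    return total


# Made-up event stream: two checkout sessions and one search session.
events = [
    SessionEvent("checkout_page", "s1", "start", 1000),
    SessionEvent("search_results", "s2", "start", 2000),
    SessionEvent("search_results", "s2", "end", 2500),
    SessionEvent("checkout_page", "s1", "end", 4000),
    SessionEvent("checkout_page", "s3", "start", 5000),
    SessionEvent("checkout_page", "s3", "end", 7000),
]
checkout_ms = time_on_surface_ms(events, "checkout_page")
```

With only button-level action and impression events, the same question would require inferring where each visit to a surface begins and ends.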

KDD is an amazing opportunity for data scientists from around the world, and across industry and academia, to come together and exchange learnings and discoveries. We were honored to be invited to share techniques we’ve developed through applied research at Airbnb. The techniques and insights we presented at KDD have been essential to improving Airbnb’s platform, business, and user experience. We’re constantly motivated by innovations happening around us, and we’re thrilled to give back to the community and eager to see what kinds of new applications and developments may come about as a result.

At the bottom of the page, you’ll find a complete list of the talks and papers shared in this article, along with the team members who contributed. If you can see yourself on our team, we encourage you to apply for an open role today.

Optimizing Airbnb Search Journey with Multi-task Learning [link]

Authors: Chun How Tan, Austin Chan, Malay Haldar, Jie Tang, Xin Liu, Mustafa Abdool, Huiji Gao, Liwei He, Sanjeev Katariya

Variance Reduction Using In-Experiment Data: Efficient and Targeted Online Measurement for Sparse and Delayed Outcomes [link]

Authors: Alex Deng, Michelle Du, Anna Matlin, Qing Zhang

Beyond the Simple A/B Test: Mitigating Interference Bias at Airbnb

Speaker: Ruben Lobel

The Price is Right: Removing A/B Test Bias in a Marketplace of Expirable Goods [link]

Authors: Thu Le, Alex Deng

Unveiling the Guest & Host Journey: Session-Based Instrumentation on Airbnb Platform

Speaker: Shant Torosean

Dedicated to Long-Term Journey: Growing Airbnb By Measuring Customer Lifetime Value

Speakers: Sean O’Donell, Jason Cai, Linsha Chen

Low Inventory State: Identifying Under-Served Queries for Airbnb Search [link]

Authors: Toma Gulea, Bradley Turnbull

Measuring TV Campaigns at Airbnb

Speakers: Adam Maidman, Sam Barrows

Tutorial: Data-Centric AI [link]

Presenters: Daochen Zha, Huiji Gao

Hierarchical Clustering As a Novel Solution to the Notorious Multicollinearity Problem in Observational Causal Inference [link]

Authors: Yufei Wu, Zhiying Gu, Alex Deng, Jacob Zhu, Linsha Chen

Onebrain — Microprojects for Data Science [link]

Authors: Daniel Miller, Alex Deng, Narek Amirbekian, Navin Sivanandam, Rodolfo Carboni