Search & Recommendation at Airbnb (2015-2023)

Tags
Data Science
Date Published
May 16, 2023
💡 Note: The first two sections summarize how Airbnb approaches the search & recommendation problem and the evolution and impact of its solutions; the third section focuses on the detailed machine learning solutions.

Problem Definition

Optimization Objective

The optimization objective is set to focus on bookings (or conversions):

  1. Prioritize listings that are appealing to the guest while demoting listings whose hosts would likely reject the guest

Optimization Strategy

To achieve the above objective, there are 2 stages for the optimization strategy:

  1. First stage (before 2020): treat every listing independently and assume that the booking probability of a listing can be determined independently of the other listings in the search results.
  2. Second stage (after 2020): optimize over all listings jointly, treating the result list as a whole, and try to diversify the results.

Key Metrics

  1. Offline evaluation
    1. NDCG (Normalized Discounted Cumulative Gain)
  2. Online test
    1. Main metrics:
      1. booking
      2. revenue
    2. Other metrics, such as engagement metrics: listings viewed, listings saved
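As a rough sketch (not Airbnb's actual code), the offline NDCG metric for a single ranked result list could be computed like this, using the session labels (e.g. booking = 1) as relevance:

```python
import math

def ndcg(relevances, k=None):
    """Normalized Discounted Cumulative Gain for one ranked result list."""
    rels = relevances[:k] if k else relevances
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rels))
    ideal = sorted(relevances, reverse=True)
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal[:k] if k else ideal))
    return dcg / idcg if idcg > 0 else 0.0

# A booking shown at rank 3 scores lower than the ideal ordering (booking first).
print(ndcg([0, 0, 1, 0]))  # 0.5
print(ndcg([1, 0, 0, 0]))  # 1.0
```

A ranker that moves booked listings toward the top of the result list increases NDCG, which is why it serves as the offline proxy for the online booking metric.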

Structure of Solutions

Classic two-step recommendation system:

  • Step 1: Retrieval (or candidate generation):
    1. Location-based queries (e.g. New York): rule-based, fetch everything nearby
    2. Broad queries (e.g. skiing in France): model-based
  • Step 2: Ranking of listings:
    1. Model-based: uses query, user, and listing features
    2. Trained on user actions: clicks, wishlists, inquiries, bookings, and rejections
    3. Trades off guest and host preferences

Figure: two-step recommendation system

Problems, Solutions & Impacts

The table below shows the detailed evolution process. The following steps were mentioned to have brought significant booking gains:

  1. Replacing the manual scoring function with a gradient boosted decision tree (GBDT) model in 2015
  2. Replacing GBDT regression model with GBDT ranker model in 2017
  3. In-session recommendation with hotel embedding in 2018
  4. Replacing GBDT ranker with deep learning in 2020
Table: evolution process details

ML Solutions in Details

Overview - How do they iterate?

Figure: how they iterate

From Manual Scores to GBDT Model (2015 - 2017)

  1. Assign labels to different feedback from a search session
    1. booking: 1
    2. click: 0.01
    3. impression: 0
    4. reject: -0.4
  2. Two ways of training: point-wise and pair-wise training
Figure: features used
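The label scheme above could be sketched as a simple mapping from session feedback to training labels (the values are the ones from the notes; the function name is illustrative):

```python
# Label assignment for search-session feedback, as described in the notes.
LABELS = {"booking": 1.0, "click": 0.01, "impression": 0.0, "reject": -0.4}

def label_session(events):
    """Map each (listing_id, feedback) pair in a session to a training label."""
    return [(listing, LABELS[feedback]) for listing, feedback in events]

session = [("l1", "impression"), ("l2", "click"), ("l3", "booking")]
print(label_session(session))  # [('l1', 0.0), ('l2', 0.01), ('l3', 1.0)]
```

The negative label for rejections is how host preferences enter the objective: listings likely to reject a guest are actively pushed down.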

GBDT Regression Model

  1. Point-wise training
  2. Loss: RMSE

GBDT Ranker

  1. Pair-wise training
  2. Within the same search session, generate listing pairs
  3. Loss: cross entropy
  4. Goal: push good outcomes to the top and, at the same time, bad outcomes to the bottom.
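A minimal sketch of the pair-wise setup (assumed details, not Airbnb's implementation): within one search session, every pair of listings with different labels becomes a training example, and the loss is cross entropy on the score difference.

```python
import math

def pairwise_logistic_loss(score_pos, score_neg):
    """Cross entropy on the score difference: the higher-labeled listing
    of a pair should outscore the lower-labeled one."""
    diff = score_pos - score_neg
    return math.log(1.0 + math.exp(-diff))

def session_pairs(scored):
    """Generate (higher-label score, lower-label score) pairs within a session.
    `scored` is a list of (model_score, label) tuples."""
    pairs = []
    for i, (s_i, y_i) in enumerate(scored):
        for s_j, y_j in scored[i + 1:]:
            if y_i > y_j:
                pairs.append((s_i, s_j))
            elif y_j > y_i:
                pairs.append((s_j, s_i))
    return pairs
```

The loss shrinks as the preferred listing outscores the other, so minimizing it over all in-session pairs orders bookings above clicks and clicks above rejections.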

There are some modifications.

From GBDT to Deep Learning Model (2018)

Failed Models

  1. Listing ID as sparse feature of the deep learning model
  2. Multi-task learning model to predict both booking and views

Hotel Embedding & Query Embedding

In Session Recommendation with Hotel Embedding

  1. Steps
    1. Offline training: based on users' search sessions and skip-gram (word2vec), learn listing embeddings
    2. Online in-session recommendation: memorize listings that the user liked or clicked, then recommend similar listings based on this information
  2. Special modifications
    1. Better negative-sampling: instead of sampling globally, sample negative listings from the same city
    2. More weight to booked listings: in the booking sessions, the booked listing will always be the context listing
    3. Cold start of new listings:
      1. We use the provided meta-data about the listing to find 3 geographically closest listings (within a 10 miles radius) that have embeddings, are of same listing type as the new listing (e.g. Private Room) and belong to the same price bucket as the new listing (e.g. $20 − $25 per night).
      2. Next, we calculate the mean vector using 3 embeddings of identified listings to form the new listing embedding.
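The cold-start rule quoted above could be sketched as follows. The data layout and the `distance_miles` helper are assumptions for illustration; only the rule itself (3 nearest same-type, same-price-bucket listings within 10 miles, averaged) comes from the source.

```python
import math
import numpy as np

def distance_miles(a, b):
    # Toy planar distance stand-in; a real system would use haversine on lat/lng.
    return math.dist(a, b)

def cold_start_embedding(new_listing, candidates, k=3, radius_miles=10.0):
    """Mean of the embeddings of the k geographically closest listings that
    share listing type and price bucket with the new listing."""
    similar = [
        c for c in candidates
        if c["type"] == new_listing["type"]
        and c["price_bucket"] == new_listing["price_bucket"]
        and distance_miles(c["location"], new_listing["location"]) <= radius_miles
    ]
    similar.sort(key=lambda c: distance_miles(c["location"], new_listing["location"]))
    return np.mean([c["embedding"] for c in similar[:k]], axis=0)
```

Averaging neighbor embeddings places the new listing in a sensible region of the embedding space before it has any engagement data of its own.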

Query Embedding

  1. Steps
    1. Offline training: based on users' search query sequences and skip-gram (word2vec), learn query embeddings
    2. Online free search: autocomplete users' free search terms
  2. Example:
    Figure: users' search query sequence
    Figure: example of free search

Simple NN

  1. Simple single-hidden-layer NN with 32 fully connected ReLU activations
  2. Features: same features as the GBDT
  3. Loss: L2 regression loss
  4. Target:
    1. 1 -> booking
    2. 0 -> not booking
  5. Performance: neutral on bookings compared to the GBDT
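The architecture above is small enough to sketch directly in NumPy (an illustration of the described shape, not the production code):

```python
import numpy as np

rng = np.random.default_rng(0)

def init(n_features, hidden=32):
    """One hidden layer of 32 ReLU units, scalar output."""
    return {
        "W1": rng.normal(0, 0.1, (n_features, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0, 0.1, (hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, X):
    h = np.maximum(0.0, X @ params["W1"] + params["b1"])  # ReLU activations
    return h @ params["W2"] + params["b2"]

def l2_loss(params, X, y):
    # Regression targets: 1 for booked listings, 0 otherwise.
    pred = forward(params, X).ravel()
    return np.mean((pred - y) ** 2)
```

Since the features and target are the same as the GBDT's, this model mainly served to validate the deep learning pipeline, which matches the reported neutral booking result.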

Lambdarank NN

  1. Pair-wise NN
  2. Loss: cross entropy
  3. Target: same as previous model

NN model with features from GBDT and FM

  1. The production model is an NN, with two extra feature sources:
    1. From the GBDT model, the index of the leaf node activated per tree is taken as a categorical feature.
    2. A factorization machine (FM) model predicts the booking probability of a listing given a query by mapping listings and queries to a 32-dimensional space; its output is fed to the NN.
Figure: model structure
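Extracting GBDT leaf indices as categorical features can be demonstrated with scikit-learn's `apply` method (a generic sketch of the technique on synthetic data, not Airbnb's stack):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

gbdt = GradientBoostingRegressor(n_estimators=10, max_depth=3).fit(X, y)
# Index of the leaf activated in each tree -> one categorical feature per tree.
leaves = gbdt.apply(X)  # shape: (n_samples, n_estimators)
leaf_features = OneHotEncoder(handle_unknown="ignore").fit_transform(leaves)
```

Each tree partitions the feature space, so the activated leaf index is a learned non-linear bucketing of the raw features that the NN can consume as a categorical input.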

Deep NN

  1. What did they do?
    1. Scaled the training data 10x
    2. Moved to a DNN with 2 hidden layers
  2. Some special features, output from other models:
    1. Price of listings that have the Smart Pricing feature enabled, supplied by a specialized model
    2. Similarity of the listing to the user's past views, computed from co-view embeddings (hotel embeddings)

Improve Deep Learning Model (2020)

Two Tower Architecture

Figure: two-tower architecture

Cold Start for new Listings

  1. Main problem:
    1. The absence of user-generated engagement features such as number of bookings, clicks, reviews, etc.
  2. Solution
    1. Predict engagement features
    2. Feed these features to NN model
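The two solution steps above could be sketched as a feature-filling pass before ranking (the feature names and the `predict` callable are hypothetical):

```python
ENGAGEMENT_FEATURES = ("num_bookings", "num_clicks", "num_reviews")

def fill_engagement(listing_features, predict):
    """For a new listing, replace missing engagement features with estimates
    from a separate model (`predict` is a hypothetical callable:
    (features, feature_name) -> estimated value) before feeding the ranking NN."""
    filled = dict(listing_features)
    for name in ENGAGEMENT_FEATURES:
        if filled.get(name) is None:
            filled[name] = predict(filled, name)
    return filled
```

The ranking model then sees a complete feature vector for every listing, whether new or established.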

Position Bias

  1. Solution
    1. Introduce position as a feature; at inference time this feature is set to 0
    2. To reduce the model's dependence on this feature, it is regularized by dropout during training. The dropout rate is set to 0.15 based on offline evaluation
Figure: offline evaluation used to choose the dropout rate
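The position-feature treatment above could be sketched as follows (an illustration of the described scheme, not the production implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def position_feature(positions, training, dropout_rate=0.15):
    """Position as an input feature: randomly zeroed during training to limit
    the model's reliance on it, and fixed to 0 at inference so every listing
    is scored as if shown at the top position."""
    pos = np.asarray(positions, dtype=float)
    if not training:
        return np.zeros_like(pos)
    mask = rng.random(pos.shape) >= dropout_rate
    return pos * mask
```

Zeroing the feature at inference removes the position bias from the scores, while the training-time dropout keeps the model from leaning too heavily on position to explain clicks.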

Optimize on All Listings, Improve Diversity (2023)

  1. The model is still a two-tower DNN
  2. Instead of building a single model, build models for each position
    1. Position 0: reuse the regular pairwise model
    2. For positions 1 to N: the model takes an additional input, the listings already placed at positions 0 to N-1
Figure: algorithm explanation
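The per-position scheme above suggests a greedy ranking loop like the following sketch, where each position's model is conditioned on the listings already placed (the `models[k]` callables are hypothetical stand-ins for the trained per-position networks):

```python
def rank_with_positional_models(candidates, models):
    """Greedy list construction: the model for position k scores each remaining
    candidate conditioned on the listings placed at positions 0..k-1.
    Each models[k] is a callable (candidate, placed) -> score."""
    placed, remaining = [], list(candidates)
    for score in models:
        if not remaining:
            break
        best = max(remaining, key=lambda c: score(c, placed))
        remaining.remove(best)
        placed.append(best)
    return placed + remaining
```

Because later models see what is already at the top of the list, they can down-score near-duplicates of earlier picks, which is how conditioning on previous positions yields diversity.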

Tricks and Lessons Learned from Airbnb

While modelling, they consider the specific properties of their platform:

  1. Two-sided marketplace: they need to meet the needs of both guests and hosts
    1. Data label: -0.4 for rejected listings
  2. A listing can only be booked by one guest at a time
    1. Including the listing ID as a feature might cause overfitting
  3. A user never books the same listing twice
    1. Hence in-session recommendation

Training Data

Figure: early attribution of search sessions (2017)
Figure: training data construction for the model at position N (2023)

How to iterate

To improve our chances of success, we abandoned the {download paper → implement → A/B test} loop. Instead we decided to drive the process based on a very simple principle: users lead, model follows.

Next Step

  1. Not all of the topics are covered in this summary. For example:
    1. Airbnb Categories
      1. Building Airbnb Categories with ML and Human-in-the-Loop - Part 1
      2. Building Airbnb Categories with ML & Human in the Loop - Part 2
    2. More work related to diversity
      1. Managing Diversity in Airbnb Search (2020)
  2. Search & recommendation at Booking.com, Agoda, …
