Embedding in Recommender - Part 3: Practices from the Hospitality Sector


Tags
Data Science
Date Published
April 21, 2024
Blogs from the series Embedding in Recommender

Introduction

As shown in the first blog of this series, embedding techniques such as Item2Vec are borrowed from natural language processing. To apply these techniques in real-world scenarios, adapting them to the business context is crucial.

In this blog, we’ll take the hospitality sector as an example to show these innovative adaptations in practice.

Several Business Contexts

First, let’s go through a few business contexts of the hospitality sector. Most of the changes to the embedding techniques described later are related to these contexts.

1. Unlike words in NLP, hotels come with extra meta-data

For example

  1. Hotel attributes: property type, star rating, average user rating …
  2. Amenity information: free breakfast, free wi-fi …
  3. Geographic information

So why not also encode this information into the embeddings?

2. Cold-start problem for hotels

First, there are always new hotels. Second, as mentioned in part 1, user click sequences are usually used to learn item embeddings. However, there are always hotels that are never clicked by users, or clicked only a few times.

Then how to get a good embedding for these cold-start hotels?

3. Users usually search for properties from the same market

When a user is searching for properties in London, properties from Amsterdam obviously shouldn’t be shown.

So how does this influence the Item2Vec technique, and how can we utilize this information in Item2Vec?

4. Less user historical data compared to e-commerce

Due to the nature of the travel business, where users travel 1-2 times per year on average, bookings are a sparse signal, with a long tail of users with a single booking. This means there are fewer click or booking data points related to any specific user.

Then, with so little data, how can we learn good enough embeddings?

5. Changing user preferences that are hard to capture

Users’ preferences change over time, which is also related to the previous point.

  1. In the current search session, the user might be looking for a hotel for business travel or family travel, but all the data you have is this user’s last booking, which happened 6 months ago. How do you know this user’s short-term preference?
  2. On the other hand, users have long-term preferences. For example, some users prefer relatively cheap properties and will always search for this kind of property. However, if there is only one booking record from this user so far, how can we detect this long-term preference?

6. Platform-specific contexts

For example,

  1. Expedia: within Expedia Group, there are several brands, such as Hotels.com and Expedia, which serve different user profiles. If we can use data from all of these brands, we might get better embeddings. Then how can we learn a common embedding that, at the same time, doesn’t degrade the performance of any of the brands?
  2. Airbnb:
    1. At a regular hotel, there are multiple rooms of the same type, and a hotel can be booked several times for a specific date. At Airbnb, however, there are no identical listings, and every listing can only be booked once for a given day
    2. Unlike a normal hotel platform, you can be rejected by a host, which is actually an important signal of host preference. How do we include this signal in the embedding?

Hotel Embedding Based on Item2Vec

Item2Vec is one of the most widely adopted techniques in industry, and based on it, we can get hotel embeddings.

Here is a quick recap about Item2Vec:

  1. Optimization goal: learn item similarity; the embedding vectors of similar items should be close in the vector space
  2. Data needed: sequence data, i.e. user click/booking sequences
  3. Method used: skip-gram with negative sampling

In this section, we’ll list the main changes to Item2Vec in each work. And feel free to re-visit the details of Item2Vec here.

Listing Embedding from Airbnb (2018)

Changes to Item2Vec

  1. Due to business context 3, most of the time we don’t need the hotel embedding to tell us that a hotel in Amsterdam is different from a hotel in London. So while doing negative sampling, there are two types of negative samples
    1. Soft negatives: randomly sampled from hotels all over the world
    2. Hard negatives: randomly selected from hotels in the same market
  2. Due to business context 4, it’s important to utilize the available data more efficiently. For example, a booking is a stronger signal than a click or an impression, so it’s better to treat them differently and assign more importance to the booked hotel
    1. So instead of being a local context word from the same context window as in Word2Vec, the booked hotel is treated as a global context word
  3. Based on business contexts 1 and 2, by combining hotel meta-data and embeddings, we can get embeddings for cold-start hotels. In this case, geographic location, listing type and price bucket are used. When there is a new listing (see the sketch after this list):
    1. Find the three geographically closest listings that already have embeddings, are of the same listing type and belong to the same price bucket as the new listing
    2. Calculate the mean vector of the embeddings of these 3 listings; this is the embedding of the new listing
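To make the cold-start step concrete, here is a minimal sketch in Python. The attribute names (`lat`, `lng`, `listing_type`, `price_bucket`) and the plain-dictionary storage are assumptions for illustration, not Airbnb’s actual data model.

```python
import numpy as np

def cold_start_embedding(new_listing, listings, embeddings, k=3):
    """Approximate an embedding for a listing that has no click data yet.

    new_listing / listings[lid]: dicts with 'lat', 'lng', 'listing_type'
    and 'price_bucket' (hypothetical attribute names).
    embeddings: {listing_id: np.ndarray} learned from click sessions."""
    # Keep only listings that already have an embedding and share the
    # same listing type and price bucket as the new listing.
    candidates = [
        (lid, attrs) for lid, attrs in listings.items()
        if lid in embeddings
        and attrs["listing_type"] == new_listing["listing_type"]
        and attrs["price_bucket"] == new_listing["price_bucket"]
    ]

    # Sort by geographic distance; squared lat/lng distance is enough for
    # a sketch (a real system would use haversine distance).
    def dist(attrs):
        return ((attrs["lat"] - new_listing["lat"]) ** 2
                + (attrs["lng"] - new_listing["lng"]) ** 2)

    candidates.sort(key=lambda pair: dist(pair[1]))

    # The mean vector of the k closest neighbours becomes the embedding
    # of the new listing.
    nearest = [embeddings[lid] for lid, _ in candidates[:k]]
    return np.mean(nearest, axis=0)
```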

Performance

Here are two examples of similar listings

[Images: two groups of similar listings]

In the original paper, it’s mentioned that

While some listing characteristics, such as price, do not need to be learned because they can be extracted from listing meta-data, other types of listing characteristics, such as architecture, style, and feel, are much harder to extract in the form of listing features.

Looking at the above examples, we can see that although the model never sees visual information, it still learns it.

Application → Detect User’s Short-term Preference in Real Time

While a user is searching, based on the hotels he/she has just clicked or viewed, we can find similar hotels with the embedding technique and then recommend these hotels to the user.
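A minimal sketch of that real-time lookup, assuming listing embeddings are already available as rows of a matrix: the session vector is simply the mean of the embeddings of the listings clicked so far, and candidates are ranked by cosine similarity (names and shapes are illustrative, not Airbnb’s production pipeline).

```python
import numpy as np

def recommend_similar(clicked_ids, listing_ids, emb_matrix, top_n=5):
    """clicked_ids: listings the user clicked in the current session.
    listing_ids: all listing ids, aligned with the rows of emb_matrix.
    emb_matrix: (num_listings, dim) array of listing embeddings."""
    id_to_row = {lid: i for i, lid in enumerate(listing_ids)}

    # Short-term intent = mean embedding of the recently clicked listings.
    query = emb_matrix[[id_to_row[c] for c in clicked_ids]].mean(axis=0)

    # Cosine similarity between the session vector and every listing.
    norms = np.linalg.norm(emb_matrix, axis=1) * np.linalg.norm(query)
    scores = emb_matrix @ query / np.maximum(norms, 1e-12)

    # Exclude listings already clicked and return the best remaining ones.
    clicked = set(clicked_ids)
    order = np.argsort(-scores)
    ranked = [listing_ids[i] for i in order if listing_ids[i] not in clicked]
    return ranked[:top_n]
```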

Hotel2Vec from Expedia (2020)

Changes to Item2Vec

Expedia adopts a similar methodology to Airbnb’s, but introduces a novel method to leverage hotel meta-data.

  1. Based on business context 1, the work from Expedia tries to include hotel meta-data in the hotel embedding
    1. Hotel meta-data here mainly refers to amenity information and geographic information
    2. The embedding extracted from the user click sequence is referred to as the click embedding. By combining the click embedding, amenity embedding and geo embedding, an enriched embedding is introduced.
  2. For the cold-start hotel problem (business context 2), this unified hotel embedding also helps a lot. Here is how it works:
    1. First, following the same steps as Airbnb, get the click embedding based on three existing hotels
    2. Then feed the click embedding, amenity info and geo info to the model to get the enriched embedding
  3. In terms of business context 3, a similar method to Airbnb’s is adopted to select negative samples from the same market
Instead of the click embedding from the Airbnb work, an enriched embedding is created by combining the click embedding, amenity embedding and geo embedding, as sketched below.
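As a rough sketch of the idea (not Expedia’s exact architecture), the enriched embedding can be viewed as a learned projection of the concatenated click, amenity and geo vectors. The single projection layer and the dimensions below are assumptions for illustration.

```python
import torch
import torch.nn as nn

class EnrichedEmbedding(nn.Module):
    """Combine click, amenity and geo embeddings into one hotel vector."""

    def __init__(self, click_dim=32, amenity_dim=16, geo_dim=8, out_dim=32):
        super().__init__()
        # One projection layer over the concatenated vectors; the real
        # model may use a deeper network and different dimensions.
        self.proj = nn.Sequential(
            nn.Linear(click_dim + amenity_dim + geo_dim, out_dim),
            nn.ReLU(),
        )

    def forward(self, click_emb, amenity_emb, geo_emb):
        combined = torch.cat([click_emb, amenity_emb, geo_emb], dim=-1)
        return self.proj(combined)

# For a cold-start hotel, click_emb can be the mean of neighbouring hotels'
# click embeddings (as in the Airbnb approach), while the amenity and geo
# embeddings come directly from the hotel's own meta-data.
```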

Performance & Application

In offline evaluation, this enriched embedding has been shown to boost the performance of the next-item prediction task, especially for cold-start hotels.
In the online test, hotel embeddings are used as features for a DNN-based ranker with a pairwise loss, and the test shows improved CVR and gross profit.

Hotel Embedding Space Alignment from Expedia (2022)

Changes to Item2Vec

In business context 6.a, a multi-brand scenario from Expedia is introduced. Since these brands have different traveller profiles and local contexts, the embedding vectors learned separately from their user click sequence data might not lie in the same vector space. This means that if we apply hotel embeddings learned from Expedia data to Hotels.com, we will see degraded performance. So the goal is to learn a common embedding to support the recommendation tasks of multiple brands.

The solution is domain alignment: a regularizer is added to the objective function while training on the target domain. This regularizer forces the embeddings of the same hotel from the source domain and the target domain to be as close as possible, hence performing domain adaptation.

There is a source domain (Hotels.com) and a target domain (Expedia). For a hotel $h_i \in H$, the embeddings learned from these domains are represented as $V_{h_i}^S$ and $V_{h_i}^T$. Here are the steps (a code sketch follows them):

  1. Learn $V_{h_i}^S$ based on the Hotel2Vec method introduced in the previous section.
  2. Learn $V_{h_i}^T$. If $J(\theta)$ represents the standard loss function for Item2Vec (please re-visit part 1 for a recap), then the loss function for the target domain becomes
    1. $J_t(\theta) = J(\theta) + \lambda \lVert V_{h_i}^T - V_{h_i}^S \rVert_2$
    2. where $\lambda$ defines how much knowledge we want to transfer from the source domain to the target domain
  3. Several details
    1. While training the embeddings for the target domain, the embeddings from the source domain are not updated
    2. The regularization weight is defined globally instead of at the hotel level, which might be a direction for future work
    3. To avoid noise, the alignment is learned only on hotels common to both domains
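A minimal sketch of that regularized objective in PyTorch-style Python, where `item2vec_loss` is a stand-in for the standard skip-gram loss $J(\theta)$ already computed on the target-domain sequences; everything else (names, the mean over common hotels) is assumed for illustration.

```python
import torch

def aligned_loss(item2vec_loss, target_emb, source_emb, common_ids, lam=0.1):
    """item2vec_loss: scalar tensor, the standard skip-gram loss J(theta)
    already computed on target-domain (Expedia) click sequences.
    target_emb: torch.nn.Embedding being trained on the target domain.
    source_emb: torch.nn.Embedding learned on the source domain (Hotels.com),
    kept frozen during target-domain training.
    common_ids: LongTensor of hotel ids present in both domains."""
    v_t = target_emb(common_ids)              # trainable target vectors
    v_s = source_emb(common_ids).detach()     # frozen source vectors
    # L2 distance between the two embeddings of the same hotel, averaged
    # over common hotels (a single global regularization term).
    alignment = torch.norm(v_t - v_s, dim=1).mean()
    # lam controls how much knowledge is transferred from the source domain.
    return item2vec_loss + lam * alignment
```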

Performance

Checking the performance on the next-item recommendation task, here are the conclusions:

  1. The proposed method can align the domains while achieving better performance for both brands
  2. In addition, the embedding can be learned quickly

Embedding for Everything: User, Search Query, Travel Concept, User_type …

Collaborative Filtering Embedding from Agoda (2018)

In part one of this series, collaborative filtering embedding was introduced, and it also appeared as one of the methods for embedding-based candidate generation in part two. As mentioned previously, Matrix Factorization (MF) is one of these methods, and with it we can get user embeddings and item embeddings at the same time.

In one of Agoda’s works, MF was utilized to learn hotel embeddings and user embeddings. For the first version, there weren’t many changes compared to standard MF: the numerical method of Alternating Least Squares (ALS) was applied to get the user and hotel embeddings.

Then business context 3 is mentioned:

We don’t really need a data science model to tell us that hotels in Bangkok are probably different from hotels in Helsinki.

Then the 2nd iteration begins. The assumption is that by removing the obvious city information from the model, the model can focus on the most important information, and we can get better embeddings.

However, in ALS, the user-item matrix is decomposed into a user matrix and an item matrix, which means that the embeddings are learned globally, and it’s hard to remove the city info with ALS.

So the pairwise methodology Bayesian Personalized Ranking (BPR) is introduced. For training, tuples of $(user, hotel_{negative}, hotel_{positive})$ are needed, and while sampling negatives, rules are set to make sure negative samples come from the same city. In this case, the model only needs to learn the similarities and differences between hotels from the same city. This is actually very similar to what Airbnb and Expedia did when sampling negatives from the same market.
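A minimal sketch of that same-city negative sampling together with the BPR loss; the data structures (`interactions`, `hotels_by_city`) and the plain numpy implementation are assumptions for illustration, not Agoda’s actual pipeline.

```python
import numpy as np

def sample_bpr_triples(interactions, hotels_by_city, n_samples, rng):
    """interactions: list of (user_id, hotel_id) positive pairs, e.g. bookings.
    hotels_by_city: {city: [hotel_id, ...]}.
    rng: a numpy Generator, e.g. np.random.default_rng(0)."""
    hotel_city = {h: c for c, hs in hotels_by_city.items() for h in hs}
    triples = []
    for _ in range(n_samples):
        user, pos = interactions[rng.integers(len(interactions))]
        # The negative hotel is drawn from the *same city* as the positive
        # one, so the model focuses on within-city differences.
        candidates = [h for h in hotels_by_city[hotel_city[pos]] if h != pos]
        if not candidates:
            continue  # city with a single hotel: nothing to contrast against
        neg = candidates[rng.integers(len(candidates))]
        triples.append((user, pos, neg))
    return triples

def bpr_loss(user_vec, pos_vec, neg_vec):
    # BPR maximizes the probability that the user ranks the positive hotel
    # above the same-city negative hotel.
    x = user_vec @ pos_vec - user_vec @ neg_vec
    return -np.log(1.0 / (1.0 + np.exp(-x)))
```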

Modelling User’s Long-term Preference from Airbnb (2018)

To model a user’s long-term preference, we generally need

  1. More data about this user
  2. Longer historical data about this user

However, due to business context 4, this is hard in the hospitality sector. In the same work that introduced hotel embeddings at Airbnb, a method based on user_type embeddings and item_type embeddings is introduced. In this work, user_type and item_type are created based on rules, so there is more and longer data for each user_type and item_type.

Definition of user_type and item_type

Booking data from a long history are used to learn these embeddings. The key is to map the user_type embeddings and item_type embeddings to the same vector space. A novel method is introduced to get these embeddings:

  1. Sequence data are generated from bookings made by the same user_id
    1. Based on the definition of user_type and item_type, it’s clear that even though the data come from the same user_id, the user_type and item_type actually change along the sequence
  2. For a specific sequence, based on the time when the bookings are made, a sequence of (user_type, item_type) tuples can be created → $((u_{type_1}, l_{type_1}), (u_{type_2}, l_{type_2}), \dots, (u_{type_M}, l_{type_M}))$
  3. Then here is the novel part: user_type and item_type are treated equally in the sequence (a small sketch follows this list)
    1. So the sequence becomes → $u_{type_1}, l_{type_1}, u_{type_2}, l_{type_2}, \dots, u_{type_M}, l_{type_M}$
  4. Then the embeddings are trained with Item2Vec. In this way, the embeddings of user_type and item_type are mapped to the same vector space
  5. In addition, booking data reflect not only the guest’s preference but also the host’s preference, since a host can reject a guest (business context 6). So in order to reduce future rejections in addition to maximizing booking chances, host rejections are treated as negative samples during training
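A small sketch of how such a training sequence could be assembled; the helper functions `get_user_type` and `get_listing_type`, which stand for the rule-based bucketing, are hypothetical.

```python
def build_type_sequence(bookings, get_user_type, get_listing_type):
    """bookings: one user's bookings ordered by booking time; each entry
    holds the user's attributes *at booking time* and the listing's attributes.
    get_user_type / get_listing_type: rule-based bucketing functions.
    Returns a flat token sequence where user_type and listing_type are
    treated as words of the same vocabulary."""
    sequence = []
    for booking in bookings:
        # The user_type can change over time (e.g. after more bookings),
        # so it is recomputed for every booking in the sequence.
        u_type = get_user_type(booking["user_attributes"])
        l_type = get_listing_type(booking["listing_attributes"])
        sequence.extend([u_type, l_type])
    return sequence

# Feeding such sequences to a standard Item2Vec / skip-gram trainer maps
# user_type and listing_type embeddings into the same vector space.
```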


Travel Concepts Embedding from Expedia (2024)

In the original blog, the problem is defined as:

Usually, the traveler has specific needs or desires within a certain context, such as hotels with family-friendly beaches for summer vacation. While associating the entity beaches with certain hotels can be straightforward using geographical information, it is not evident how to associate the concept family-friendly beaches with a hotel in a beach destination.

So what they want to do is retrieve hotels based on travel concepts such as luxury, family-friendly or mountain, which is quite similar to the embedding-based candidate generation in part 2 of this series.

The solution, as shown below, is to map hotels and travel concepts to the same vector space. Then, how do we do that? Remember the two-tower neural network for embedding-based candidate generation? Yes, that’s the answer.

[Images: mapping hotels and travel concepts to the same vector space]

More details:

  1. The training data is mainly created from hotel reviews; the travel concepts are extracted using an in-house tagging system
  2. The hotel embedding is actually initialized from the Hotel2Vec work
  3. How to define positive samples? If a travel concept is mentioned positively in a review of a hotel, we get a positive sample
  4. How to define negative samples? When a travel concept is mentioned negatively, or not mentioned at all, in the reviews of a hotel, we get a negative sample. Of course, random sampling is needed
  5. What’s the loss function? It’s a pairwise margin loss. For a hotel $h$, a positive travel concept $c^{+}$ and a negative travel concept $c^{-}$, the loss function is
$L = \max\{0,\; m - f(h, c^{+}) + f(h, c^{-})\}$

where $m$ is the margin.
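A minimal sketch of that loss in PyTorch, where $f$ is taken to be the dot product between the hotel tower output and the concept tower output; this choice of $f$ is an assumption for illustration.

```python
import torch

def concept_margin_loss(hotel_vec, pos_concept_vec, neg_concept_vec, margin=0.5):
    """Pairwise margin loss: a hotel should score higher with a concept
    mentioned positively in its reviews than with a negative concept."""
    f_pos = (hotel_vec * pos_concept_vec).sum(dim=-1)   # f(h, c+)
    f_neg = (hotel_vec * neg_concept_vec).sum(dim=-1)   # f(h, c-)
    # L = max(0, m - f(h, c+) + f(h, c-))
    return torch.clamp(margin - f_pos + f_neg, min=0.0).mean()
```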

Query Embedding from Airbnb (2017)

Use Case

The use case is auto-completion in the search box, where users search for their desired places. However, not all users have a clear idea about where they are going, so they might search, for example, for France skiing. In this case, the best auto-completion options would be some famous skiing locations in France, such as Chamonix and Morzine.

Solution

The solution should retrieve geo-locations based on users’ non-geo queries. It sounds pretty similar to the travel concept embedding from Expedia. Then the solution is clear: map the geo-queries (namely locations) and non-geo-queries to the same vector space, so that relevant pairs are closer in that space.

Then how to solve it? With a similar idea to Word2Vec:

For a specific user, get his/her search queries ordered in time, which include both geo-queries and non-geo-queries.
  • If a geo-query and a non-geo-query appear in the same sequence, they form a positive pair
  • For negative samples, the idea is similar: randomly sample geo-query and non-geo-query pairs (sketched below)
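A small sketch of the pair construction, assuming per-user query sequences and a hypothetical `is_geo_query` classifier that separates location queries from non-geo queries:

```python
import random

def build_query_pairs(user_query_sequences, is_geo_query, n_negatives, rng=random):
    """user_query_sequences: list of per-user query lists, ordered in time.
    is_geo_query(query) -> bool: hypothetical classifier for location queries.
    Positive pairs: a geo-query and a non-geo-query issued by the same user.
    Negative pairs: random geo / non-geo combinations across users."""
    positives, all_geo, all_non_geo = [], [], []
    for queries in user_query_sequences:
        geo = [q for q in queries if is_geo_query(q)]
        non_geo = [q for q in queries if not is_geo_query(q)]
        all_geo.extend(geo)
        all_non_geo.extend(non_geo)
        # Every (geo, non-geo) combination within one user's history
        # becomes a positive pair.
        positives.extend((g, n) for g in geo for n in non_geo)

    negatives = [(rng.choice(all_geo), rng.choice(all_non_geo))
                 for _ in range(n_negatives)]
    return positives, negatives
```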

Result

The result is quite interesting. Below are examples comparing the auto-completion results before and after this change.

[Images: auto-completion results when searching for “France Skiing” and for “Greek Islands”]

Reference