- Introduction
- Several Business Contexts
- 1. Unlike the words in NLP, there are extra meta-data for hotels
- 2. Cold-start problem for hotels
- 3. Users usually search for properties from the same market
- 4. Less user historical data compared to e-commerce
- 5. Changing user preference which is hard to capture
- 6. Platform-specific contexts
- Hotel Embedding Based on Item2Vec
- Listing Embedding from Airbnb (2018)
- Hotel2Vec from Expedia (2020)
- Hotel Embedding Space Alignment from Expedia (2022)
- Embedding for Everything: User, Search Query, Travel Concept, User_type …
- Collaborative Filtering Embedding from Agoda (2018)
- Modelling User’s Long-term Preference from Airbnb (2018)
- Travel Concepts Embedding from Expedia (2024)
- Query Embedding from Airbnb (2017)
- Reference
Introduction
As shown in the first blog of this series, embedding techniques such as Item2Vec are borrowed from natural language processing. To apply these techniques in real-world scenarios, adapting them to the business context is crucial.
In this blog, we'll mainly take the hospitality sector as an example to show these innovative adaptations in practice.
Several Business Contexts
First, let's go over a few business contexts of the hospitality sector, since most of the changes to the embedding techniques are motivated by them.
1. Unlike the words in NLP, there are extra meta-data for hotels
For example:
- Hotel attributes: property type, star rating, average user rating …
- Amenity information: free breakfast, free Wi-Fi …
- Geographic information
So why not also encode this information into the embedding?
2. Cold-start problem for hotels
First, there are always new hotels. Second, as mentioned in part 1, user click sequences are usually used to get item embeddings. However, there are always hotels that are never clicked by users, or clicked only a few times.
So how do we get a good embedding for these cold-start hotels?
3. Users usually search for properties from the same market
When a user is searching for properties in London, properties from Amsterdam obviously shouldn't be shown.
So how does this fact influence the Item2Vec technique, and how can we make use of it in Item2Vec?
4. Less user historical data compared to e-commerce
Due to the nature of the travel business, where users travel 1-2 times per year on average, bookings are a sparse signal, with a long tail of users with a single booking. This means there is little click or booking data for any specific user.
With so little data, how can we learn good enough embeddings?
5. Changing user preference which is hard to capture
Users' preferences change, which is also related to the previous point.
- In the current search session, the user might be searching for a hotel for business travel or family travel, but all the data you have is this user's last booking, which happened 6 months ago. How do you know this user's short-term preference?
- On the other hand, users also have long-term preferences; for example, some users prefer relatively cheap properties and will always search for this kind of property. However, if there is only one booking record from this user so far, how do you detect this long-term preference?
6. Platform-specific contexts
For example,
- Expedia: within Expedia Group, there are several brands, such as Hotels.com and Expedia, which focus on different user profiles. If we can use data from all of these brands, we might get better embeddings. So how can we learn a common embedding that, at the same time, does not degrade performance for any of the brands?
- Airbnb:
  - A hotel has many identical rooms of the same room type, so a hotel can be booked several times for a specific date. At Airbnb, however, there are no identical listings, and every listing can only be booked once for a given date
  - Unlike on a normal hotel platform, a guest can be rejected by a host, which is actually an important signal of host preference. How do we include this signal in the embedding?
Hotel Embedding Based on Item2Vec
Item2Vec is one of the most widely adopted techniques in industry, and based on it, we can get hotel embeddings.
Here is a quick recap about Item2Vec:
- Optimization goal: learn item similarity; the embedding vectors of similar items should be close in the vector space
- Data needed: sequence data, i.e. user click/booking sequences
- Method used: skip-gram with negative sampling
In this section, we'll list the main changes made to Item2Vec in each work. Feel free to re-visit the details of Item2Vec here.
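For reference, here is a minimal NumPy sketch of a single skip-gram-with-negative-sampling update over a toy click session. The hotel counts, embedding dimension, window size and learning rate are made-up illustration values, not taken from any of the works discussed below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all numbers are assumptions): 1,000 hotels, 32-dim embeddings.
n_hotels, dim = 1_000, 32
in_emb = rng.normal(scale=0.1, size=(n_hotels, dim))   # "center" (input) embeddings
out_emb = rng.normal(scale=0.1, size=(n_hotels, dim))  # "context" (output) embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(center, context, negatives, lr=0.025):
    """One skip-gram-with-negative-sampling update for a (center, context) pair."""
    pos_score = sigmoid(in_emb[center] @ out_emb[context])
    neg_scores = sigmoid(out_emb[negatives] @ in_emb[center])

    # Gradients of the loss  -log sigmoid(u.v_pos) - sum(log sigmoid(-u.v_neg))
    grad_center = (pos_score - 1.0) * out_emb[context] + neg_scores @ out_emb[negatives]
    out_emb[context] -= lr * (pos_score - 1.0) * in_emb[center]
    out_emb[negatives] -= lr * neg_scores[:, None] * in_emb[center]
    in_emb[center] -= lr * grad_center

# A toy click session: neighbours within a window of 2 are positives,
# randomly sampled hotels are negatives.
session = [12, 40, 7, 300, 12]
for i, center in enumerate(session):
    for context in session[max(0, i - 2):i] + session[i + 1:i + 3]:
        sgns_step(center, context, negatives=rng.integers(0, n_hotels, size=5))
```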
Listing Embedding from Airbnb (2018)
Changes to Item2Vec
- Due to business context 3, most of the time we don't need the hotel embedding to tell us that a hotel in Amsterdam is different from a hotel in London. So when doing negative sampling, there are two types of negative samples:
  - Soft negatives: randomly sampled from hotels all over the world
  - Hard negatives: randomly sampled from hotels in the same market
- Due to business context 4, it's important to use the available data more efficiently. For example, a booking is a stronger signal than a click or an impression, so it's better to treat them differently and assign more importance to the booked hotel
  - So instead of being a local context word from the same context window, as in Word2Vec, the booked hotel is treated as a global context word
- Based on business contexts 1 and 2, by combining hotel meta-data and embeddings, we can get an embedding for a cold-start hotel. In this case, geographic location, listing type and price bucket are used. When there is a new listing (see the sketch below):
  - Find the three geographically closest listings that already have embeddings, are of the same listing type, and belong to the same price bucket as the new listing
  - Calculate the mean vector of the embeddings of these 3 listings; this becomes the embedding of the new listing
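A minimal sketch of that cold-start rule, assuming a simple in-memory list of listings; the field names (lat, lon, listing_type, price_bucket) are hypothetical and only stand in for whatever metadata store is actually used.

```python
import numpy as np

# Hypothetical listing metadata with already-learned embeddings.
listings = [
    {"lat": 52.37, "lon": 4.89, "listing_type": "apt", "price_bucket": 2, "embedding": np.array([0.1, 0.3])},
    {"lat": 52.36, "lon": 4.90, "listing_type": "apt", "price_bucket": 2, "embedding": np.array([0.2, 0.2])},
    {"lat": 52.35, "lon": 4.88, "listing_type": "apt", "price_bucket": 2, "embedding": np.array([0.0, 0.4])},
    {"lat": 48.85, "lon": 2.35, "listing_type": "apt", "price_bucket": 2, "embedding": np.array([0.9, 0.9])},
]

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = np.sin((lat2 - lat1) / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * np.arcsin(np.sqrt(a))

def cold_start_embedding(new_listing, listings, k=3):
    """Mean embedding of the k geographically closest listings that share the
    new listing's type and price bucket (the cold-start heuristic described above)."""
    candidates = [l for l in listings
                  if l["listing_type"] == new_listing["listing_type"]
                  and l["price_bucket"] == new_listing["price_bucket"]]
    candidates.sort(key=lambda l: haversine_km(new_listing["lat"], new_listing["lon"], l["lat"], l["lon"]))
    return np.mean([l["embedding"] for l in candidates[:k]], axis=0)

# New Amsterdam listing without any click history.
new_listing = {"lat": 52.37, "lon": 4.90, "listing_type": "apt", "price_bucket": 2}
print(cold_start_embedding(new_listing, listings))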
Performance
Here are two examples of similar listings
In the original paper, it’s mentioned that
While some listing characteristics, such as price, do not need to be learned because they can be extracted from listing meta-data, other types of listing characteristics, such as architecture, style, and feel, are much harder to extract in the form of listing features.
Looking at the examples above, we can see that although the model never sees any visual information, it still manages to learn these characteristics.
Application → Detect User’s Short-term Preference in Real Time
While a user is searching, take the hotels he/she has just clicked or viewed, find similar hotels using the embeddings, and recommend those hotels to the user (a minimal sketch is shown below).
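Here is a toy version of that nearest-neighbour lookup, assuming cosine similarity is used (the original post does not specify the exact similarity measure); the session data and embedding matrix are random placeholders.

```python
import numpy as np

def most_similar(query_vec, hotel_emb, top_k=5):
    """Indices of hotels whose embeddings have the highest cosine similarity to query_vec."""
    emb = hotel_emb / np.linalg.norm(hotel_emb, axis=1, keepdims=True)
    scores = emb @ (query_vec / np.linalg.norm(query_vec))
    return np.argsort(-scores)[:top_k]

# Toy usage: average the embeddings of hotels clicked in the current session,
# then recommend the nearest neighbours of that session vector.
hotel_emb = np.random.default_rng(0).normal(size=(1_000, 32))
clicked = [12, 40, 7]
print(most_similar(hotel_emb[clicked].mean(axis=0), hotel_emb))
```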
Hotel2Vec from Expedia (2020)
Changes to Item2Vec
Expedia adopts a similar methodology to Airbnb's, but introduces a novel way to leverage hotel meta-data.
- Based on business context 1, the work from Expedia tries to include hotel metadata in the hotel embedding
  - Hotel meta-data here mainly refers to amenity information and geographic information
- The embedding extracted from the user click sequence is referred to as the click embedding. By combining the click embedding, amenity embedding and geo embedding, an enriched embedding is introduced (see the sketch below)
- For the problem of cold-start hotels (business context 2), this unified hotel embedding also helps a lot. Here is how it works:
  - First, following the same steps as Airbnb, get the click embedding based on three existing hotels
  - Then feed the click embedding, amenity info and geo info to the model, and we get the enriched embedding
- In terms of business context 3, a similar method to Airbnb's is adopted to select negative samples from the same market

Unlike the plain click embedding from the work of Airbnb, the enriched embedding is created by combining the click embedding, amenity embedding and geo embedding.
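A minimal sketch of that combination step, assuming a single learned projection over the concatenated embeddings; the dimensions, activation and initialization are illustrative assumptions, not the actual Hotel2Vec architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions for the three input embeddings and the enriched output.
click_dim, amenity_dim, geo_dim, enriched_dim = 32, 16, 8, 32
W = rng.normal(scale=0.1, size=(click_dim + amenity_dim + geo_dim, enriched_dim))
b = np.zeros(enriched_dim)

def enriched_embedding(click_emb, amenity_vec, geo_vec):
    """Project the concatenated click, amenity and geo embeddings into one enriched embedding."""
    return np.tanh(np.concatenate([click_emb, amenity_vec, geo_vec]) @ W + b)

# Cold-start usage: approximate the click embedding from 3 similar hotels
# (as in the Airbnb heuristic), then enrich it with the hotel's own metadata.
click_emb = rng.normal(size=(3, click_dim)).mean(axis=0)
amenity_vec = np.zeros(amenity_dim); amenity_vec[[0, 2]] = 1.0  # e.g. multi-hot amenity flags
geo_vec = rng.normal(size=geo_dim)
print(enriched_embedding(click_emb, amenity_vec, geo_vec).shape)  # (32,)
```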
Performance & Application
The enriched embedding has been shown to boost performance on the next-item prediction task, especially for cold-start hotels.
Hotel Embedding Space Alignment from Expedia (2022)
Changes to Item2Vec
In the Expedia part of business context 6, a multi-brand scenario was introduced. Since these brands have different traveller profiles and local contexts, the embedding vectors learnt separately from each brand's user click sequences might not lie in the same vector space. This means that if we apply hotel embeddings learnt from Expedia data to Hotels.com, we will see degraded performance. So the goal is to learn a common embedding that can be leveraged for recommendation tasks across multiple brands.
The solution is domain alignment: a regularizer is added to the objective function while training on the target domain. This regularizer forces the embeddings of the same hotel from the source domain and the target domain to be as close as possible, hence performing domain adaptation.
There is a source domain (Hotels.com) and a target domain (Expedia). For a hotel h, the embeddings learnt from these domains are represented as v_h^S and v_h^T. Here are the steps:
- Learn v_h^S based on the Hotel2Vec method introduced in the previous section.
- Learn v_h^T. If L_item2vec represents the standard loss function of Item2Vec (please re-visit part 1 for a recap), then the loss function for the target domain becomes
  L_target = L_item2vec + λ Σ_{h ∈ common hotels} || v_h^T − v_h^S ||²
- where λ defines how much knowledge we want to transfer from the source domain to the target domain
- Several details:
  - While training the embeddings for the target domain, the embeddings from the source domain are not updated (see the sketch below)
  - The regularization weight is defined globally instead of per hotel, which is mentioned as possible future work
  - To avoid noise, the alignment is only learnt on hotels common to both domains
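A minimal sketch of how that regularizer could be added on top of the standard Item2Vec loss; the function signature and variable names are hypothetical.

```python
import numpy as np

def target_domain_loss(base_loss, target_emb, source_emb, common_ids, lam=0.1):
    """Standard Item2Vec loss on the target domain plus the domain-alignment regularizer.

    base_loss  : SGNS loss computed on target-domain click sequences
    target_emb : trainable hotel embeddings for the target brand (e.g. Expedia)
    source_emb : frozen hotel embeddings from the source brand (e.g. Hotels.com)
    common_ids : ids of hotels that exist in both brands
    lam        : how much knowledge is transferred from source to target
    """
    diff = target_emb[common_ids] - source_emb[common_ids]  # source side stays fixed
    return base_loss + lam * np.sum(diff ** 2)
```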
Performance
Checking performance on the task of next-item recommendation, the conclusions are:
- The proposed method can align the domains while achieving better performance for both brands
- In addition, the embeddings can be learnt quickly
Embedding for Everything: User, Search Query, Travel Concept, User_type …
Collaborative Filtering Embedding from Agoda (2018)
In part one of this series, collaborative filtering embeddings were introduced, and they also appeared as one of the methods for embedding-based candidate generation in part two. As mentioned previously, Matrix Factorization (MF) is one of these methods, and with it we can get user embeddings and item embeddings at the same time.
In one of Agoda's works, MF was used to learn hotel embeddings and user embeddings. In the first version, there weren't many changes compared to standard MF: the numerical method of Alternating Least Squares (ALS) was applied to get the user and hotel embeddings.
Then business context 3 is mentioned:
We don’t really need a data science model to tell us that hotels in Bangkok are probably different from hotels in Helsinki.
Then the 2nd iteration begins. The assumption is that by removing the obvious city information from the model, the model can focus on the most important information, and we can get better embeddings.
However, in ALS a user-item matrix is decomposed into a user matrix and an item matrix, which means the embeddings are learnt globally, and it's hard to remove the city information with ALS.
So the pairwise methodology of Bayesian Personalized Ranking (BPR) is introduced. For training, tuples of (user, positive hotel, negative hotel) are needed, and while sampling negatives, rules are set to make sure the negative samples come from the same city as the positive one. In this case, the model only needs to learn the similarity and difference between hotels from the same city. This is actually very similar to what Airbnb and Expedia did when sampling negatives from the same market (a minimal sketch follows below).
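A toy NumPy sketch of a BPR update with same-city negative sampling; the matrix sizes, city assignment, learning rate and regularization are illustrative assumptions, not Agoda's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_hotels, dim = 500, 1_000, 16
user_emb = rng.normal(scale=0.1, size=(n_users, dim))
hotel_emb = rng.normal(scale=0.1, size=(n_hotels, dim))
hotel_city = rng.integers(0, 50, size=n_hotels)  # hypothetical city id per hotel

def sample_negative_same_city(pos_hotel):
    """Pick a negative hotel from the same city as the booked (positive) hotel."""
    candidates = np.flatnonzero(hotel_city == hotel_city[pos_hotel])
    return rng.choice(candidates[candidates != pos_hotel])

def bpr_step(user, pos, neg, lr=0.05, reg=0.01):
    """One SGD update on the BPR loss -log sigmoid(u.(h_pos - h_neg)) with L2 regularization."""
    u, hp, hn = user_emb[user].copy(), hotel_emb[pos].copy(), hotel_emb[neg].copy()
    sig = 1.0 / (1.0 + np.exp(u @ (hp - hn)))  # sigmoid(-x_uij), the gradient scale
    user_emb[user] += lr * (sig * (hp - hn) - reg * u)
    hotel_emb[pos] += lr * (sig * u - reg * hp)
    hotel_emb[neg] += lr * (-sig * u - reg * hn)

# Toy usage on one observed booking: user 3 booked hotel 42.
bpr_step(user=3, pos=42, neg=sample_negative_same_city(42))
```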
Modelling User’s Long-term Preference from Airbnb (2018)
To model a user's long-term preference, generally we'll need
- More data about this user
- Longer historical data about this user
However, due to business context 4, this is a little hard in the hospitality sector. In the same work that introduced listing embeddings at Airbnb, a method based on user_type embeddings and item_type embeddings is introduced. In this work, user_type and item_type are created based on rules, so there is more (and longer) data per user_type and item_type.
user_type and item_type
Booking data over a long history are used to learn these embeddings. The key is to map the user_type embeddings and item_type embeddings into the same vector space. A novel method is introduced to get these embeddings:
- Sequence data are generated based on bookings from the same user_id
- Based on the definitions of user_type and item_type, it's clear that even though the data come from the same user_id, the user_type and item_type actually change over the sequence
- For a specific sequence, based on the time when the bookings were made, a sequence of (user_type, item_type) tuples can be created → ((user_type_1, item_type_1), (user_type_2, item_type_2), …)
- Then here is the novel part: user_type and item_type are treated equally in the sequence
- So the sequence becomes → (user_type_1, item_type_1, user_type_2, item_type_2, …)
- Then train the embeddings with Item2Vec. In this way, the embeddings of user_type and item_type are mapped into the same vector space (see the sketch below)
- In addition, booking data reflect not only the guest's preference, but also the host's preference, since a host can reject a guest (business context 6). So in order to reduce future rejections in addition to maximizing booking chances, host rejections are treated as negative samples during training
Travel Concepts Embedding from Expedia (2024)
In the original blog, the problem is defined as:
Usually, the traveler has specific needs or desires within a certain context, such as hotels with family-friendly beaches for summer vacation. While associating the entity beaches with certain hotels can be straightforward using geographical information, it is not evident how to associate the concept family-friendly beaches with a hotel in a beach destination.
So what they want to do is retrieve hotels based on travel concepts such as luxury, family-friendly or mountain, which is quite similar to the embedding-based candidate generation in part 2 of this series.
The solution, as shown below, is to map hotels and travel concepts into the same vector space. Then how do we do that? Remember the two-tower neural network for embedding-based candidate generation? Yes, that's the answer.
More details:
- The training data is mainly created from hotel reviews; the travel concepts are extracted using an in-house tagging system
- The hotel embeddings are actually initialized from the Hotel2Vec work
- How are positive samples defined? If a travel concept is mentioned positively in a review of a hotel, we get a positive (hotel, concept) pair
- How are negative samples defined? When a travel concept is mentioned negatively, or not mentioned at all, in the reviews of a hotel, we get a negative pair. Of course, random sampling is needed
- What's the loss function? It's a pairwise margin loss. For a hotel h, a negative travel concept c⁻ and a positive travel concept c⁺, the loss function is
  loss = max(0, m − sim(h, c⁺) + sim(h, c⁻))
  where m is the margin and sim(·, ·) is the similarity between the two towers' outputs (a minimal sketch follows below)
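A minimal sketch of that margin loss, assuming cosine similarity between the hotel-tower and concept-tower outputs (the exact similarity function and margin value are assumptions):

```python
import numpy as np

def pairwise_margin_loss(hotel_vec, pos_concept_vec, neg_concept_vec, margin=0.2):
    """max(0, margin - sim(h, c_pos) + sim(h, c_neg)) with cosine similarity."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(0.0, margin - cos(hotel_vec, pos_concept_vec) + cos(hotel_vec, neg_concept_vec))

# Toy usage with random vectors standing in for the two towers' outputs.
rng = np.random.default_rng(0)
h, c_pos, c_neg = rng.normal(size=(3, 32))
print(pairwise_margin_loss(h, c_pos, c_neg))
```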
Query Embedding from Airbnb (2017)
Use Case
The use case is auto-completion in the search box, where users type the place they want to go. However, not all users have a clear idea of where they are going, so they might search for, say, France skiing. In this case, the best auto-completion options would be famous skiing locations in France, such as Chamonix and Morzine.
Solution
The solution should retrieve geo locations based on users' non-geo queries. It sounds pretty similar to the travel concept embedding from Expedia, so the solution is clear: map the geo-queries (namely locations) and non-geo-queries into the same vector space, so that relevant pairs are close in that space.
Then how do we solve it? With a similar idea to Word2Vec (see the sketch below):
- If a geo-query and a non-geo-query appear in the same search sequence, they form a positive pair
- For negative samples, a similar idea applies: randomly sample some geo-query and non-geo-query pairs
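A minimal sketch of turning search sessions into positive and negative query pairs; the session data and the is_geo check are hypothetical placeholders for the actual query logs and location resolver.

```python
import random

# Hypothetical user search sessions (each list is one user's consecutive queries).
sessions = [
    ["france skiing", "chamonix", "morzine"],
    ["greek islands", "santorini"],
]

def is_geo(query):
    """Placeholder: in practice the platform knows which queries resolve to a location."""
    return query in {"chamonix", "morzine", "santorini"}

all_geo = [q for s in sessions for q in s if is_geo(q)]
positive_pairs, negative_pairs = [], []

for session in sessions:
    geo = [q for q in session if is_geo(q)]
    non_geo = [q for q in session if not is_geo(q)]
    # Geo and non-geo queries issued in the same session form positive pairs.
    positive_pairs += [(ng, g) for ng in non_geo for g in geo]
    # Randomly sampled geo queries serve as negatives (standard negative sampling).
    negative_pairs += [(ng, random.choice(all_geo)) for ng in non_geo]

print(positive_pairs)
# [('france skiing', 'chamonix'), ('france skiing', 'morzine'), ('greek islands', 'santorini')]
```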
Result
The result is quite interesting. Below are examples comparing the auto-completion results before and after this change:
France Skiing
Greek Islands