Query Auto Completion (QAC) - Metrics & Data

Tags: Data Science
Date Published: February 8, 2024

QAC Example & Term Definition

image

Term definitions:

  1. Prefix: the partial query the user has typed so far
  2. Block (or column):
    1. The suggested result list shown for a prefix
    2. There is one block for every prefix, so a single conversation contains several blocks
  3. Conversation:
    1. Begins when a user starts typing a prefix and ends when the user either chooses one of the suggested results or abandons the search
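To make the terms concrete, here is a minimal sketch of how they could be represented in code. The class names, field names, and the sample queries are illustrative assumptions, not part of any existing QAC logging schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Block:
    """One block (column): the suggestions shown for a single prefix."""
    prefix: str              # partial query typed so far
    suggestions: List[str]   # ranked suggestion list, top position first


@dataclass
class Conversation:
    """All blocks from the first keystroke until selection or abandonment."""
    blocks: List[Block] = field(default_factory=list)
    selected: Optional[str] = None   # None means the user abandoned

    @property
    def abandoned(self) -> bool:
        return self.selected is None


# Hypothetical conversation: three prefixes, three blocks, one selection.
conv = Conversation(
    blocks=[
        Block("n", ["netflix", "news", "nba"]),
        Block("ne", ["netflix", "news", "new york times"]),
        Block("new", ["news", "new york times", "new balance"]),
    ],
    selected="new york times",
)
```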

Two Important User Behaviors

Skip Behavior

In a conversation, users frequently skip a suggested column even though it already contains the query they eventually select.

The reasons relate to the device and to the user's typing skill. For example, fast typists tend to keep typing additional characters without examining the completions.
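One way to quantify skipping, sketched under the assumption that each conversation is stored as an ordered list of (prefix, suggestions) pairs plus the finally selected query: count the earlier blocks that already contained the final query. The function name and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed representation of a block: (prefix, ranked suggestions).
Block = Tuple[str, List[str]]


def count_skips(blocks: List[Block], final_query: Optional[str]) -> int:
    """Count blocks before the last one that already showed the final query."""
    if final_query is None:
        return 0  # abandoned conversation: nothing was selected, so nothing was skipped
    return sum(final_query in suggestions for _, suggestions in blocks[:-1])


blocks = [
    ("n", ["netflix", "news", "nba"]),
    ("ne", ["netflix", "news", "new york times"]),
    ("new", ["news", "new york times", "new balance"]),
]
skips = count_skips(blocks, "new york times")
```

In this sample the final query was already offered at prefix "ne" but the user typed one more character, so one block counts as skipped.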

Example of skipping behavior

Frequency of skipping behavior (a study from Yahoo QAC log data)

Observation Bias (position bias)

As in other ranking problems, most clicks (or interactions) are concentrated on the top positions.

Because suggestions are laid out vertically, users see the top position first. The strength of the effect also depends on the UI (or device).

image
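A simple way to check for position bias in your own logs is to look at how clicks are distributed over positions. The sketch below assumes you already have a list of 1-based clicked positions; the function name and the sample numbers are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def click_share_by_position(clicked_positions: Iterable[int]) -> Dict[int, float]:
    """Return the fraction of all clicks that landed on each position."""
    counts = Counter(clicked_positions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pos: counts[pos] / total for pos in sorted(counts)}


# Hypothetical clicked positions from eight conversations.
share = click_share_by_position([1, 1, 1, 2, 1, 3, 2, 1])
```

A distribution that is heavily skewed toward position 1 is consistent with position bias, although it may also simply mean the ranker is good. The two effects cannot be separated from click counts alone.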

Evaluation Metrics

Learning from User Behavior

Based on the observations above, a conversation actually contains several funnels between the point where the user begins to type and the point where the user selects one of the suggested queries.

For a suggested item at position n in the i-th column, the funnels are:

  1. Funnel 1: Whether the user decides to stop typing and check the i-th column
  2. Funnel 2: Whether the user checks the n-th position
  3. Funnel 3: Whether the suggested item is relevant to what the user wants

What we really want to evaluate is the performance of the third funnel. However, the data we collect mostly reflects the combined influence of all three funnels. Keep this in mind when evaluating an auto completion system.
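The combined influence can be written as a product of probabilities. This is a sketch under the simplifying assumption that a click happens only when all three funnel stages succeed in sequence. The symbols S (stop and check the column), E (examine the position), and R (item is relevant) are introduced here for illustration.

```latex
P(\text{click on position } n \text{ in column } i)
  = \underbrace{P(S_i = 1)}_{\text{Funnel 1}}
    \cdot \underbrace{P(E_{i,n} = 1 \mid S_i = 1)}_{\text{Funnel 2}}
    \cdot \underbrace{P(R_{i,n} = 1 \mid E_{i,n} = 1,\, S_i = 1)}_{\text{Funnel 3}}
```

Under this reading, a low click rate on an item does not by itself mean the item is irrelevant: the first two factors may be small because the column was skipped or the position was never examined.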

Metrics

| Basic Idea | Metrics | Data Needed |
| --- | --- | --- |
| How many users started typing a query but never actually selected a result? | Drop-off rate (abandonment rate) | 1. Users' interactions with the last column of each conversation |
| Whenever a result is selected, what was its position in the suggestion list? (Was the selected result in a high position?) | 1. Average selected position 2. Success rate of the top K positions: success rate@K 3. Mean Reciprocal Rank@K (MRR@K) 4. Mean Average Precision@K (MAP@K) 5. … | 1. Only conversations that ended with a click (or selection) are needed 2. Only users' interactions with the last column of each conversation are needed |
| How many characters did the user have to type before being able to click on the result they were looking for? | 1. Minimal keystrokes 2. Effort saved | 1. Only conversations that ended with a click (or selection) are needed 2. Only users' interactions with the last column of each conversation are needed |
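The metrics above can be computed from last-column data alone. The following is a minimal sketch, assuming each conversation is reduced to the 1-based position of the selected suggestion, or None if the user abandoned. The function names are hypothetical, and the definition of effort saved used here (fraction of the selected query's characters the user did not have to type) is one plausible reading rather than a definition given in this document.

```python
from typing import List, Optional


def abandonment_rate(positions: List[Optional[int]]) -> float:
    """Share of conversations that ended without a selection (None)."""
    if not positions:
        return 0.0
    return sum(p is None for p in positions) / len(positions)


def success_rate_at_k(positions: List[Optional[int]], k: int) -> float:
    """Among conversations with a selection, share selected within the top k."""
    clicked = [p for p in positions if p is not None]
    if not clicked:
        return 0.0
    return sum(p <= k for p in clicked) / len(clicked)


def mrr_at_k(positions: List[Optional[int]], k: int) -> float:
    """Mean reciprocal rank; selections below position k contribute 0."""
    clicked = [p for p in positions if p is not None]
    if not clicked:
        return 0.0
    return sum(1.0 / p if p <= k else 0.0 for p in clicked) / len(clicked)


def effort_saved(typed_prefix: str, selected_query: str) -> float:
    """Assumed definition: fraction of characters the user did not type."""
    if not selected_query:
        return 0.0
    return 1.0 - len(typed_prefix) / len(selected_query)


# Hypothetical last-column outcomes for eight conversations.
positions = [1, 3, None, 2, 1, None, 5, 1]
rate = abandonment_rate(positions)
success3 = success_rate_at_k(positions, 3)
mrr3 = mrr_at_k(positions, 3)
saved = effort_saved("new", "new york times")
```

Note that the ranking metrics are computed only over conversations that ended with a selection, matching the "Data Needed" column, while the abandonment rate uses all conversations.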

Data

Tracking

Based on the metrics above, users' interactions with the last column of each conversation matter most. If resources are limited, we can track only this data. If resources allow, it is also helpful to keep a high-resolution log that records every keystroke.
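If a keystroke-level log is available, the last-column view can be derived from it. The sketch below assumes a hypothetical event schema (conversation_id, timestamp, prefix, suggestions, clicked_position). These field names are illustrative and are not the schema of the Yahoo data.

```python
from typing import Dict, List, Optional, TypedDict


class KeystrokeEvent(TypedDict):
    conversation_id: str
    timestamp: float
    prefix: str                        # prefix after this keystroke
    suggestions: List[str]             # column shown for this prefix
    clicked_position: Optional[int]    # 1-based; None if nothing was clicked


def last_column_per_conversation(
    events: List[KeystrokeEvent],
) -> Dict[str, KeystrokeEvent]:
    """Keep only the final event (last column) of each conversation."""
    last: Dict[str, KeystrokeEvent] = {}
    for event in sorted(events, key=lambda e: e["timestamp"]):
        last[event["conversation_id"]] = event  # later events overwrite earlier ones
    return last


# Hypothetical log: c1 ends with a click, c2 is abandoned.
events: List[KeystrokeEvent] = [
    {"conversation_id": "c1", "timestamp": 1.0, "prefix": "n",
     "suggestions": ["netflix", "news"], "clicked_position": None},
    {"conversation_id": "c2", "timestamp": 1.5, "prefix": "a",
     "suggestions": ["amazon", "apple"], "clicked_position": None},
    {"conversation_id": "c1", "timestamp": 2.0, "prefix": "new",
     "suggestions": ["news", "new york times"], "clicked_position": 2},
]
last = last_column_per_conversation(events)
```

Storing the full log and deriving the last column on demand keeps both options open: the cheap metrics work today, and skip or position-bias analyses remain possible later.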

Here is an example of the processed data from Yahoo:

image

Open Datasets

There are several open datasets, from:

  1. Yahoo
  2. AOL
  3. Bing

Reference