Market Basket Analysis

Market Basket

Have you ever gone to a retail store and noticed that some things are always kept together? For example, bread is generally kept near milk, butter, and jam.

In online stores, when you view or purchase an item, you get recommendations for related items, presented as "People who bought this also bought...". For example, when you buy a mobile phone, you get suggestions for a screen guard.

These placements and suggestions reflect a relationship, or association, between items that shows up in customers' purchases.

Market basket analysis is all about finding these relationships in a store or online shop. It is a data mining technique that helps stores discover associations between products or services that customers frequently purchase together.

Market Basket Analysis in Various Industries

Here are the steps to conduct Market Basket Analysis:

Step 1: Collect Data

The store collects information about what people buy. This data is collected through sales transactions or customer loyalty programs. It includes details such as the items purchased, the date and time of the purchase, and the location of the store.
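
As a rough illustration of what the collected data can look like once it is prepared for analysis, the sketch below groups raw sales rows into baskets and then one-hot encodes them (one row per transaction, one True/False flag per item). The `sales_rows` data and all variable names are made up for this example and are not taken from any real store.

```python
from collections import defaultdict

# Hypothetical raw sales rows: (transaction_id, item). Real data would come
# from sales transactions or a customer loyalty program.
sales_rows = [
    (1, "bread"), (1, "milk"), (1, "butter"),
    (2, "bread"), (2, "jam"),
    (3, "milk"), (3, "butter"),
    (4, "bread"), (4, "milk"), (4, "jam"),
]

# Group the rows into one basket (a set of items) per transaction.
baskets = defaultdict(set)
for transaction_id, item in sales_rows:
    baskets[transaction_id].add(item)

# One-hot encode: one dict per transaction, one True/False flag per item.
all_items = sorted({item for _, item in sales_rows})
one_hot = [
    {item: (item in basket) for item in all_items}
    for _, basket in sorted(baskets.items())
]

for row in one_hot:
    print(row)
```

Wrapping `one_hot` in `pandas.DataFrame(one_hot)` should give a boolean table of the kind that association-mining libraries typically expect as input.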

Step 2: Find Patterns

This step uses software to analyze the data and identify the most commonly purchased items, as well as products that are often purchased together. Common algorithms for this are Apriori, FP-growth, and Eclat.
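
Before reaching for a dedicated algorithm, the basic idea of "products that are often purchased together" can be sketched by simply counting how many baskets contain each pair of items. This is only a brute-force illustration on made-up baskets, not Apriori, FP-growth, or Eclat themselves.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, one set of items per transaction.
baskets = [
    {"bread", "milk", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "milk", "jam"},
]

pair_counts = Counter()
for basket in baskets:
    # sorted() gives each pair a canonical order, so (bread, milk) and
    # (milk, bread) are counted as the same pair.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Show the most frequently co-purchased pairs.
for pair, count in pair_counts.most_common(3):
    print(pair, count)
```

Counting every pair this way grows quickly with the number of items, which is the motivation for algorithms that prune the search.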

Step 3: Make Decisions

The store uses the information gathered to make decisions about how to sell things. This involves deciding on product placements, recommending products for cross-selling/up-selling, offering product bundles, profiling customers, etc.
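
One way such decisions might be automated for cross-selling is sketched below: given a set of mined rules and a customer's cart, suggest the items that the matching rules point to. The rules, their confidence and lift values, and the `recommend` helper are all hypothetical and exist only for this illustration.

```python
# Hypothetical rules as (antecedent, consequent, confidence, lift) tuples.
# The numbers are invented for illustration, not mined from real data.
rules = [
    (frozenset({"bread"}), frozenset({"butter"}), 0.60, 1.4),
    (frozenset({"bread", "milk"}), frozenset({"jam"}), 0.45, 1.8),
    (frozenset({"mobile"}), frozenset({"screen guard"}), 0.70, 2.5),
]


def recommend(cart, rules, min_lift=1.0):
    """Suggest items not yet in the cart, ordered by rule confidence."""
    cart = set(cart)
    suggestions = {}
    for antecedent, consequent, confidence, lift in rules:
        # A rule applies when the cart contains its whole antecedent.
        if antecedent <= cart and lift > min_lift:
            for item in consequent - cart:
                suggestions[item] = max(suggestions.get(item, 0.0), confidence)
    return sorted(suggestions, key=suggestions.get, reverse=True)


print(recommend({"bread", "milk"}, rules))
print(recommend({"mobile"}, rules))
```

Ranking by confidence and filtering by lift is just one possible design; a store could equally rank by margin or stock levels.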

Apriori Algorithm

The Apriori algorithm is widely used for Market Basket Analysis. It is an association rule mining algorithm: it first finds the frequent itemsets and then generates association rules from them.

The frequent itemsets are the sets of items that frequently occur together in the transactions. 

The association rules are if-then statements of the form "customers who buy A also tend to buy B", derived from how often the items occur together.
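
To make the frequent-itemset search more concrete, here is a minimal pure-Python sketch of the level-wise idea behind Apriori: start with frequent single items, combine them into larger candidates, and discard any candidate that has an infrequent subset. The function name `apriori_sketch` and the toy baskets are assumptions for illustration; library implementations are far more optimized.

```python
from itertools import combinations


def apriori_sketch(baskets, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item in the itemset.
        return sum(itemset <= basket for basket in baskets) / n

    # Level 1: frequent single items.
    singles = {frozenset([item]) for basket in baskets for item in basket}
    current = {s: support(s) for s in singles}
    current = {s: sup for s, sup in current.items() if sup >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Keep only the candidates that meet the support threshold.
        current = {}
        for c in candidates:
            sup = support(c)
            if sup >= min_support:
                current[c] = sup
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical baskets for demonstration.
baskets = [
    {"bread", "milk", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "milk", "jam"},
]
frequent = apriori_sketch(baskets, min_support=0.5)
for itemset, sup in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), sup)
```

The prune step is what keeps the search tractable: an itemset can only be frequent if all of its subsets are frequent, so most large candidates never need to be counted.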


Here is sample Python code that performs Market Basket Analysis using the Apriori algorithm.

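import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Read the transaction data.
# NOTE: the loading step is missing from the original; "transactions.csv" is a
# placeholder name. apriori() expects a one-hot encoded DataFrame with one row
# per transaction and one boolean (or 0/1) column per item.
df = pd.read_csv("transactions.csv")

# Perform Market Basket Analysis using the Apriori algorithm
frequent_itemsets = apriori(df, min_support=0.1, use_colnames=True)

# Generate association rules
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)

# Print the association rules
print(rules)

# Display the rules in descending order of support
print(rules.sort_values('support', ascending=False))

# Display the rules in descending order of confidence above a certain support threshold
# (with min_support=0.1 above, every rule already has support >= 0.1,
# so raise this value to filter further)
print(rules[rules['support'] >= 0.03].sort_values('confidence', ascending=False))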

Evaluation Metrics 

The three most relevant evaluation metrics used in Market Basket Analysis are support, confidence, and lift. 
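
For a rule "A → C", these metrics are usually defined as follows: support is the fraction of transactions that contain both A and C; confidence is the fraction of transactions containing A that also contain C; and lift is the confidence divided by the fraction of transactions that contain C. The sketch below computes all three on made-up baskets. The helper name `rule_metrics` is hypothetical, and the function assumes both A and C occur in at least one basket.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Compute support, confidence and lift for antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(baskets)
    count_a = sum(antecedent <= b for b in baskets)
    count_c = sum(consequent <= b for b in baskets)
    count_both = sum((antecedent | consequent) <= b for b in baskets)

    support = count_both / n          # how often A and C appear together
    confidence = count_both / count_a  # how often C appears when A does
    lift = confidence / (count_c / n)  # confidence relative to C's base rate
    return support, confidence, lift


# Hypothetical baskets for demonstration.
baskets = [
    {"bread", "milk", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "milk", "jam"},
]
support, confidence, lift = rule_metrics(baskets, {"milk"}, {"butter"})
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

In this toy data, butter appears in half of all baskets but in two of the three baskets that contain milk, so the lift comes out above 1.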


Q. What is the minimum support value to consider an itemset as frequent? 

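A. The minimum support value depends on the dataset and the business problem. Values between 0.1 and 0.5 are often quoted as a starting point, but stores with many distinct products and sparse transactions frequently need much lower thresholds to find any itemsets at all.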

Q. How are the association rules interpreted in Market Basket Analysis?

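A. The association rules can be interpreted based on the evaluation metrics such as support, confidence, and lift. For example, a lift value greater than 1 indicates a positive association between the antecedent and consequent items, a lift of about 1 indicates that they are bought independently of each other, and a lift below 1 indicates that they tend not to be bought together.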

Q. Can Market Basket Analysis be used for customer segmentation? 

A. Yes, it can be used for customer segmentation by clustering the customers based on their purchasing behavior. 

Q. Can Market Basket Analysis be used for predicting future sales or customer behavior?

A. Market Basket Analysis is primarily a descriptive technique that analyzes past transaction data to identify patterns and relationships. While it can be used to inform predictions and forecasts, it is not a predictive modeling technique in itself.

Q. What are some challenges in applying Market Basket Analysis to large datasets?

A. Market Basket Analysis can be computationally intensive and time-consuming when applied to large datasets. This can be addressed by using parallel processing or distributed computing techniques, or by sampling the data to reduce the computational load.