The Business Crisis

16.8% Churn Rate

947 out of 5,630 customers have churned, representing significant revenue loss

0-3 Month Window

New customers in their first quarter show the highest risk, with a churn rate of nearly 50%

$236K at Risk

Immediate revenue exposure from at-risk customers without intervention

Why This Matters

Customer acquisition costs in e-commerce typically range from $50 to $200 per customer. With 947 churned customers, the company has lost between $47,350 and $189,400 in acquisition spend alone (947 × $50 and 947 × $200), not counting those customers' lost lifetime value.
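The headline figures above follow from simple arithmetic. The short sketch below reproduces them from the counts quoted on this page; the variable names are illustrative, and the $50 to $200 range is the page's stated assumption rather than a measured cost.

```python
# Counts quoted on this page.
total_customers = 5_630
churned_customers = 947

# Assumed acquisition cost per customer in USD (the typical range cited above).
cac_low, cac_high = 50, 200

churn_rate = churned_customers / total_customers
loss_low = churned_customers * cac_low
loss_high = churned_customers * cac_high

print(f"Churn rate: {churn_rate:.1%}")
print(f"Acquisition spend lost: ${loss_low:,} to ${loss_high:,}")
```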

The 0-3 month churn crisis is particularly critical because:

  • These customers haven't recovered their acquisition cost
  • Early negative experiences create lasting brand damage
  • High early churn signals systemic onboarding or product issues
  • Retaining a customer is commonly cited as 5-25x cheaper than acquiring a replacement

Dataset Explorer

Explore both the raw and cleaned versions of the dataset used in this analysis

5,630 Customers

Complete customer records

20 Features

Behavioral & demographic data

E-Commerce Platform

Real-world transaction data

About the Raw Dataset

This dataset contains the original, unprocessed customer data from an e-commerce platform. It includes behavioral patterns, transaction history, satisfaction scores, and churn status for 5,630 customers. The raw data must be preprocessed to handle missing values and outliers and to prepare features for machine learning models.

5,630 Records

No missing values

Normalized Features

Ready for modeling

Feature Engineered

Enhanced predictive power

About the Cleaned Dataset

The cleaned dataset has undergone comprehensive preprocessing and feature engineering: handling missing values, removing outliers, encoding categorical variables, creating interaction features, and normalizing numerical features. It is ready for model training, and models trained on it reach the 89.3% accuracy reported in this analysis.

Data Cleaning Pipeline:

  • Handled missing values using median/mode imputation
  • Removed outliers using IQR method
  • Encoded categorical variables (Label & One-Hot encoding)
  • Created tenure buckets (0-3, 3-6, 6-12, 12+ months)
  • Normalized continuous features using StandardScaler
  • Used stratified sampling to preserve the churn class ratio despite class imbalance
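Several of the steps above can be sketched in a few lines. The example below is a minimal, dependency-free illustration, not the pipeline actually used for this analysis: the function names and the toy tenure column are hypothetical, the 1.5 × IQR fence is the conventional default rather than a documented setting, and the tenure buckets are assumed to be half-open intervals (a tenure of exactly 3 months falls in "3-6"). In practice these steps would more likely be done with pandas and scikit-learn.

```python
from statistics import mean, median, pstdev, quantiles

def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def tenure_bucket(months):
    """Map tenure in months to a bucket label (half-open intervals assumed)."""
    if months < 3:
        return "0-3"
    if months < 6:
        return "3-6"
    if months < 12:
        return "6-12"
    return "12+"

def standardize(values):
    """Z-score with the population standard deviation, as StandardScaler does."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical toy tenure column (months); not taken from the dataset.
tenure = [1, 2, None, 4, 8, 15, 30, 200]

filled = impute_median(tenure)                      # None becomes the median
low, high = iqr_bounds(filled)                      # outlier fences
kept = [v for v in filled if low <= v <= high]      # drops the extreme value
buckets = [tenure_bucket(v) for v in kept]          # categorical tenure feature
scaled = standardize(kept)                          # zero mean, unit variance
```

Note that `standardize` does not guard against a constant column (zero standard deviation); a real pipeline would need to handle that case.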

Data Dictionary

Understanding the features used in the analysis

Customer Profile

CustomerID

Unique identifier for each customer

Tenure

Months as active customer (0-61 months)

PreferredLoginDevice

Primary device used: Mobile, Computer, Phone

CityTier

City classification: Tier 1, 2, or 3

Behavioral Metrics

WarehouseToHome

Distance between warehouse and customer (km)

PreferredPaymentMode

Preferred payment method

HourSpendOnApp

Average hours spent on mobile app

NumberOfDeviceRegistered

Total devices registered to account

SatisfactionScore

Customer satisfaction rating (1-5)

OrderCount

Total number of orders placed

Financial Indicators

CashbackAmount

Average cashback earned per order

OrderAmountHikeFromlastYear

Percentage increase in order value over the previous year

CouponUsed

Number of coupons used in last month

Risk Indicators

Complain

Binary: Has filed complaint (Yes/No)

DaySinceLastOrder

Days since most recent purchase

Churn

Target variable: Customer churned (1) or active (0)
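As a rough illustration of how the documented ranges could be checked, the sketch below models a subset of the fields as a Python dataclass. The field names come from the dictionary above; the class name, the Python types, the boolean encoding of Complain, and the `problems` helper are assumptions made for this example only.

```python
from dataclasses import dataclass

@dataclass
class CustomerRecord:
    """Subset of the data dictionary; types are assumed for illustration."""
    CustomerID: int
    Tenure: float            # months as active customer, documented as 0-61
    CityTier: int            # 1, 2, or 3
    SatisfactionScore: int   # 1-5
    Complain: bool           # Yes/No, encoded here as a boolean (assumption)
    Churn: int               # 1 = churned, 0 = active

    def problems(self):
        """Return a list of range violations (empty if the record is valid)."""
        issues = []
        if not 0 <= self.Tenure <= 61:
            issues.append("Tenure outside 0-61 months")
        if self.CityTier not in (1, 2, 3):
            issues.append("CityTier not in 1-3")
        if not 1 <= self.SatisfactionScore <= 5:
            issues.append("SatisfactionScore outside 1-5")
        if self.Churn not in (0, 1):
            issues.append("Churn not 0 or 1")
        return issues

# Hypothetical records; the values are made up for the example.
valid = CustomerRecord(1, 4.0, 1, 3, False, 0)
invalid = CustomerRecord(2, 75.0, 4, 3, True, 1)
```

Calling `valid.problems()` returns an empty list, while `invalid.problems()` reports the out-of-range Tenure and CityTier.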

Initial Data Insights

~50%

Churn rate for customers in 0-3 month tenure

31.7%

Churn rate for customers who filed complaints

3-4

Satisfaction score range showing highest churn

52%

Feature importance of Tenure in the model
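Segment-level churn rates such as the ones above are group-wise averages of the Churn flag. The sketch below shows one way to compute them; the `churn_rate_by` helper and the four toy rows are hypothetical and do not reproduce the reported figures.

```python
from collections import defaultdict

def churn_rate_by(records, key):
    """Return {segment: churned / total}, with segments given by key(row)."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in records:
        segment = key(row)
        totals[segment] += 1
        churned[segment] += row["Churn"]  # Churn is 1 (churned) or 0 (active)
    return {segment: churned[segment] / totals[segment] for segment in totals}

# Hypothetical toy rows; not taken from the dataset.
rows = [
    {"Tenure": 1, "Complain": 1, "Churn": 1},
    {"Tenure": 2, "Complain": 0, "Churn": 0},
    {"Tenure": 8, "Complain": 1, "Churn": 0},
    {"Tenure": 20, "Complain": 0, "Churn": 0},
]

by_complaint = churn_rate_by(rows, lambda r: r["Complain"])
by_early_tenure = churn_rate_by(rows, lambda r: r["Tenure"] < 3)
```

The same helper works for any segmentation, for example by SatisfactionScore or by tenure bucket, by swapping the key function.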
