GitHub - hpinmetaverse/PCLI-Predictive_Customer_Lifecycle_Intelligence

Predictive Customer Churn Analysis using ML

Project Overview

This project applies machine learning to predictive customer churn analysis. Customer churn, the loss of customers to competitors, is a critical metric for subscription-based businesses such as telecom or streaming services.

The goal is to build an interpretable ML pipeline that:

Predicts whether a customer is likely to churn
Explains why a prediction was made using Explainable AI (XAI) techniques
Deploys as a web application for real-time predictions

Dataset

IBM Telco Customer Churn Dataset (WA_Fn-UseC_-Telco-Customer-Churn.csv)

Size: 7,043 customer records
Features: 21 columns in total, including the customerID identifier and the Churn target; the rest cover tenure, monthly charges, contract type, services used, payment method, etc.
Target: Binary classification — Churn (Yes/No)

Key Features

Feature	Description
`tenure`	Number of months the customer has stayed
`MonthlyCharges`	Monthly billing amount
`TotalCharges`	Total amount charged
`Contract`	Contract type: Month-to-Month, One year, Two year
`InternetService`	DSL / Fiber optic / No
`PaymentMethod`	Bank transfer / Credit card / Electronic check / Mailed check
`SeniorCitizen`	Whether the customer is a senior citizen

Methodology

1. Data Preprocessing

Dropped irrelevant customerID column
Handled missing/inconsistent values in TotalCharges
Binary encoding for boolean features (Partner, Dependents, PhoneService, etc.)
One-hot encoding for categorical features (InternetService, Contract, PaymentMethod)
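
The preprocessing steps above can be sketched with pandas roughly as follows. This is a minimal illustration on a tiny invented frame, not the notebook's actual code: it assumes the missing TotalCharges values appear as blank strings, and filling them with 0 is only one possible choice.

```python
import pandas as pd

# Tiny stand-in frame using the Telco column names (all values are invented).
df = pd.DataFrame({
    "customerID": ["0001-A", "0002-B", "0003-C"],
    "Partner": ["Yes", "No", "No"],
    "tenure": [1, 0, 24],
    "Contract": ["Month-to-month", "Two year", "One year"],
    "TotalCharges": ["29.85", " ", "1400.5"],  # blank string stands for a missing value
    "Churn": ["Yes", "No", "No"],
})

# The identifier carries no predictive signal, so it is dropped.
df = df.drop(columns=["customerID"])

# Coerce to numeric: blanks become NaN, then get filled (assumption: fill with 0).
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce").fillna(0.0)

# Binary encoding for Yes/No columns.
for col in ["Partner", "Churn"]:
    df[col] = df[col].map({"Yes": 1, "No": 0})

# One-hot encoding for a multi-valued categorical column.
df = pd.get_dummies(df, columns=["Contract"], dtype=int)

print(df)
```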

2. Exploratory Data Analysis (EDA)

Identified key patterns and anomalies in the data
Analyzed churn distribution (class imbalance)
Feature correlation analysis
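
As a rough sketch of the two checks above, the class balance and the correlation of each numeric feature with the target can be read off with pandas. The frame below is invented for illustration, and Churn is assumed to be already encoded as 0/1.

```python
import pandas as pd

# Invented sample; the real analysis would use the full preprocessed dataset.
df = pd.DataFrame({
    "tenure": [1, 2, 5, 30, 45, 60, 3, 70],
    "MonthlyCharges": [70.0, 85.5, 90.1, 55.0, 40.2, 25.0, 99.9, 20.0],
    "Churn": [1, 1, 0, 0, 0, 0, 1, 0],
})

# Share of each class: a skewed split signals class imbalance.
churn_share = df["Churn"].value_counts(normalize=True)

# Pearson correlation of every feature with the target, sorted for readability.
corr_with_churn = df.corr()["Churn"].drop("Churn").sort_values()

print(churn_share)
print(corr_with_churn)
```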

3. Feature Engineering

Engineered new features and transformed existing variables
Applied SMOTE (Synthetic Minority Over-sampling Technique) to handle class imbalance
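
To illustrate what SMOTE does, the sketch below generates synthetic minority-class rows by interpolating between a sample and one of its nearest minority neighbours. It is a simplified NumPy illustration of the idea only; the project itself relies on imbalanced-learn, and the function name `smote_sketch` and its parameters are hypothetical.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)

    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # Pick a random base point and one of its k nearest neighbours for each new row.
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]

    # Place the synthetic point at a random position on the segment between them.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Invented minority-class rows: (tenure, MonthlyCharges).
X_minority = np.array([[1.0, 70.0], [2.0, 85.0], [3.0, 99.0], [4.0, 80.0]])
synthetic = smote_sketch(X_minority, n_new=6)
print(synthetic.shape)
```

Because every synthetic row lies on a segment between two real minority rows, the new points stay inside the region the minority class already occupies.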

4. Models Implemented

Model	Description
Logistic Regression	Baseline linear classifier
Decision Tree	Rule-based classifier
Random Forest	Ensemble model - best performing (used in deployment)
XGBoost	Gradient boosting classifier
MLP (PyTorch)	Custom multi-layer perceptron neural network

5. Model Evaluation

Evaluated using: Accuracy, Precision, Recall, F1 Score, ROC-AUC
Compared all models to identify best-performing algorithm
Final model: Random Forest Classifier
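
A comparison along these lines could look like the sketch below. It uses synthetic data from scikit-learn as a stand-in for the encoded Telco features, arbitrary hyperparameters, and only the three scikit-learn models; XGBoost and the PyTorch MLP are left out to keep the example dependency-light. The scores it prints say nothing about the project's actual results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (assumption: roughly 25% positive class).
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.75, 0.25], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)               # hard labels for threshold metrics
    proba = model.predict_proba(X_test)[:, 1]  # positive-class scores for ROC-AUC
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```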

6. Explainable AI (XAI)

SHAP (SHapley Additive exPlanations):
- Global feature importance via summary plots
- Local per-prediction explanations via force plots
LIME (Local Interpretable Model-agnostic Explanations):
- Local explanations for individual customer predictions
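
To make the per-prediction explanations more concrete, the sketch below computes exact Shapley values by brute force for a toy three-feature scoring function. It illustrates the quantity SHAP attributes to each feature, not the shap library's API or algorithms (which are far more efficient); the scoring function, its weights, and the baseline customer are all invented.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input; 'absent' features take baseline values."""
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def toy_score(features):
    """Hypothetical linear churn score over (tenure, monthly charges, month-to-month flag)."""
    tenure, monthly, month_to_month = features
    return 0.2 - 0.004 * tenure + 0.003 * monthly + 0.25 * month_to_month

customer = [2, 95.0, 1]    # invented customer to explain
baseline = [32, 65.0, 0]   # invented reference customer

phi = shapley_values(toy_score, customer, baseline)
print(dict(zip(["tenure", "MonthlyCharges", "month_to_month"], phi)))
```

The contributions add up to the difference between the customer's score and the baseline score, which is the additive property that force plots visualise.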

Project Structure

minor-project-churn/
├── data/
│   └── WA_Fn-UseC_-Telco-Customer-Churn.csv   # IBM Telco Dataset
├── models/
│   ├── model_rfc.pkl                            # Trained Random Forest model
│   ├── mlp_model.pkl                            # Trained MLP model
│   └── explainer_rfc.bz2                       # Pre-computed SHAP explainer
├── templates/
│   └── index.html                              # Flask web app template
├── notebooks/
│   └── main.ipynb                              # Main analysis notebook
├── app.py                                      # Flask web application
├── requirements.txt                            # Python dependencies
├── Pipfile                                     # Pipenv config
└── README.md                                   # Project documentation

Setup & Installation

Prerequisites

Python 3.9+
pip or pipenv

Installation

# Clone the repository
git clone <your-repo-url>
cd minor-project-churn

# Create virtual environment
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Running the Web App

python app.py

Open http://127.0.0.1:5000 in your browser.
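
For orientation, a prediction endpoint in a Flask app could be shaped roughly like the sketch below. This is not the contents of app.py: the `/predict` route, the three-feature input, and the JSON response are hypothetical, and a small model trained on random data stands in for the saved Random Forest in models/.

```python
import numpy as np
from flask import Flask, jsonify, request
from sklearn.ensemble import RandomForestClassifier

app = Flask(__name__)

# Stand-in model fitted on random data; a real app would load the trained model instead.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)
model = RandomForestClassifier(n_estimators=20, random_state=0).fit(X_demo, y_demo)

FEATURES = ["tenure", "MonthlyCharges", "TotalCharges"]  # hypothetical feature subset

@app.route("/predict", methods=["POST"])  # hypothetical route name
def predict():
    payload = request.get_json(force=True)
    row = [[float(payload[name]) for name in FEATURES]]
    proba = float(model.predict_proba(row)[0, 1])  # probability of the churn class
    return jsonify({"churn_probability": proba})

# To serve locally, call: app.run(host="127.0.0.1", port=5000)
```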

Running the Notebook

jupyter notebook notebooks/main.ipynb

Results

Best Model: Random Forest Classifier
Key Churn Predictors: Contract type, Tenure, Monthly Charges, Internet Service type
XAI Integration: SHAP force plots provide per-customer explanations; LIME provides local explanations of individual predictions
Deployment: Flask web app with real-time churn probability gauge and SHAP explanation

Technologies Used

Category	Tools
Language	Python 3.9+
ML Libraries	scikit-learn, XGBoost, imbalanced-learn (SMOTE)
Deep Learning	PyTorch
XAI	SHAP, LIME
Web Framework	Flask
Data Processing	pandas, NumPy
Visualization	Matplotlib, Seaborn
Notebook	Jupyter

References

IBM Telco Customer Churn Dataset - Kaggle
SHAP: Lundberg & Lee (2017) - "A Unified Approach to Interpreting Model Predictions"
LIME: Ribeiro et al. (2016) - "Why Should I Trust You?"
Scikit-learn documentation - https://scikit-learn.org

