The rapid growth of bike-sharing systems has led to the generation of large volumes of data related to weather conditions, temporal patterns, and user behavior. Accurately predicting bike rental demand is critical for:
- Optimizing bike distribution and resource allocation
- Improving customer satisfaction
- Reducing operational and maintenance costs
This project builds an end-to-end machine learning regression system to predict the total number of bike rentals (cnt) using historical data and multiple influencing factors.
Bike rental demand is affected by a combination of:
- 🌦 Weather conditions: temperature, humidity, windspeed
- 📆 Temporal patterns: season, month, weekday, working day
- 🎉 External factors: holidays, weather situations
The challenge is to design a robust regression-based ML pipeline that can effectively learn these patterns and deliver accurate predictions.
The project's objectives are to:
- Perform data preprocessing and exploratory data analysis (EDA)
- Train and compare multiple regression models
- Evaluate models using multiple performance metrics
- Track experiments, metrics, and models using MLflow
- Register and load the best-performing model
- Build an interactive Streamlit application for predictions and visualization
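The preprocessing step could look roughly like the sketch below. It assumes the data arrives as a pandas DataFrame; apart from `cnt`, every column name here is an assumption (this README does not list them), and `prepare_features` is a hypothetical helper, not the project's actual code.

```python
import pandas as pd

# Assumed column names; only `cnt` is named in this README.
CATEGORICAL_COLUMNS = ["season", "mnth", "weekday", "weathersit"]
TARGET_COLUMN = "cnt"


def prepare_features(df):
    """Split a rentals table into a one-hot encoded feature matrix and the target."""
    y = df[TARGET_COLUMN]
    X = df.drop(columns=[TARGET_COLUMN])
    # Encode only the categorical columns that are actually present.
    present = [c for c in CATEGORICAL_COLUMNS if c in X.columns]
    X = pd.get_dummies(X, columns=present, drop_first=True)
    return X, y


if __name__ == "__main__":
    # Tiny made-up table, for illustration only.
    sample = pd.DataFrame(
        {
            "season": [1, 2, 3, 4],
            "temp": [0.2, 0.4, 0.6, 0.3],
            "hum": [0.8, 0.6, 0.5, 0.7],
            "cnt": [100, 250, 400, 180],
        }
    )
    X, y = prepare_features(sample)
    print(X.columns.tolist())
```

One-hot encoding with `drop_first=True` avoids perfectly collinear dummy columns, which matters mostly for the linear model; tree-based models would also work with the raw integer codes.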
The following regression models are trained and evaluated:
- Linear Regression
- Random Forest Regressor
- Gradient Boosting Regressor
- K-Nearest Neighbors (KNN) Regressor
- Support Vector Regressor (SVR)
- LightGBM Regressor (LGBM)
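A minimal sketch of training and comparing the models listed above, assuming scikit-learn defaults for most hyperparameters and synthetic data in place of the real dataset. The function names, the 80/20 split, and the hyperparameter values are illustrative assumptions; LightGBM is included only if it is installed.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def build_models(random_state=42):
    """Return the candidate regressors keyed by a short display name."""
    models = {
        "LinearRegression": LinearRegression(),
        "RandomForest": RandomForestRegressor(n_estimators=100, random_state=random_state),
        "GradientBoosting": GradientBoostingRegressor(random_state=random_state),
        # KNN and SVR are sensitive to feature scale, so they get a scaler.
        "KNN": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
        "SVR": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0)),
    }
    try:
        from lightgbm import LGBMRegressor

        models["LightGBM"] = LGBMRegressor(random_state=random_state, verbose=-1)
    except ImportError:
        pass  # LightGBM is optional in this sketch.
    return models


def compare_models(X, y, random_state=42):
    """Fit every candidate on a train split and return test-set R² per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )
    scores = {}
    for name, model in build_models(random_state).items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    # Synthetic stand-in for the bike rental data.
    X_demo, y_demo = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
    for name, score in sorted(compare_models(X_demo, y_demo).items(), key=lambda kv: -kv[1]):
        print(f"{name}: R2 = {score:.3f}")
```

A single hold-out split keeps the sketch short; cross-validation would give a more stable ranking on a dataset of this size.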
Each model is evaluated using:
- R² Score
- Adjusted R²
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
- Accuracy-like metric: R² × 100
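The metrics above can be computed as in the following sketch, which uses NumPy only. Adjusted R² is taken as 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the number of samples and p the number of features; the function name `regression_metrics` and the dictionary keys are illustrative assumptions.

```python
import numpy as np


def regression_metrics(y_true, y_pred, n_features):
    """Compute R², adjusted R², MAE, MSE, RMSE and R² × 100 for one model."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    residuals = y_true - y_pred

    mse = float(np.mean(residuals ** 2))
    mae = float(np.mean(np.abs(residuals)))
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    # Penalize R² for the number of predictors (requires n > n_features + 1).
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)

    return {
        "r2": r2,
        "adjusted_r2": adj_r2,
        "mae": mae,
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
        "accuracy_like": r2 * 100.0,
    }


if __name__ == "__main__":
    print(regression_metrics([1, 2, 3, 4, 5], [1, 2, 3, 4, 6], n_features=1))
```

The accuracy-like value is simply R² expressed as a percentage; it is not a classification accuracy and can be negative when a model performs worse than predicting the mean.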
This project leverages MLflow to ensure reproducibility and experiment tracking:
- Logs model parameters
- Tracks evaluation metrics
- Stores trained models
- Registers the best-performing model
- Enables model loading for deployment
An interactive Streamlit web app lets users:
- Enter environmental and temporal features
- Predict bike rental demand in real time
- Visualize predictions and model performance
- Load models directly from the MLflow Model Registry
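To get started, clone the repository and launch the MLflow tracking UI:
- Clone the repository: `git clone https://github.com/your-username/bike-demand-prediction.git`
- Start the tracking UI: `mlflow ui`
- Open http://localhost:5000 in a browser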
The expected outcomes are to:
- Identify the best-performing regression model for bike demand prediction
- Understand which features most affect bike rentals
- Enable reproducible ML workflows using MLflow
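Feature importance can be inspected as in the sketch below, which assumes a fitted tree-based model exposing scikit-learn's `feature_importances_` attribute. The data is synthetic, the feature names are placeholders, and `rank_feature_importance` is a hypothetical helper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def rank_feature_importance(model, feature_names):
    """Return (feature, importance) pairs sorted from most to least important."""
    pairs = zip(feature_names, model.feature_importances_)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Synthetic data in which the first feature drives the target.
    rng = np.random.default_rng(0)
    X = rng.random((300, 3))
    y = 10.0 * X[:, 0] + 0.1 * rng.random(300)

    model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)
    for name, importance in rank_feature_importance(model, ["temp", "hum", "windspeed"]):
        print(f"{name}: {importance:.3f}")
```

These are impurity-based importances, which can overstate high-cardinality features; permutation importance on a held-out set is a common cross-check.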
The project is built with:
- Python
- Scikit-learn
- LightGBM
- MLflow
- Streamlit
- Pandas, NumPy, Matplotlib, Seaborn