This repository contains my notes and supporting documents for the Data Science Career Track course from Springboard.com. It also serves as an index (and a showcase) of my capstone projects and mini-projects, mainly from that course, along with work from other courses and projects.
A capstone project is a relatively large project where multiple aspects of data science are employed, including data wrangling, inferential statistics, and machine learning.
This project comes from a Kaggle competition. Walmart wants to understand how weather conditions relate to its sales. More specifically, Walmart wants a model that can "accurately predict the sales of 111 potentially weather-sensitive products (like umbrellas, bread, and milk) around the time of major weather events at 45 of their retail locations".
To this end, in this capstone project, I would like to accomplish the following objectives:
1. Build a model that predicts product sales around major weather events.
2. Evaluate how weather conditions affect sales.
File location: https://github.com/ZackCode/Capstone1_Walmert_Sales
This project comes from the Kaggle competition "Data Science Bowl 2017". This year's topic is lung cancer detection. The goal defined by the competition is:
"Using a data set of thousands of high-resolution lung scans provided by the National Cancer Institute, participants will develop algorithms that accurately determine when lesions in the lungs are cancerous."
To this end, in this capstone project, I would like to accomplish the following objectives:
1. Build a model that detects lung cancer.
2. Compare its performance with that of other submissions.
File location: https://github.com/ZackCode/Capstone2_Cancer_Detection
A mini project is a small project where the task is limited to a certain area of data science.
- JSON Based Data Exercises: Project report location: https://github.com/ZackCode/mini_project_data_wrangling_json/blob/master/sliderule_dsi_json_exercise.ipynb
- SQL Practice Cases at modeanalytics.com: Project location: https://github.com/ZackCode/SQL_mini_projects_From_Mode/tree/master/Analysis_1
- Human Body Temperature Using EDA: Project location: https://github.com/ZackCode/mini_project_human_temp
- Examine Racial Discrimination Using EDA: Project location: https://github.com/ZackCode/mini_project_racial_disc
- Reduce Hospital Readmissions Using EDA: Project location: https://github.com/ZackCode/mini_project_hospital_readmit
- Linear Regression Using Boston Housing Data Set: Project location: https://github.com/ZackCode/mini_project_linear_regression
- Heights and Weights Using Logistic Regression: Project location: https://github.com/ZackCode/mini_project_logistic_regression
- Predicting Movie Ratings from Reviews Using Naive Bayes: Project location: https://github.com/ZackCode/mini_project_naive_bayes
- Customer Segmentation Using Clustering: Project location: https://github.com/ZackCode/mini_project_clustering
- MapReduce with Spark: Project location: https://github.com/ZackCode/spark
To be added
To be added