For course
- iris.csv: The Iris dataset was used in R.A. Fisher's classic 1936 paper, "The Use of Multiple Measurements in Taxonomic Problems", and can also be found on the UCI Machine Learning Repository. It includes three iris species with 50 samples each, as well as some properties of each flower. One species is linearly separable from the other two, but the other two are not linearly separable from each other.
- irisInfo.txt: a description of iris.csv.
- train.csv: house prices training data (the Ames, Iowa housing dataset) from the kaggle.com competition at https://www.kaggle.com/c/house-prices-advanced-regression-techniques/overview
- test.csv: the corresponding test file from kaggle.com.
- IDBM-Movies.csv
- ted.csv: statistics from TED talks, with fields such as 'who did the talk', 'subject', 'event', 'ratings', etc. description
- wine.csv: Wine recognition dataset from the UC Irvine Machine Learning Repository. Intended for classification exercises.
- Titanic train.csv: Titanic survival dataset from UC Irvine Machine Learning Repository.
- processed.cleveland.csv: Heart disease data from Cleveland, slightly modified for educational purposes (https://archive.ics.uci.edu/ml/datasets/Heart+Disease)
- heart disease names: Description of the heart disease dataset processed.cleveland.csv
- train_loan.csv: A Practice Problem: Loan Prediction III from Analytics Vidhya
- Obama White House Documents: a small extract of all the documents released by the Obama White House, now found on obamawhitehouse.archives.gov.
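
The separability remark on iris.csv can be explored with a small sketch like the one below. It only tests a simple sufficient condition: whether a single numeric column puts one class in a value range that overlaps no other class. A failed check does not prove that two classes are not linearly separable. The column names `petal_length` and `species` are assumptions and may not match the header of the actual file; the helper names are hypothetical.

```python
import csv


def load_rows(path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def class_ranges(rows, feature, label):
    """Return {class_name: (min, max)} for one numeric feature column."""
    ranges = {}
    for row in rows:
        value = float(row[feature])
        lo, hi = ranges.get(row[label], (value, value))
        ranges[row[label]] = (min(lo, value), max(hi, value))
    return ranges


def separable_by_threshold(ranges, target):
    """True if target's value range overlaps no other class's range."""
    t_lo, t_hi = ranges[target]
    return all(
        t_hi < lo or hi < t_lo
        for name, (lo, hi) in ranges.items()
        if name != target
    )


# Example usage (assumed column names, adjust to the real header):
# rows = load_rows("iris.csv")
# ranges = class_ranges(rows, "petal_length", "species")
# for name in ranges:
#     print(name, separable_by_threshold(ranges, name))
```

If one class passes this check on some column, a single threshold on that column already separates it from the rest.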