
Commit 790e797

new data and data analysis resources
1 parent 9081c53 commit 790e797

2 files changed: +50 -0 lines changed

content/pages/03-data/00-data.markdown

Lines changed: 45 additions & 0 deletions
@@ -66,6 +66,51 @@ work of a massive number of engineers and scientists around the world who
 created the incredible mix of data code libraries available today.
 
 
+### Data inspiration
+Sometimes you just need to see it to understand how data analysis,
+visualization and storytelling can intersect in a meaningful way. The
+following resources do a great job of telling stories with data. There
+are more links to stories listed on the [data analysis](/data-analysis.html)
+and [data visualization](/data-visualization.html) pages.
+
+* [Metadata Investigation : Inside Hacking Team](https://labs.rs/en/metadata/)
+presents what metadata is and how it can be used to track people even though
+it is often thought of as less of a problem than typical stored data.
+
+* [A visual introduction to machine learning](http://www.r2d3.us/visual-intro-to-machine-learning-part-1/)
+provides a spectacular example of
+[data visualization](/data-visualization.html) to explain what a machine
+learning model does on a San Francisco and New York housing data set.
+
+* [Earthquake recurrence and survival analysis: How long should we wait for an overdue earthquake?](http://rocksandwater.net/blog/2016/07/wrightwood-recurrence/)
+combines earthquake data with questions around earthquake recurrence
+probabilities to tell its story.
+
+
+### Example data sets
+Looking for freely available data to use in your projects but aren't
+sure where to get it? The following links have large, free, open data
+sets.
+
+* Check out the
+[awesome public datasets](https://github.com/awesomedata/awesome-public-datasets)
+project repository for data in many different categories ranging from
+finance to museums.
+
+* [Kickstarter datasets](https://webrobots.io/kickstarter-datasets/)
+are scraped JSON and CSV structured monthly data from Kickstarter
+projects.
+
+* [Data is Plural](https://tinyletter.com/data-is-plural) is a weekly
+newsletter that highlights open data that you can use for your projects.
+I have been a subscriber to the newsletter for a couple of years now and
+love seeing the wide variety of data sources that are freely available.
+
+* [Data analysis and machine learning projects](https://github.com/rhiever/Data-Analysis-and-Machine-Learning-Projects)
+provides more than just the data: it also includes instructions and
+code for working with the data in your own development environment.
+
+
 ### General Python data resources
 * [PyData](http://pydata.org/) is a community for developers and users of
 Python data tools. They put on fantastic conferences around the world and fund

content/pages/03-data/15-data-analysis.markdown

Lines changed: 5 additions & 0 deletions
@@ -72,6 +72,11 @@ libraries from scratch.
 to show the most-commented issues and issues by version number
 throughout the project's history.
 
+* [Divergent and Convergent Phases of Data Analysis](https://simplystatistics.org/2018/09/14/divergent-and-convergent-phases-of-data-analysis/)
+examines the flow most people doing data science and analysis projects
+go through during the exploration, synthesis, modeling and narration
+phases.
+
 * [Gender Distribution in North Korean Posters with Convolutional Neural Networks](http://digitalnk.com/blog/2017/09/30/gender-distribution-in-north-korean-posters/)
 is a fascinating post that uses convolutional neural networks as a
 mechanism to identify gender by faces in North Korean posters. The
