| 1 | +Original: [An Introduction to Scientific Python (and a Bit of the Maths Behind It) - Matplotlib](http://www.jamalmoir.com/2016/04/scientific-python-matplotlib.html) |
| 2 | + |
| 3 | +--- |
| 4 | + |
| 5 | +One of the most popular uses of Python, especially in recent years, is data |
| 6 | +processing, analysis and visualisation. This then leads to topics such as the |
| 7 | +analysis of 'big data', which has many applications in pretty much every type |
| 8 | +of business you can imagine, and a personal interest of mine: machine |
| 9 | +learning. |
| 10 | + |
| 11 | +Python has a vast array of powerful tools available to help with this |
| 12 | +processing, analysis and visualisation of data, which is one of the main reasons |
| 13 | +that Python has gained such momentum in the scientific world. |
| 14 | + |
| 15 | +In this series of posts, we will take a look at the main libraries used in |
| 16 | +scientific Python and learn how to use them to bend data to our will. We won't |
| 17 | +just be learning to churn out template code, however; we will also learn a bit |
| 18 | +of the maths behind it so that we can understand what is going on a little |
| 19 | +better. |
| 20 | + |
| 21 | +So let's kick things off with an incredibly useful little number that we will |
| 22 | +be using throughout this series of posts: Matplotlib. |
| 23 | + |
| 24 | + |
| 25 | +### WHAT IS MATPLOTLIB? |
| 26 | + |
| 27 | +Simply put, it's a graphing library for Python. It has a humongous array of |
| 28 | +tools that you can use to create anything from simple scatter plots, to sine |
| 29 | +curves, to 3D graphs. It is used heavily in the scientific Python community |
| 30 | +for data visualisation. |
| 31 | + |
| 32 | +You can read more about the ideas behind Matplotlib on their |
| 33 | +[website](http://matplotlib.org/), but I especially recommend taking a look at |
| 34 | +their [gallery](http://matplotlib.org/gallery.html) to see the amazing things |
| 35 | +you can pull off with this library. |
| 36 | + |
| 37 | + |
| 38 | + |
| 39 | +### PLOTTING A SIMPLE GRAPH |
| 40 | + |
| 41 | +To get started we will plot a simple sine wave from 0 to 2 pi. You will notice |
| 42 | +that we are using NumPy here. Don't worry too much about it for now if you |
| 43 | +don't know how to use it; we will be covering NumPy in the next post. |
| 44 | + |
| 45 | +```python |
| 46 | +import matplotlib.pyplot as plt |
| 47 | +import numpy as np |
| 48 | +``` |
| 49 | + |
| 50 | +These are the imports we will be using. As I've mentioned in a previous |
| 51 | +[post](http://www.jamalmoir.com/2016/02/pythonic-idioms-others.html) (and |
| 52 | +[others](http://www.jamalmoir.com/2016/04/how-to-build-gui-in-python-3.html)) |
| 53 | +the 'from x import *' way of importing is not good: it hides where names come from. We don't want to be typing |
| 54 | +out matplotlib.pyplot and numpy all the time though, as they are long, so we will |
| 55 | +use the above compromise of short aliases. |
| 56 | + |
| 57 | +```python |
| 58 | +# Basic plotting. |
| 59 | +x = np.linspace(0, 2 * np.pi, 50) |
| 60 | +plt.plot(x, np.sin(x))  # Without the first x, array indices will be used on the x axis. |
| 61 | +plt.show()  # Show the graph. |
| 62 | +``` |
| 63 | + |
| 64 | + |
| 65 | +The above code will produce a simple sine curve. The 'np.linspace(0, 2 * np.pi, 50)' bit of code produces an array of 50 evenly spaced numbers from 0 to 2 pi, with both ends included. |
| 66 | + |
| 67 | +The plot command is the short and sweet line of code that actually creates the |
| 68 | +graph. Note that without the first x argument used here, instead of the x axis |
| 69 | +going from 0 to 2 pi, it would use the array indices of the y values (0 to 49 |
| 70 | +here). |
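
As a small sketch of that last point (not part of the original post), we can plot the same data with and without x and then look at what Matplotlib stored for the x axis. The variable names are only for illustration, and plt.show() is left out so the snippet runs without opening a window; add it at the end if you want to see the graphs.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x)

# Passing only y: Matplotlib falls back to the array indices for the x axis.
plt.figure()
(line_by_index,) = plt.plot(y)

# Passing x and y: the x axis follows the values in x.
plt.figure()
(line_by_x,) = plt.plot(x, y)

# Inspect the x data each line ended up with.
print(line_by_index.get_xdata()[:3], line_by_index.get_xdata()[-1])
print(line_by_x.get_xdata()[0], line_by_x.get_xdata()[-1])
```

The first line should report indices running up to 49, the second values running up to 2 pi.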
| 71 | + |
| 72 | +The final bit of code, plt.show(), displays the graph; without it nothing will |
| 73 | +appear. |
| 74 | + |
| 75 | +You will get something like this: |
| 76 | + |
| 77 | +[basic_plotting.png](https://2.bp.blogspot.com/-VS9khXhPaQ0/Vx8bh6VyJmI/AAAAAAAAC3M/JU7-X7SFYiY7Y4v-SlJQGNRolStNlDfGwCLcB/s1600/basic_plotting.png) |
| 81 | + |
| 82 | + |
| 83 | + |
| 84 | +### PLOTTING TWO DATASETS ON ONE GRAPH |
| 85 | + |
| 86 | +A lot of the time you will want to plot more than one dataset on a graph. In |
| 87 | +Matplotlib this is simple. |
| 88 | + |
| 89 | +```python |
| 90 | +# Plotting two data sets on one graph. |
| 91 | +x = np.linspace(0, 2 * np.pi, 50) |
| 92 | +plt.plot(x, np.sin(x), |
| 93 | +         x, np.sin(2 * x)) |
| 94 | +plt.show() |
| 95 | +``` |
| 96 | + |
| 97 | + |
| 98 | +The above code plots both the graphs for sin(x) and sin(2x). It is pretty much |
| 99 | +the same as the previous code for plotting one dataset, except this time |
| 100 | +inside the same plt.plot() call, we define another dataset separated by a |
| 101 | +comma. |
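
If the single long call feels cramped, a hedged alternative is to call plt.plot() once per dataset; successive calls draw onto the same axes until you show or close the figure. This is just a sketch, with plt.show() left out so it runs without a window.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)

# Each call adds one more line to the current axes.
plt.plot(x, np.sin(x))
plt.plot(x, np.sin(2 * x))

# The current axes now hold both lines.
n_lines = len(plt.gca().get_lines())
print(n_lines)
```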
| 102 | + |
| 103 | +You will end up with a graph with two lines on it, like this: |
| 104 | + |
| 105 | +[plotting_two_datasets.png](https://2.bp.blogspot.com/-Dxcwog-mOwY/Vx8f-srDOnI/AAAAAAAAC3k/Fc0bV86B_LIPayQ-liK8vyWBdHnOmYAygCLcB/s1600/plotting_two_datasets.png) |
| 109 | + |
| 110 | + |
| 111 | + |
| 112 | +### CUSTOMISING THE LOOK OF LINES |
| 113 | + |
| 114 | +When you have multiple datasets on one graph, it is useful to be able to change |
| 115 | +the look of the plotted lines to make differentiating between the datasets |
| 116 | +easier. |
| 117 | + |
| 118 | +```python |
| 119 | +# Customising the look of lines. |
| 120 | +x = np.linspace(0, 2 * np.pi, 50) |
| 121 | +plt.plot(x, np.sin(x), 'r-o', |
| 122 | +         x, np.cos(x), 'g--') |
| 123 | +plt.show() |
| 124 | +``` |
| 125 | + |
| 126 | + |
| 127 | +In the above code you can see two examples of different line stylings: 'r-o' |
| 128 | +and 'g--'. The letters 'r' and 'g' are the line colours and the following |
| 129 | +symbols are the line and marker styles. For example '-o' creates a solid line |
| 130 | +with circle markers and '--' creates a dashed line. As with most of the aspects of |
| 131 | +Matplotlib, the best thing to do here is play. |
| 132 | + |
| 133 | + |
| 134 | +> **Colours:** |
| 135 | +Blue - 'b' |
| 136 | +Green - 'g' |
| 137 | +Red - 'r' |
| 138 | +Cyan - 'c' |
| 139 | +Magenta - 'm' |
| 140 | +Yellow - 'y' |
| 141 | +Black - 'k' ('b' is taken by blue so the last letter is used) |
| 142 | +White - 'w' |
| 143 | + |
| 144 | +> **Lines:** |
| 145 | +Solid Line - '-' |
| 146 | +Dashed - '--' |
| 147 | +Dotted - ':' |
| 148 | +Dash-dotted - '-.' |
| 149 | + |
| 150 | +> **Often Used Markers:** |
| 151 | +Point - '.' |
| 152 | +Pixel - ',' |
| 153 | +Circle - 'o' |
| 154 | +Square - 's' |
| 155 | +Triangle - '^' |
| 156 | +For more markers click [here](http://matplotlib.org/api/markers_api.html). |
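
The short format strings can also be spelled out with the keyword arguments color, linestyle and marker, which some people find easier to read back later. The sketch below (my own illustration, with made-up variable names) draws the same style both ways and adds a dotted and a dash-dotted line; plt.show() is left out so it runs without a window.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)

# Format string: red, solid line, circle markers.
(short,) = plt.plot(x, np.sin(x), 'r-o')

# The same styling written out with keyword arguments.
(verbose,) = plt.plot(x, np.sin(x) + 1, color='r', linestyle='-', marker='o')

# Dotted (':') and dash-dotted ('-.') lines.
(dotted,) = plt.plot(x, np.cos(x), 'b:')
(dash_dot,) = plt.plot(x, np.cos(x) + 1, 'k-.')

# Each line object remembers the style it was given.
print(short.get_color(), short.get_linestyle(), short.get_marker())
print(dotted.get_linestyle(), dash_dot.get_linestyle())
```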
| 157 | + |
| 158 | + |
| 159 | +You will end up with something like this: |
| 160 | + |
| 161 | +[line_customisation.png](https://2.bp.blogspot.com/-SyYPwSwE8Wo/Vx8gGT4CFVI/AAAAAAAAC3o/QlbAw5rb2SgKkN6rDKHiuS1YSfYi9LMXwCLcB/s1600/line_customisation.png) |
| 165 | + |
| 166 | + |
| 167 | + |
| 168 | +### USING SUBPLOTS |
| 169 | + |
| 170 | +Subplots allow you to plot multiple graphs in one window. |
| 171 | + |
| 172 | +```python |
| 173 | +# Using subplot. |
| 174 | +x = np.linspace(0, 2 * np.pi, 50) |
| 175 | +plt.subplot(2, 1, 1)  # (row, column, active area) |
| 176 | +plt.plot(x, np.sin(x), 'r') |
| 177 | +plt.subplot(2, 1, 2) |
| 178 | +plt.plot(x, np.cos(x), 'g') |
| 179 | +plt.show() |
| 180 | +``` |
| 181 | + |
| 182 | + |
| 183 | +When using subplots, we plot datasets as in the previous examples but with one |
| 184 | +extra step. Before calling the plot() function, we first call the subplot() |
| 185 | +function. The first argument is the number of rows you want the subplot to |
| 186 | +have, the second is the number of columns and the third is the active area. |
| 187 | + |
| 188 | +The active area is the subplot you are currently working on. Areas are numbered |
| 189 | +from 1, left to right and top to bottom. For example, in a 4x4 grid of subplots, |
| 190 | +active area 6 would be (2,2) on the grid. |
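
The numbering is just integer division in disguise. Here is a tiny sketch of the arithmetic; active_area_to_cell is a hypothetical helper written for this post, not a Matplotlib function.

```python
def active_area_to_cell(index, n_cols):
    """Return the 1-based (row, column) for a 1-based subplot index."""
    # Shift to 0-based, split into whole rows and the leftover column.
    row, col = divmod(index - 1, n_cols)
    return row + 1, col + 1


# Active area 6 in a grid that is 4 columns wide.
print(active_area_to_cell(6, 4))  # (2, 2)
```

Only the number of columns matters here, because the areas are counted across each row before moving down to the next one.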
| 191 | + |
| 192 | +You should have two graphs like this: |
| 193 | + |
| 194 | +[subplot.png](https://3.bp.blogspot.com/-QJ25uc76pkI/Vx8gYV4GXDI/AAAAAAAAC3s/3UZLSVBigYwkrrojLqeQU1At9xI3K2FKQCLcB/s1600/subplot.png) |
| 198 | + |
| 199 | + |
| 200 | + |
| 201 | +### SIMPLE SCATTER GRAPHS |
| 202 | + |
| 203 | +Scatter graphs are a collection of points that are not connected by a line. |
| 204 | +Again, Matplotlib makes this a trivial task. |
| 205 | + |
| 206 | +```python |
| 207 | +# Simple scatter plotting. |
| 208 | +x = np.linspace(0, 2 * np.pi, 50) |
| 209 | +y = np.sin(x) |
| 210 | +plt.scatter(x, y) |
| 211 | +plt.show() |
| 212 | +``` |
| 213 | + |
| 214 | + |
| 215 | +As the above code shows, all you do is call the scatter() function and pass it |
| 216 | +two arrays of x and y coordinates. Note that this can also be reproduced by |
| 217 | +using the plot command with the line styling 'bo'. |
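
To make that comparison concrete, here is a sketch that draws the same points side by side, once with scatter() and once with plot() and the 'bo' styling. The layout and variable names are my own choice; plt.show() is left out so the snippet runs without a window.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)
y = np.sin(x)

# Left: a true scatter plot.
plt.subplot(1, 2, 1)
points = plt.scatter(x, y)

# Right: plot() with blue circle markers and no connecting line.
plt.subplot(1, 2, 2)
(markers,) = plt.plot(x, y, 'bo')

# scatter() stores 50 (x, y) pairs; plot() stores a marker style instead.
print(points.get_offsets().shape)
print(markers.get_marker())
```

For plain points the two look much the same; scatter() earns its keep once each point needs its own size or colour.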
| 218 | + |
| 219 | +You should end up with a graph with no line like so: |
| 220 | + |
| 221 | +[scatter_plot.png](https://1.bp.blogspot.com/-P_lnwSgm0SE/Vx8giEY0AGI/AAAAAAAAC30/cPg3-56N04s-VZh23EXy5cPlAFWKTcV7ACLcB/s1600/scatter_plot.png) |
| 225 | + |
| 226 | + |
| 227 | + |
| 228 | +### COLOUR MAP SCATTER GRAPHS |
| 229 | + |
| 230 | +Another graph you might want to produce is a colour mapped scatter graph. |
| 231 | +Here we will vary the colour and the size of each point according to the data |
| 232 | +and add a colour bar too. |
| 233 | + |
| 234 | +```python |
| 235 | +# Colormap scatter plotting. |
| 236 | +x = np.random.rand(1000) |
| 237 | +y = np.random.rand(1000) |
| 238 | +size = np.random.rand(1000) * 50 |
| 239 | +colour = np.random.rand(1000) |
| 240 | +plt.scatter(x, y, size, colour) |
| 241 | +plt.colorbar() |
| 242 | +plt.show() |
| 243 | +``` |
| 244 | + |
| 245 | + |
| 246 | +In the above code you can see np.random.rand(1000) a lot. The reason for this |
| 247 | +is that we are simply randomly generating data to plot. |
| 248 | + |
| 249 | +As before we use the same scatter() function, but this time pass it an extra |
| 250 | +two arguments, the size and the colour of the point we want to plot. By doing |
| 251 | +this, the points plotted on the graph will vary in size and colour depending |
| 252 | +on the data we pass. |
| 253 | + |
| 254 | +We then add a colour bar with the function colorbar(). |
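
The two extra positional arguments correspond to the keywords s (size) and c (colour) of scatter(), and naming them makes the call easier to read. The sketch below also seeds the random generator so the picture is repeatable and picks the 'viridis' colour map explicitly; the seed, the colour map and the bar label are my own assumptions, not choices made in the code above.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator, so every run produces the same points.
rng = np.random.default_rng(0)
x = rng.random(1000)
y = rng.random(1000)
size = rng.random(1000) * 50
colour = rng.random(1000)

# s sets the marker sizes, c the values that are mapped onto colours.
points = plt.scatter(x, y, s=size, c=colour, cmap='viridis')
bar = plt.colorbar()
bar.set_label('colour value')

print(points.get_sizes().shape, points.get_cmap().name)
```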
| 255 | + |
| 256 | +You will end up with a colourful scatter graph that will look something like |
| 257 | +this: |
| 258 | + |
| 259 | +[colormap_scatter.png](https://1.bp.blogspot.com/-H5K-UlMP0M8/Vx8gvBZc1fI/AAAAAAAAC38/0kszxuM08FII1yfAAOdRn4ZtXbTaCDhZgCLcB/s1600/colormap_scatter.png) |
| 263 | + |
| 264 | + |
| 265 | + |
| 266 | + |
| 267 | + |
| 268 | +### HISTOGRAMS |
| 269 | + |
| 270 | +Histograms are another type of graph that is used frequently and again can be |
| 271 | +created with very few lines of code. |
| 272 | + |
| 273 | +```python |
| 274 | +# Histograms |
| 275 | +x = np.random.randn(1000) |
| 276 | +plt.hist(x, 50) |
| 277 | +plt.show() |
| 278 | +``` |
| 279 | + |
| 280 | + |
| 281 | +A histogram is one of the simplest types of graphs to plot in Matplotlib. All |
| 282 | +you need to do is pass the hist() function an array of data. The second |
| 283 | +argument specifies the number of bins to use. Bins are intervals of values |
| 284 | +that our data will fall into. The more bins, the more bars. |
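
To see the bins for yourself, note that hist() also returns what it drew: the count in each bin, the bin edges and the bar objects. A minimal sketch, assuming a seeded generator so the numbers are repeatable (the seed is my own addition); plt.show() is left out so it runs without a window.

```python
import matplotlib.pyplot as plt
import numpy as np

# 1000 samples from a standard normal distribution, seeded for repeatability.
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)

# hist() returns the per-bin counts, the bin edges and the drawn bars.
counts, bin_edges, bars = plt.hist(x, 50)

# 50 bins need 51 edges, and every sample lands in exactly one bin.
print(len(counts), len(bin_edges), counts.sum())
```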
| 285 | + |
| 286 | +You will now have a histogram like the following: |
| 287 | + |
| 288 | +[histogram.png](https://1.bp.blogspot.com/-MOV5V-3EMBI/Vx8g6clgBDI/AAAAAAAAC4A/42F3uzL36REn1-14STovVpcZBxw_Nz4cQCLcB/s1600/histogram.png) |
| 292 | + |
| 293 | + |
| 294 | + |
| 295 | +### TITLES, LABELS AND LEGENDS |
| 297 | + |
| 298 | +When quickly bringing up graphs for your own sake, you might not always need |
| 299 | +to label your graphs. However, when producing a graph that will be shown to |
| 300 | +other people, adding titles, labels and legends is a must. |
| 301 | + |
| 302 | +```python |
| 303 | +# Adding a title, axis labels and legend. |
| 304 | +x = np.linspace(0, 2 * np.pi, 50) |
| 305 | +plt.plot(x, np.sin(x), 'r-x', label='Sin(x)') |
| 306 | +plt.plot(x, np.cos(x), 'g-^', label='Cos(x)') |
| 307 | +plt.legend()  # Display the legend. |
| 308 | +plt.xlabel('Rads')  # Add a label to the x-axis. |
| 309 | +plt.ylabel('Amplitude')  # Add a label to the y-axis. |
| 310 | +plt.title('Sin and Cos Waves')  # Add a graph title. |
| 311 | +plt.show() |
| 312 | +``` |
| 313 | + |
| 314 | + |
| 315 | +To add legends to our graph, inside the plot() function we add the keyword |
| 316 | +argument 'label' and assign it what we want the line to be labelled with. We |
| 317 | +then call the legend() function and a legend will be placed on our graph. |
| 318 | + |
| 319 | +To add a title and labels all we have to do is use the self-explanatory |
| 320 | +title(), xlabel() and ylabel() functions and we are done. |
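
By default Matplotlib picks a spot for the legend itself. If it lands on top of your data, legend() accepts a loc argument such as 'upper right' or 'lower left'. The sketch below places the legend explicitly and then reads the labels and title back from the axes; the chosen location is only an example, and plt.show() is left out so the snippet runs without a window.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 50)
plt.plot(x, np.sin(x), 'r-x', label='Sin(x)')
plt.plot(x, np.cos(x), 'g-^', label='Cos(x)')

# Pin the legend to a corner instead of letting Matplotlib choose.
legend = plt.legend(loc='upper right')
plt.xlabel('Rads')
plt.ylabel('Amplitude')
plt.title('Sin and Cos Waves')

# Read everything back from the current axes.
axes = plt.gca()
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels, axes.get_xlabel(), axes.get_ylabel(), axes.get_title())
```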
| 321 | + |
| 322 | + |
| 323 | +You should now have a titled, labelled and legended graph like this: |
| 324 | + |
| 325 | +[labeling.png](https://1.bp.blogspot.com/--zjIQwFCxMg/Vx8hDfO0YTI/AAAAAAAAC4I/ycHxd42hGlMHfCRumwF1DaozQYRJep56ACLcB/s1600/labeling.png) |
| 329 | + |
| 330 | + |
| 331 | + |
| 332 | +This should be enough to get you going with visualisation of data using |
| 333 | +Matplotlib and Python, but is by no means exhaustive. One thing I strongly |
| 334 | +recommend you do, as it really helped me get to grips with this tool, is to |
| 335 | +just play. Plot a few graphs, play with styling and subplots and you will know |
| 336 | +your way around Matplotlib in no time at all. |
| 337 | + |
| 338 | +This has been a post on data visualisation with Matplotlib and Python, the |
| 339 | +first in a series of posts on scientific Python. I hope you have managed to |
| 340 | +learn something and feel more comfortable with the Matplotlib library now. |
| 341 | +Make sure you share this post so others can read and benefit from it too! Also |
| 342 | +don't forget to follow me on [Twitter](https://twitter.com/jamal_moir) or add |
| 343 | +me on [Google+](https://plus.google.com/101283112845335349608/posts) so you |
| 344 | +don't miss any future posts. |
| 345 | + |