|
11 | 11 | </tr> |
12 | 12 |
|
13 | 13 | <tr> |
14 | | -<td style="border: none;" align="left" width="20%"><a href="https://github.com/BethanyG/ML_Mondays_WWCodePython/tree/e8f9dfebbc6d124f491f21f8147a77c3f06c804d"><img alt="ML Mondays study group."align="left" src="images/ML Mondays_II.png"/> </td> |
| 14 | +<td style="border: none;" align="left" width="20%"><a href="https://github.com/BethanyG/ML_Mondays_WWCodePython"><img alt="ML Mondays study group."align="left" src="images/ML Mondays_II.png"/> </td> |
15 | 15 | <td colspan="3"><b>ML Mondays</b> <p>Join us alternating <b>Mondays</b> for a wholesome & healthy dose of <b>ML 🌟</b>. -- Starting off with a whirlwind review of Python & then diving into foundational libraries. |
16 | 16 | <br><br> |
17 | 17 | We'll also be discussing the ideas behind ML and covering a little ✨math & statistics✨🎉. As we journey further along, we'll collaborate & help one another with projects & other fun 🔥 stuff.</p><em>- By Yashika Sharma</em></td> |
18 | 18 | </tr> |
19 | 19 |
|
20 | 20 | <tr> |
21 | | -<td style="border: none;" align="left" width="20%"><a href="https://colab.research.google.com/drive/1NcSbNMgjMFqEl64qA0fmpRX7IrHuUx3u"><img alt="PySpark Part I."align="left" src="images/Pyspark Talk Part 1.png"/> |
22 | | - <a href="https://colab.research.google.com/drive/1T3bimqE9-OX4gSW4Zfjo3HjI-xPXBdK9"><img alt="PySpark Part II."align="left" src="images/Pyspark Talk Part 2.png"/> </td> |
| 21 | +<td style="border: none;" align="left" width="20%"><a href="https://colab.research.google.com/drive/1NcSbNMgjMFqEl64qA0fmpRX7IrHuUx3u"><img alt="PySpark Part I."align="left" src="images/Pyspark Talk Part 1.png"/> |
| 22 | + <a href="https://colab.research.google.com/drive/1T3bimqE9-OX4gSW4Zfjo3HjI-xPXBdK9"><img alt="PySpark Part II."align="left" src="images/Pyspark Talk Part 2.png"/> |
| 23 | + <a href="https://colab.research.google.com/drive/1T3bimqE9-OX4gSW4Zfjo3HjI-xPXBdK9"><img alt="PySpark Part II." align="left" src="images/Pyspark Talk Part 3.png"/> |
| 24 | + </td> |
23 | 25 |
|
24 | | -<td colspan="3"><br><b>ETL Made Simple with PySpark (parts I & II)</b> <p>Apache Spark is currently one of the most popular systems for large-scale data processing - making it a standard for any developer or data scientist interested in big data. Spark supports multiple widely used programming languages(Scala, Python, R, Java) and a wealth of built-in and third-party libraries.</p> |
| 26 | +<td colspan="3"><br><b>ETL Made Simple with PySpark</b> <p>Apache Spark is currently one of the most popular systems for large-scale data processing - making it a standard for any developer or data scientist interested in big data. Spark supports multiple widely used programming languages(Scala, Python, R, Java) and a wealth of built-in and third-party libraries.</p> |
25 | 27 |
|
26 | 28 | <br> |
27 | 29 |
|
28 | | -<p>In <b>Session I</b> you will be introduced to Apache Spark main concepts & you'll learn how to leverage the DataFrame API to extract data. You will also learn how to connect to different sources, apply schemas when reading data, and handle corrupt records. |
| 30 | +<p>In <b>Session I</b> you will be introduced to Apache Spark main concepts & you'll learn how to leverage the DataFrame API to extract data. You will also learn how to connect to different sources, apply schemas when reading data, and handle corrupt records. |
| 31 | + |
| 32 | + |
| 33 | +<b>PART I:</b> [](https://colab.research.google.com/drive/1NcSbNMgjMFqEl64qA0fmpRX7IrHuUx3u) |
| 34 | + |
29 | 35 |
|
30 | 36 | In <b>session II</b> you will be introduced to some of the most useful transformations - adding new columns, casting column types, renaming columns, etc. You'll also learn how to define User Defined Functions to do your own custom transformations & get a little introduction to executing your own ad hoc SQL! |
31 | 37 |
|
32 | | -<br> |
33 | 38 |
|
34 | | -<b>PART I:</b> [](https://colab.research.google.com/drive/1NcSbNMgjMFqEl64qA0fmpRX7IrHuUx3u) |
| 39 | +<b>PART II:</b> [](https://colab.research.google.com/drive/1T3bimqE9-OX4gSW4Zfjo3HjI-xPXBdK9) |
| 40 | + |
| 41 | + |
| 42 | +In <b>session III</b> you'll analyze the robberies data by doing some aggregations & sorting. You'll learn how to convert Spark DataFrames to Pandas DataFrames. Additionally, you'll explore joins & lookup tables & write final results to CSV files. At the end of this session we'll go over best practices. |
| 43 | + |
35 | 44 |
|
36 | | -<b>PART II:</b> [](https://colab.research.google.com/drive/1T3bimqE9-OX4gSW4Zfjo3HjI-xPXBdK9) |
| 45 | +<b>PART III:</b> [](https://colab.research.google.com/drive/1x3HcVAs9HpUMgCGfbRfdq--6IQII0Fn5) |
37 | 46 |
|
38 | 47 | </p><em>- By Aida Martinez</em> |
39 | 48 |
|
|