Welcome to the Python Data Engineer learning repository! This repo contains a structured, practical set of Jupyter notebooks for learning core Python concepts, with a focus on data engineering. Each topic comes with hands-on examples and explanations, plus links to the code for easy reference.
Note: This summary is based on the top-level files; for a full list of all tutorials and scripts, check the GitHub repository contents.
- Overview: Introduction to Python, variables, data types, and basic operations.
- Key Concepts:
- Printing and string manipulation
- Variable assignment and naming
- Numeric, string, and boolean data types
- Type conversion, built-in functions, and string methods
- List basics and common list operations
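
A minimal sketch of the basics listed above. The variable names and values are illustrative assumptions, not taken from the notebooks.

```python
# Variable assignment and the basic data types
pipeline_name = "daily_sales"   # str
row_count = 1250                # int
error_rate = 0.02               # float
is_active = True                # bool

# Type conversion: values read from text files usually arrive as strings
raw_value = "42"
parsed_value = int(raw_value) + 1

# String methods and f-string formatting
label = pipeline_name.replace("_", " ").title()
summary = f"{label}: {row_count} rows, active={is_active}"
print(summary)

# List basics: create, append, measure
stages = ["extract", "transform"]
stages.append("load")
print(stages, len(stages))
```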
- Overview: Mastering conditional statements for decision making.
- Key Concepts:
- `if`, `elif`, `else` statements
- Comparison and logical operators
- Nested conditions and practical examples
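
The concepts above can be sketched with a small decision function. `classify_batch` and its thresholds are hypothetical, chosen only to show `if`/`elif`/`else`, comparison and logical operators, and one nested condition.

```python
def classify_batch(row_count, error_count):
    """Return a status label for a hypothetical batch of records."""
    if row_count == 0:
        return "empty"
    elif error_count == 0:
        return "clean"
    elif error_count / row_count < 0.05 and row_count >= 100:
        # Comparison operators combined with a logical `and`
        return "acceptable"
    else:
        # Nested condition inside the final branch
        if error_count == row_count:
            return "failed"
        return "needs review"


print(classify_batch(200, 5))
```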
3. Python Loops
- Overview: Using loops to automate repetitive tasks.
- Key Concepts:
- `for` and `while` loops
- Loop control (`break`, `continue`, `pass`)
- Looping through lists, strings, and dictionaries
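
A short sketch of these loop patterns, using made-up records. The negative amount acting as a stop marker is an assumption for the example.

```python
records = [
    {"id": 1, "amount": 20},
    {"id": 2, "amount": None},   # missing value: skipped with `continue`
    {"id": 3, "amount": 35},
    {"id": 4, "amount": -1},     # assumed stop marker: ends the loop with `break`
    {"id": 5, "amount": 50},
]

# for loop over a list, with loop control
total = 0
for record in records:
    amount = record["amount"]
    if amount is None:
        continue
    if amount < 0:
        break
    total += amount

# for loop over a dictionary's key/value pairs
for key, value in records[0].items():
    print(key, value)

# for loop over a string
vowel_count = 0
for char in "pipeline":
    if char in "aeiou":
        vowel_count += 1
    else:
        pass  # placeholder: nothing to do for consonants

# while loop with a counter
retries = 0
while retries < 3:
    retries += 1

print(total, vowel_count, retries)
```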
- Overview: Writing reusable blocks of code with functions.
- Key Concepts:
- Defining and calling functions
- Parameters, return values, and scope
- Lambda functions and higher-order functions
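
As a rough illustration of these ideas, the sketch below defines a function with a default parameter, passes it to a higher-order function, and uses lambdas with `map` and `sorted`. `clean_name` and `apply_to_all` are hypothetical helpers, not functions from the notebooks.

```python
def clean_name(name, default="unknown"):
    """Strip whitespace and lowercase a name; fall back to a default if empty."""
    cleaned = name.strip().lower()   # `cleaned` is local to this function
    return cleaned if cleaned else default


def apply_to_all(func, values):
    """Higher-order function: takes another function as an argument."""
    return [func(value) for value in values]


names = apply_to_all(clean_name, ["  Alice ", "BOB", "   "])

# Lambda functions used with built-in higher-order functions
doubled = list(map(lambda x: x * 2, [1, 2, 3]))
longest_first = sorted(names, key=lambda name: len(name), reverse=True)

print(names, doubled, longest_first)
```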
- Overview: Using operators to manipulate data.
- Key Concepts:
- Arithmetic, assignment, comparison, logical, bitwise, and membership operators
- Precedence and associativity
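
The operator families above can be tried out with a few one-liners. The values are arbitrary and only there to make the results easy to check.

```python
a, b = 7, 2

# Arithmetic
true_division = a / b      # 3.5
floor_division = a // b    # 3
remainder = a % b          # 1
power = a ** b             # 49

# Assignment
counter = 10
counter += 5               # 15

# Comparison and logical
in_range = 0 < a <= 10 and b != 0   # True

# Bitwise
bit_and = 6 & 3            # 2
bit_or = 6 | 3             # 7
shifted = 1 << 3           # 8

# Membership
is_supported = "csv" in ["csv", "json"]   # True

# Precedence: multiplication binds tighter than addition
without_parens = 2 + 3 * 4     # 14
with_parens = (2 + 3) * 4      # 20

# Associativity: ** groups right to left, so this is 2 ** (3 ** 2)
right_assoc = 2 ** 3 ** 2      # 512
```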
- Overview: Mastering data structures for efficient storage and retrieval.
- Key Concepts:
- Lists, tuples, sets, dictionaries
- When and how to use each collection
- Real-world data engineering examples using collections
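
One way to see when each collection fits is a small deduplicate-and-count task. The data below is invented for illustration and is not one of the repository's examples.

```python
# List of tuples: ordered rows, each row an immutable (user_id, country) pair
rows = [("u1", "DE"), ("u2", "FR"), ("u1", "DE"), ("u3", "DE")]

# Set: fast membership checks and automatic deduplication
unique_users = {user_id for user_id, _ in rows}

# Dictionary: key-based lookup, here counting distinct users per country
users_per_country = {}
seen = set()
for user_id, country in rows:
    if (user_id, country) in seen:
        continue  # skip duplicate rows
    seen.add((user_id, country))
    users_per_country[country] = users_per_country.get(country, 0) + 1

print(sorted(unique_users), users_per_country)
```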
- Overview: Organizing and reusing code with modules and packages.
- Key Concepts:
- The difference between modules, packages, and libraries (with LEGO analogies)
- Importing and using built-in and external libraries (e.g., Pandas, NumPy, Matplotlib, Requests, Scikit-learn)
- Creating custom modules and packages
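
A minimal sketch of creating and importing a custom module, using only the standard library so it runs without installing anything. The module name `cleaning_utils` and its `normalize` function are hypothetical; in a real project the file would live next to your scripts rather than in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source code for a tiny custom module (hypothetical example)
module_source = '''
def normalize(text):
    """Trim whitespace and lowercase a string."""
    return text.strip().lower()
'''

with tempfile.TemporaryDirectory() as tmp_dir:
    # A module is just a .py file that Python can find on sys.path
    Path(tmp_dir, "cleaning_utils.py").write_text(module_source, encoding="utf-8")
    sys.path.insert(0, tmp_dir)
    try:
        importlib.invalidate_caches()
        cleaning_utils = importlib.import_module("cleaning_utils")
        result = cleaning_utils.normalize("  Hello World ")
    finally:
        sys.path.remove(tmp_dir)

print(result)
```

External libraries such as Pandas or Requests are imported the same way once they are installed in your environment.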
- backup, blocks, database, files, json, logging, os, random, streamlit
These folders contain additional scripts on file handling, JSON and CSV data handling, the random and faker modules, the os module, and Streamlit, along with utility scripts and project-specific examples relevant to more advanced data engineering.
- Explore these folders directly: Repository Contents
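
To give a flavor of the file, JSON, CSV, and os topics mentioned above, here is a small round-trip sketch. The file names and records are assumptions for the example and do not refer to files in the repository.

```python
import csv
import json
import os
import tempfile

records = [{"id": 1, "city": "Berlin"}, {"id": 2, "city": "Paris"}]

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = os.path.join(tmp_dir, "records.json")
    csv_path = os.path.join(tmp_dir, "records.csv")

    # Write the records as JSON, then read them back
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(records, f)
    with open(json_path, encoding="utf-8") as f:
        loaded = json.load(f)

    # Write the same records as CSV with a header row
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "city"])
        writer.writeheader()
        writer.writerows(loaded)

    # Read the CSV back; note that every value comes back as a string
    with open(csv_path, newline="", encoding="utf-8") as f:
        csv_rows = list(csv.DictReader(f))

    created_files = sorted(os.listdir(tmp_dir))

print(loaded, csv_rows, created_files)
```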
- Browse Notebooks: Start with the Jupyter notebooks in the main directory for a structured learning path.
- Explore Directories: Check out the additional folders for more scripts and data.
- Try the Code: Run the notebooks locally or in an online Jupyter environment.
- Contribute: Pull requests to add new topics or improve examples are welcome!