
Hi, I'm André Salvati 👋

Senior Data Engineer focused on Databricks, Spark, Delta Lake, and Lakehouse Architecture

I'm a Senior Data Engineer from Brazil with 20+ years of experience in software development, data engineering, analytics platforms, and cloud architectures.

My main focus today is helping companies build, modernize, and scale data platforms using Databricks, Apache Spark, Delta Lake, Unity Catalog, and production-grade software engineering practices.

I work especially well in projects involving:

  • Databricks Lakehouse architecture
  • Spark / PySpark data pipelines
  • Delta Lake optimization
  • Medallion Architecture
  • Unity Catalog governance
  • Databricks Asset Bundles
  • Workflows and job orchestration
  • CI/CD for data engineering
  • Cloud data platforms on AWS, Azure, and GCP
  • Data warehouse and Hadoop/EMR modernization

📫 Connect with me on LinkedIn


🚀 What I Do

I design and build robust, scalable, and cost-efficient data platforms.

Databricks & Lakehouse Engineering

  • Build Bronze, Silver, and Gold data layers
  • Design Medallion Architecture pipelines
  • Develop PySpark and SQL transformations
  • Implement Delta Lake best practices
  • Structure Databricks projects for real teams
  • Automate deployments with Databricks Asset Bundles
  • Configure Jobs, Workflows, environments, and parameters
  • Support Unity Catalog governance and access control

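The Asset Bundles deployments mentioned above are driven by a `databricks.yml` file at the project root. A minimal sketch of what such a bundle can look like (the bundle name, workspace host, and notebook path are placeholders, not taken from a real project):

```yaml
bundle:
  name: lakehouse_pipeline            # placeholder project name

targets:
  dev:
    mode: development
    default: true
    workspace:
      host: https://example.cloud.databricks.com   # placeholder workspace URL

resources:
  jobs:
    bronze_to_silver:
      name: bronze_to_silver
      tasks:
        - task_key: transform
          notebook_task:
            notebook_path: ./notebooks/bronze_to_silver.py   # placeholder path
```

With a file like this in place, `databricks bundle validate` checks the configuration and `databricks bundle deploy -t dev` deploys the job to the selected target.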
Data Engineering

  • ETL and ELT pipeline development
  • Batch and incremental processing
  • Data modeling for analytical workloads
  • Data quality and validation
  • Performance tuning and cost optimization
  • Semi-structured data processing, especially JSON
  • Cloud-native data lake and lakehouse design
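A recurring task behind the JSON bullet above is flattening nested records before loading them into tabular layers. A minimal sketch in plain Python (the field names and record are illustrative only):

```python
def flatten(record: dict, parent_key: str = "", sep: str = ".") -> dict:
    """Recursively flatten a nested JSON-like dict into dot-separated keys."""
    items = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the prefix along.
            items.update(flatten(value, new_key, sep=sep))
        else:
            items[new_key] = value
    return items

# Illustrative event record
event = {"user": {"id": 42, "address": {"city": "Sao Paulo"}}, "type": "click"}
print(flatten(event))
# → {'user.id': 42, 'user.address.city': 'Sao Paulo', 'type': 'click'}
```

The same idea scales to Spark by applying an equivalent schema-driven flattening over nested struct columns.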

Software Engineering for Data Teams

  • Clean Python project structure
  • Reusable packages and modules
  • Unit and integration tests
  • Pull request workflows
  • CI/CD with GitHub Actions and Azure DevOps
  • Infrastructure and deployment automation
  • Developer-friendly documentation
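One practice behind the testing bullets above is keeping transformation logic in plain functions so it can be unit-tested without a Spark cluster. A small sketch (the function and the Brazilian number format it parses are illustrative, not from a specific project):

```python
def normalize_amount(raw: str) -> float:
    """Parse an amount in Brazilian format ('1.234,56') into a float."""
    # '.' is the thousands separator and ',' the decimal mark.
    return float(raw.replace(".", "").replace(",", "."))

def test_normalize_amount():
    assert normalize_amount("1.234,56") == 1234.56
    assert normalize_amount("0,99") == 0.99

test_normalize_amount()
print("ok")
```

Functions like this live in an installable package, are covered by pytest in CI, and are only then wired into PySpark jobs (e.g. via a UDF or applied before `createDataFrame`).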

🛠️ Main Tech Stack

Databricks · Apache Spark · Delta Lake · Python · PySpark · SQL · AWS · Azure · GCP · Terraform · GitHub Actions


📌 Core Skills

Databricks          Apache Spark        PySpark
Delta Lake          Unity Catalog       Asset Bundles
Workflows           Medallion Arch.     Lakehouse
Data Lakes          Data Warehousing    Data Modeling
Python              SQL                 Terraform
AWS                 Azure               GCP
CI/CD               Git                 Automated Tests
