Transformations

Not all data needs to land in your destination exactly as it comes from the source. You might need to strip sensitive columns before they reach the database, rename tables to match your naming conventions, or build aggregate views for reporting after the data lands. Transformations cover both cases: changing data before it is written and reshaping it afterward.

Transformations fall into two categories, depending on whether they run before or after data is written to your destination:

  • Pre-load transformations - transformer integrations that modify data in flight, between the source and destination.
  • Post-load transformations - dbt packages that run after data has landed in your destination to create analytics-ready views.

All transformations are available in CloudQuery Hub.

Pre-Load Transformations

Pre-load transformations are transformer integrations that sit between a source and destination in your sync pipeline. They intercept records after they leave the source and modify them before they are written to the destination. This is useful for shaping data on the fly - renaming tables, removing sensitive columns, or filtering out unwanted rows - without touching the source or destination configuration.

Transformer integrations are configured with kind: transformer in your CloudQuery configuration. You reference them in your destination spec under the transformers key, and the CLI chains them automatically during a sync.

For full configuration details, see the Transformer Integrations page.

Available Pre-Load Transformers

| Transformer | Description |
| --- | --- |
| Basic | Rename tables, add or remove columns, obfuscate sensitive data, normalize casing, drop rows by value, and add timestamps or literal columns. |
| JSON Flattener | Flatten single-level JSON object fields into typed destination columns while preserving the original JSON column. |

Example: Using the Basic Transformer

The following configuration adds the Basic transformer to a PostgreSQL destination. It obfuscates the columns that the source has marked as sensitive, then renames every table by adding a cq_sync_ prefix:

kind: destination
spec:
  name: "postgresql"
  path: "cloudquery/postgresql"
  registry: "cloudquery"
  version: "v8.0.7"
  write_mode: "overwrite-delete-stale"
  migrate_mode: forced
  transformers:
    - "basic"
  spec:
    connection_string: "postgresql://your.user:your.password@localhost:5432/db_name"
---
kind: transformer
spec:
  name: "basic"
  path: "cloudquery/basic"
  registry: "cloudquery"
  version: "VERSION_TRANSFORMER_BASIC"
  spec:
    transformations:
      - kind: obfuscate_sensitive_columns
      - kind: change_table_names
        tables: ["*"]
        new_table_name_template: "cq_sync_{{.OldName}}"

Basic Transformer Capabilities

The Basic transformer supports the following transformation kinds:

  • remove_columns - Remove columns from specified tables. Supports JSON paths for nested fields.
  • add_column - Add a literal string column to specified tables.
  • add_current_timestamp_column - Add a column with the timestamp the record was processed.
  • add_primary_keys - Add additional primary key columns.
  • obfuscate_columns - Replace column values with a redacted placeholder. Supports JSON paths including array syntax (#.field).
  • obfuscate_sensitive_columns - Automatically obfuscate all columns the source has marked as sensitive.
  • change_table_names - Rename tables using a Go template ({{.OldName}}).
  • rename_column - Rename a single column on specified tables.
  • uppercase / lowercase - Normalize column values to upper or lower case.
  • drop_rows - Drop rows where a column matches a given value.

Note: Transformations are applied sequentially. If you rename tables, subsequent transformations must reference the new table names.
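
To illustrate the ordering rule in the note, here is a minimal sketch of a transformer spec in which a rename comes first and a later step targets the renamed table. It relies on several assumptions: the table aws_iam_users and the column password_last_used are illustrative names only, and the columns key on remove_columns is assumed rather than taken from this page. Check the Basic transformer's own documentation for the exact options.

```yaml
kind: transformer
spec:
  name: "basic"
  path: "cloudquery/basic"
  registry: "cloudquery"
  version: "VERSION_TRANSFORMER_BASIC" # placeholder, left unresolved as in the example above
  spec:
    transformations:
      # Step 1: every table gets the cq_sync_ prefix.
      - kind: change_table_names
        tables: ["*"]
        new_table_name_template: "cq_sync_{{.OldName}}"
      # Step 2: runs after the rename, so it must reference the NEW table name.
      # Table and column names are illustrative; the `columns` key is an assumption.
      - kind: remove_columns
        tables: ["cq_sync_aws_iam_users"]
        columns: ["password_last_used"]
```

If the two steps were listed in the opposite order, the remove_columns step would need to reference the original name, aws_iam_users, because the rename would not have happened yet.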

Post-Load Transformations

Post-load transformations are dbt packages that run after data has been synced to your destination. They create analytics-ready models, views, and dashboards on top of the raw synced data. These are useful for asset inventory reporting, cost analysis, and backup compliance - scenarios where you want to aggregate, join, or reshape data that has already landed.

Post-load transformations are standalone dbt projects. You install them with dbt, configure a profile for your destination database, and run them with dbt run after a CloudQuery sync completes.
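
As a rough sketch of the profile step, assuming a PostgreSQL destination like the one in the earlier example on this page, a dbt profiles.yml entry might look like the following. The profile name cloudquery_transformations is hypothetical; it has to match whatever profile name the package declares in its dbt_project.yml. The connection values and the public schema are placeholders.

```yaml
# profiles.yml (sketch; the profile name below is hypothetical)
cloudquery_transformations:
  target: dev
  outputs:
    dev:
      type: postgres          # dbt adapter for the destination database
      host: localhost
      port: 5432
      user: your.user
      password: your.password
      dbname: db_name         # the database CloudQuery synced into
      schema: public          # assumed schema holding the synced tables
      threads: 4
```

With a profile like this in place, dbt run is executed from the package directory once the CloudQuery sync has finished, so that the models are built on freshly synced tables.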

Available Post-Load Transformations

| Transformation | Description |
| --- | --- |
| AWS Asset Inventory | Automated line-item listing of all active resources in your AWS environment. Creates an aws_resources view with ARN, account, region, tags, and more. |
| AWS Cost | Analyze AWS expenses across services and accounts. Provides cost breakdowns for budgeting and optimization. |
| AWS Data Resilience (Backup) | Evaluate backup and disaster recovery coverage across your AWS resources. |
| Azure Asset Inventory | Catalog Azure resources across subscriptions with a unified resource view. |
| GCP Asset Inventory | Catalog Google Cloud resources across projects with a unified resource view. |

These transformations can be paired with BI tools such as Grafana, Apache Superset, QuickSight, and Power BI to visualize and monitor cloud infrastructure.

Creating Custom Transformers

If the available transformers don’t cover your use case, you can build your own. See Creating a New Integration for details on implementing a custom transformer integration.
