Transformations
Not all data needs to land in your destination exactly as it comes from the source. You might need to strip sensitive columns before they reach the database, rename tables to match your naming conventions, or build aggregate views for reporting after the data lands. Transformations cover both cases: changes made before data is written and changes made after it lands.
There are two categories, depending on when they run relative to data being written to your destination:
- Pre-load transformations - transformer integrations that modify data in flight, between the source and destination.
- Post-load transformations - dbt packages that run after data has landed in your destination to create analytics-ready views.
All transformations are available in CloudQuery Hub.
Pre-Load Transformations
Pre-load transformations are transformer integrations that sit between a source and destination in your sync pipeline. They intercept records after they leave the source and modify them before they are written to the destination. This is useful for shaping data on the fly - renaming tables, removing sensitive columns, or filtering out unwanted rows - without touching the source or destination configuration.
Transformer integrations are configured with kind: transformer in your CloudQuery configuration. You reference them in your destination spec under the transformers key, and the CLI chains them automatically during a sync.
For full configuration details, see the Transformer Integrations page.
Available Pre-Load Transformers
| Transformer | Description |
|---|---|
| Basic | Rename tables, add or remove columns, obfuscate sensitive data, normalize casing, drop rows by value, and add timestamps or literal columns. |
| JSON Flattener | Flatten single-level JSON object fields into typed destination columns while preserving the original JSON column. |
Example: Using the Basic Transformer
The following configuration adds the Basic transformer to a PostgreSQL destination. It renames all tables with a cq_sync_ prefix and obfuscates sensitive columns:
    kind: destination
    spec:
      name: "postgresql"
      path: "cloudquery/postgresql"
      registry: "cloudquery"
      version: "v8.0.7"
      write_mode: "overwrite-delete-stale"
      migrate_mode: forced
      transformers:
        - "basic"
      spec:
        connection_string: "postgresql://your.user:your.password@localhost:5432/db_name"
    ---
    kind: transformer
    spec:
      name: "basic"
      path: "cloudquery/basic"
      registry: "cloudquery"
      version: "VERSION_TRANSFORMER_BASIC"
      spec:
        transformations:
          - kind: obfuscate_sensitive_columns
          - kind: change_table_names
            tables: ["*"]
            new_table_name_template: "cq_sync_{{.OldName}}"

Basic Transformer Capabilities
The Basic transformer supports the following transformation kinds:
- remove_columns - Remove columns from specified tables. Supports JSON paths for nested fields.
- add_column - Add a literal string column to specified tables.
- add_current_timestamp_column - Add a column with the timestamp the record was processed.
- add_primary_keys - Add additional primary key columns.
- obfuscate_columns - Replace column values with a redacted placeholder. Supports JSON paths including array syntax (#.field).
- obfuscate_sensitive_columns - Automatically obfuscate all columns the source has marked as sensitive.
- change_table_names - Rename tables using a Go template ({{.OldName}}).
- rename_column - Rename a single column on specified tables.
- uppercase / lowercase - Normalize column values to upper or lower case.
- drop_rows - Drop rows where a column matches a given value.
Note: Transformations are applied sequentially. If you rename tables, subsequent transformations must reference the new table names.
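As a rough illustration of the ordering rule, the fragment below renames tables first and then removes a column. It is a sketch rather than a verified configuration: it assumes that remove_columns accepts tables and columns keys, and it uses aws_ec2_instances and user_data as hypothetical table and column names. Check the Basic transformer reference on the Hub for the exact schema.

```yaml
# Sketch of the `transformations` list inside a Basic transformer spec.
transformations:
  # Step 1: every table gets the cq_sync_ prefix.
  - kind: change_table_names
    tables: ["*"]
    new_table_name_template: "cq_sync_{{.OldName}}"
  # Step 2: runs after the rename, so it refers to the table by its new name.
  # The `columns` key and the table/column names are assumptions for illustration.
  - kind: remove_columns
    tables: ["cq_sync_aws_ec2_instances"]
    columns: ["user_data"]
```

If the two steps were swapped, the remove_columns entry would need to reference the original name (aws_ec2_instances in this hypothetical) instead.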
Post-Load Transformations
Post-load transformations are dbt packages that run after data has been synced to your destination. They create analytics-ready models, views, and dashboards on top of the raw synced data. These are useful for asset inventory reporting, cost analysis, and backup compliance - scenarios where you want to aggregate, join, or reshape data that has already landed.
Post-load transformations are standalone dbt projects. You install them with dbt, configure a profile for your destination database, and run them with dbt run after a CloudQuery sync completes.
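A dbt profile for the PostgreSQL destination used in the earlier example might look roughly like the following profiles.yml. This is a minimal sketch: the profile name aws_asset_inventory is a placeholder and must match whatever the profile key in the package's dbt_project.yml specifies, and the connection values simply mirror the sample connection string above.

```yaml
# profiles.yml (sketch; the profile name is a hypothetical placeholder)
aws_asset_inventory:
  target: dev
  outputs:
    dev:
      type: postgres        # requires the dbt-postgres adapter
      host: localhost
      port: 5432
      user: your.user
      password: your.password
      dbname: db_name       # database that CloudQuery synced into
      schema: public        # assumed schema holding the synced tables
      threads: 4
```

With a profile like this in place, dbt run is executed from the package directory once the CloudQuery sync has finished.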
Available Post-Load Transformations
| Transformation | Description |
|---|---|
| AWS Asset Inventory | Automated line-item listing of all active resources in your AWS environment. Creates an aws_resources view with ARN, account, region, tags, and more. |
| AWS Cost | Analyze AWS expenses across services and accounts. Provides cost breakdowns for budgeting and optimization. |
| AWS Data Resilience (Backup) | Evaluate backup and disaster recovery coverage across your AWS resources. |
| Azure Asset Inventory | Catalog Azure resources across subscriptions with a unified resource view. |
| GCP Asset Inventory | Catalog Google Cloud resources across projects with a unified resource view. |
These transformations can be paired with BI tools like Grafana, Apache Superset, QuickSight, PowerBI, and others to visualize and monitor cloud infrastructure.
Creating Custom Transformers
If the available transformers don’t cover your use case, you can build your own. See Creating a New Integration for details on implementing a custom transformer integration.
Next Steps
- Transformer integration reference - full configuration spec for pre-load transformer integrations
- Dashboards - visualize transformation output with Grafana dashboards and BI tools
- Configuration - set up source, destination, and transformer configuration files
- Browse all transformations on the Hub - find pre-load and post-load transformations
- Creating a new integration - build a custom transformer integration