Skip to content

Conversation

@jpefaur
Copy link
Contributor

@jpefaur jpefaur commented Nov 3, 2025

What

Legacy Acceptance Tests were not working, so this PR contains a few different changes to make these tests green. Changes are either:

  • Disable tests that don't make sense anymore
  • Change the cdk testing framework to adjust for new behavior
  • Add missing features
    ...etc

There's a more thorough description of each of the issues addressed in this PR in this doc

@github-actions
Copy link
Contributor

github-actions bot commented Nov 3, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Helpful Resources

PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • /format-fix - Fixes most formatting issues.
  • /bump-version - Bumps connector versions.
    • You can specify a custom changelog by passing changelog. Example: /bump-version changelog="My cool update"
    • Leaving the changelog arg blank will auto-populate the changelog from the PR title.
  • /run-cat-tests - Runs legacy CAT tests (Connector Acceptance Tests)
  • /build-connector-images - Builds and publishes a pre-release docker image for the modified connector(s).
  • JVM connectors:
    • /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
      Example: /update-connector-cdk-version connector=destination-bigquery
    • /bump-bulk-cdk-version type=patch changelog='foo' - Bump the Bulk CDK's version. type can be major/minor/patch.
  • Python connectors:
    • /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
    • /poe source example lock - Alias for /poe connector source-example lock.
    • /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
    • /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.

📝 Edit this welcome message.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 3, 2025

destination-postgres Connector Test Results

0 tests  ±0   0 ✅ ±0   0s ⏱️ ±0s
0 suites ±0   0 💤 ±0 
0 files   ±0   0 ❌ ±0 

Results for commit 3a89ba0. ± Comparison against base commit bb34e41.

♻️ This comment has been updated with latest results.

constructor(timestamp: String) : this(OffsetDateTime.parse(timestamp))
override fun compareTo(other: TimestampWithTimezoneValue): Int = value.compareTo(other.value)
@JsonValue fun toJson() = value.toString()
@JsonValue fun toJson() = value.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise the string we output doesn't have seconds. Older versions of the connector do contain seconds.

return false
}

protected open fun disableRawTableComparison(): Boolean {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the default scenario is that there is a final table and there isn't a raw table.

@ValueSource(longs = [0L, 42L])
@Throws(Exception::class)
fun incrementalDedup(inputGenerationId: Long) {
open fun incrementalDedup(inputGenerationId: Long) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so that I can disable it

String finalTableName = getStreamNamespace() + "." + Names.toAlphanumericAndUnderscore(getStreamName());
getDatabase().execute("CREATE VIEW " + getStreamNamespace() + ".v1 AS SELECT * FROM " + rawTableName);
if (!disableRawTableComparison()) {
getDatabase().execute("CREATE VIEW " + getRawSchema() + ".v1 AS SELECT * FROM " + rawTableName);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that I changed the schema. I think this is the right thing to do since before this was only working because the final table schema existed. Now we can't assume that since there might not exist a final table.

@Disabled @ParameterizedTest @ValueSource(longs = []) override fun testIncrementalSyncDropOneColumn(inputGenerationId: Long) {}

// this test assumes that the fully qualified raw table name is lowercased.
// This was only a restriction in older versions of the connector.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really know why we had this restriction though

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes me nervous - is this saying that where previously we did create table "airbyte_internal"."blah_raw__stream_the_table", we're now doing airbyte_internal."blah_raw__stream_The_Table"? (assuming the stream name is The_Table)

wouldn't this require users to rewrite their SQL queries to target the mixed-case table?

(these stats are about a year old, but it looks like around 1/3 of cloud destination-postgres users have T+D disabled today, so that's pretty significant impact)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you read that right. This is in the current docs:

Airbyte Postgres destination will create raw tables and schemas using the Unquoted identifiers by replacing any special characters with an underscore. All final tables and their corresponding columns are created using Quoted identifiers preserving the case sensitivity. Special characters in final tables are replaced with underscores.

So in the current state of the new connector this would be changed to something like

For both final and raw tables, and their corresponding columns are created using Quoted identifiers preserving the case sensitivity.

So you are right, this will cause issues to existing queries. It sounds like we want to avoid this, I'll change it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I might need some help here. I thought this commit was going to make it so that both raw tables name and namespace were always lowercased, but the change didn't really do anything. It seems that still the raw table name and namespaces are created using the PostgresFinalTableNameGenerator. Any thoughts on what I could be missing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, I think this destination is doing the same thing as snowflake (where the "final table name generator" is also used for the raw tables) - you probably should just edit this thing

TypingDedupingUtil.concatenateRawTableName(
streamDescriptor.namespace ?: config.schema,
streamDescriptor.name
)
.toPostgresCompatibleName()

(also - the condition in the name generator is wrong. Compare snowflake's implementation - it should be if (!config.legacyRawTablesOnly), not if (config.internalTableSchema.isNullOrBlank()))

// This was only a restriction in older versions of the connector.
@Disabled @Test override fun testMixedCasedSchema() {}

// disabling dedup tests since dedup not supported when setting `disable_type_dedupe` to true.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that even though running dedup when disable_type_dedupe is set to true doesn’t make a ton of sense, it is a config that wouldn’t error out in older versions of the connector, but this will error out on the new version of the connector.


// older versions of the connector used to error out in this scenario. This is not
// true for newer versions of the connector
@Disabled @Test override fun interruptedTruncateWithPriorData() {}
Copy link
Contributor Author

@jpefaur jpefaur Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

truth be told, I don't 100% understand what this test is doing, I just know it used to fail and doesn't fail anymore. I'm assuming this is the behavior we want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 this is covered by new CDK tests

(specifically: the old CDK threw an error if we reached the end of stdin without seeing a "stream status" message; the new CDK only throws an error if we get an explicit INCOMPLETE stream status. In practice, platform always sends an explicit complete/incomplete status, so there's no behavior change at runtime)

@@ -1,4 +1,4 @@
{"_airbyte_extracted_at": "1970-01-01T00:00:01.000000Z", "_airbyte_data": {"id1": 1, "id2": 200, "old_cursor": 0, "_ab_cdc_deleted_at": null, "name" :"Alice", "address": {"city": "San Francisco", "state": "CA"}}, "_airbyte_meta": {"changes":[],"sync_id":42}, "_airbyte_generation_id": 43}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null values are not present on the raw table anymore. This is built into the bulk cdk

private val postgresColumnUtils: PostgresColumnUtils,
private val postgresConfiguration: PostgresConfiguration) {

private val dropTableSuffix: String = if (postgresConfiguration.dropCascade == true) "CASCADE" else ""
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dropCascade is part of the connector config. I think we just forgot to add support for this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch! We should probably eventually port the relevant test out of the legacy test suite, but not a blocker for this PR

(having the option to drop cascade is something that users really want, so we definitely should preserve it)

jsonObject.replace(key, value.toJson())
}
val jsonData = jsonObject.serializeToString()
val jsonData = Jsons.writeValueAsString(filteredRecord)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jpefaur jpefaur changed the title Jose/fix legacy acceptance tests v2 fix legacy acceptance tests Nov 3, 2025
@jpefaur jpefaur changed the title fix legacy acceptance tests test: fix legacy acceptance tests Nov 3, 2025
@jpefaur jpefaur marked this pull request as ready for review November 3, 2025 22:40
@jpefaur jpefaur requested a review from a team as a code owner November 3, 2025 22:40
Copy link
Contributor

@jdpgrailsdev jdpgrailsdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit: LGTM, but maybe @edgao should take a look as he has more context.

@jpefaur jpefaur requested a review from edgao November 4, 2025 18:07
@jpefaur jpefaur force-pushed the jose/fix-legacy-acceptance-tests-v2 branch from 06be7bc to 7be35f5 Compare November 4, 2025 21:21
Copy link
Contributor

@edgao edgao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one question on the naming thing - sounds kind of similar to the snowflake uppercasing fire drill, but maybe I'm misunderstanding. Otherwise lgtm!

@Disabled @ParameterizedTest @ValueSource(longs = []) override fun testIncrementalSyncDropOneColumn(inputGenerationId: Long) {}

// this test assumes that the fully qualified raw table name is lowercased.
// This was only a restriction in older versions of the connector.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this makes me nervous - is this saying that where previously we did create table "airbyte_internal"."blah_raw__stream_the_table", we're now doing airbyte_internal."blah_raw__stream_The_Table"? (assuming the stream name is The_Table)

wouldn't this require users to rewrite their SQL queries to target the mixed-case table?

(these stats are about a year old, but it looks like around 1/3 of cloud destination-postgres users have T+D disabled today, so that's pretty significant impact)


// older versions of the connector used to error out in this scenario. This is not
// true for newer versions of the connector
@Disabled @Test override fun interruptedTruncateWithPriorData() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 this is covered by new CDK tests

(specifically: the old CDK threw an error if we reached the end of stdin without seeing a "stream status" message; the new CDK only throws an error if we get an explicit INCOMPLETE stream status. In practice, platform always sends an explicit complete/incomplete status, so there's no behavior change at runtime)

private val postgresColumnUtils: PostgresColumnUtils,
private val postgresConfiguration: PostgresConfiguration) {

private val dropTableSuffix: String = if (postgresConfiguration.dropCascade == true) "CASCADE" else ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch! We should probably eventually port the relevant test out of the legacy test suite, but not a blocker for this PR

(having the option to drop cascade is something that users really want, so we definitely should preserve it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants