fix(oracle): Trigger sync invalid#20455
Conversation
Fix trigger metadata fields to ensure UTF-8 compliance.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4e66e49b47
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| Name: triggerName.String, | ||
| Body: constructTriggerBody(description.String, triggerBody.String), | ||
| Name: common.SanitizeUTF8String(triggerName.String), | ||
| Body: constructTriggerBody(common.SanitizeUTF8String(description.String), common.SanitizeUTF8String(triggerBody.String)), |
There was a problem hiding this comment.
Preserve trigger body line breaks when sanitizing
When an Oracle trigger body contains any invalid byte, SanitizeUTF8String takes its slow path and also hex-escapes control characters such as \n and \t (backend/common/util.go). Applying it directly to the SQL body here means multi-line triggers with one bad encoded character get stored with literal \x0a sequences instead of line breaks; in common cases like -- comments this changes the parsed trigger text and makes the generated DDL unusable even though the original problem was only invalid UTF-8. Consider sanitizing only invalid UTF-8 for SQL definitions, or preserving normal whitespace while escaping the bad bytes.
Useful? React with 👍 / 👎.
| Name: triggerName.String, | ||
| Body: constructTriggerBody(description.String, triggerBody.String), | ||
| Name: common.SanitizeUTF8String(triggerName.String), | ||
| Body: constructTriggerBody(common.SanitizeUTF8String(description.String), common.SanitizeUTF8String(triggerBody.String)), |
There was a problem hiding this comment.
One concern is TriggerMetadata.Body. It is SQL/PLSQL text, so using common.SanitizeUTF8String() directly may be too aggressive. When that function hits invalid UTF-8, it can also escape normal whitespace like newlines and tabs into \x0a / \x09, which may corrupt multiline trigger definitions.
I suggest keeping the current sanitization for Name, Timing, and Event, but using a narrower sanitizer for Body: replace invalid UTF-8 bytes while preserving normal SQL whitespace such as \n, \r, and \t.
It would also be good to add a regression test for invalid UTF-8 in trigger body, verifying that marshal succeeds and multiline formatting is preserved.
There was a problem hiding this comment.
View and Routine likely have the same issue.
Fix trigger metadata fields to ensure UTF-8 compliance.