Skip to content

Allow to sanitize spec file content before parsing with RPM#519

Merged
centosinfra-prod-github-app[bot] merged 2 commits into
mainfrom
sanitization
Mar 10, 2026
Merged

Allow to sanitize spec file content before parsing with RPM#519
centosinfra-prod-github-app[bot] merged 2 commits into
mainfrom
sanitization

Conversation

@nforro
Copy link
Copy Markdown
Member

@nforro nforro commented Mar 8, 2026

No description provided.

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant security enhancement by implementing a sanitization mechanism for RPM spec file content. It aims to neutralize potentially dangerous shell expansions and impure Lua macros before the spec file is parsed by RPM, thereby safeguarding against arbitrary code execution and other vulnerabilities. The changes provide a configurable option within the Specfile class to enable this sanitization, ensuring that spec files can be processed more securely without altering their intended functional outcomes where safe conversions are possible.

Highlights

  • Spec File Sanitization: Introduced a new Sanitizer class that can convert various shell expansions and potentially unsafe Lua macros found in spec file content into safer RPM macros or Lua expressions. This helps prevent malicious code execution during spec file parsing.
  • Integration with Specfile Class: The Specfile class constructor now includes a sanitize parameter. When set to True, it automatically applies the sanitization logic to the spec file content before parsing, ensuring a safer parsing environment.
  • Comprehensive Shell Expansion Conversion: The Sanitizer supports converting a wide range of shell patterns, including substring extraction, bash string replacement, case conversion, suffix/prefix removal, pipes to tr, sed, awk, cut, herestrings, bash array field extraction, date formatting, arithmetic, basename/dirname operations, string comparisons, and printf truncations/float formatting.
  • Removal of Unsafe Constructs: The sanitization process also identifies and replaces explicitly unsafe Lua macros (e.g., those interacting with io, os, rpm internals, or load functions) and %include/%load directives with %{nil} to mitigate security risks.
  • New Test Infrastructure: Added a new test spec file (spec_unsafe/test.spec) containing various shell expansions and unsafe constructs, along with dedicated unit and integration tests to validate the correct functioning and effectiveness of the sanitization feature.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • _typos.toml
    • Added 'sover' to the list of allowed words to prevent false positives in spell checking.
  • specfile/sanitizer.py
    • Added a new module sanitizer.py containing the Sanitizer class.
    • Implemented sanitize_shell_expansion to convert various shell patterns into safe RPM macros or Lua expressions.
    • Implemented sanitize method to process spec file content, converting safe shell expansions and removing unsafe Lua macros and include/load directives.
  • specfile/specfile.py
    • Imported the new Sanitizer class.
    • Added a sanitize boolean parameter to the Specfile constructor.
    • Modified the internal _parse method to conditionally apply sanitization based on the sanitize flag.
    • Updated __repr__ to include the new sanitize parameter.
    • Ensured tainted, rpm_spec, expand, and get_active_macros properties/methods utilize the new _parse method for consistent sanitization application.
  • tests/constants.py
    • Added SPEC_UNSAFE constant pointing to the new test data directory for unsafe spec files.
  • tests/data/spec_unsafe/test.spec
    • Added a new test spec file containing various shell expansions and potentially unsafe constructs to test the sanitization logic.
  • tests/integration/conftest.py
    • Added a pytest fixture spec_unsafe to provide the path to the new unsafe spec file for integration tests.
  • tests/integration/test_specfile.py
    • Added test_sanitize to verify that the Specfile class correctly sanitizes content when the sanitize option is enabled, comparing expanded values with and without sanitization.
  • tests/unit/test_sanitizer.py
    • Added a new unit test file test_sanitizer.py.
    • Implemented extensive parameterized tests for sanitize_shell_expansion covering substring extraction, bash replacement, case conversion, suffix/prefix removal, pipes to cut, tr, sed, awk, herestrings, variable indirection, bash array fields, date formatting, arithmetic, basename/dirname, string comparisons, printf truncation, printf float formatting, and simple echo concatenations.
    • Added tests to confirm that unconvertible shell expansions return %{nil}.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request aims to introduce a robust sanitization feature for spec files by adding a new Sanitizer class to handle potentially unsafe shell expansions and Lua macros. However, the Sanitizer class itself contains critical security vulnerabilities, including a Lua injection flaw that could lead to Remote Code Execution (RCE), a high-severity bypass of the unsafe Lua macro check, and a medium-severity macro injection vulnerability. These issues must be addressed to ensure the intended security benefits. Additionally, there are suggestions to improve the maintainability and clarity of the new sanitizer.py file.

Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py
@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 8, 2026

/gemini review

@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a valuable security feature by adding a Sanitizer class for RPM spec file content. While the intention is to remove potentially unsafe constructs, the current implementation has critical flaws that allow for bypasses, potentially leading to Remote Code Execution (RCE). Specifically, unsafe Lua macros and directives like %include can be smuggled through conditional macro expansions or nested shell expansions, and the Lua blacklist is incomplete, bypassable with functions like loadstring. Beyond these critical security concerns, the review also identified recurring issues in regular expressions, suggested improvements for the main sanitizer function's structure for better maintainability, and noted a correction needed for a unit test assertion. Addressing the security bypasses through recursive sanitization and strengthened macro type checks is highly recommended.

Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py
Comment thread tests/unit/test_sanitizer.py
@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 8, 2026

/gemini review

@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The PR introduces a sanitization feature for spec files, aiming to replace unsafe shell expansions and Lua macros with safer equivalents, thereby improving security when parsing untrusted spec files. However, the sanitizer implementation has critical security flaws, including bypasses and Remote Code Execution (RCE) due to improper handling of nested shell expansions and an incomplete/bypassable Lua macro denylist. Additionally, there are potential Python version compatibility issues in sanitizer.py and an incorrect unit test. These critical security flaws must be addressed before this feature can be safely used.

Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py Outdated
Comment thread tests/unit/test_sanitizer.py
@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 9, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a valuable sanitization feature to enhance security when parsing RPM spec files by converting potentially unsafe shell expansions and Lua macros into safer equivalents. However, the Sanitizer class itself contains critical Lua injection vulnerabilities in the convert_glob_removal function due to insufficient escaping of user-supplied patterns, and the is_lua_safe check is bypassable due to incomplete regular expressions. These security flaws significantly undermine the intended benefits. Furthermore, there are critical issues with the handling of chained sed commands, which generates invalid nested Lua expressions, and significant bugs in the translation of sed regular expression patterns and replacement strings to their Lua counterparts.

Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py
@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Member

@lbarcziova lbarcziova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the overall design looks good to me - clean API and solid test coverage. I don't have more to add on the technicalities than what Gemini already flagged.

Comment thread specfile/sanitizer.py Outdated
@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 9, 2026

/gemini review

@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new sanitization mechanism to prevent unsafe shell expansions and Lua macros in spec files. However, the implementation of the Lua safety check is fundamentally flawed and can be easily bypassed using standard Lua syntax. Furthermore, the sanitizer fails to recursively process macros within shell expansions, allowing for macro injection that can lead to arbitrary code execution, posing a critical Remote Code Execution (RCE) risk. Additionally, there are suggestions for improving the implementation, including a bug fix for sed substitution handling and performance with chained commands, refactoring the long sanitize_shell_expansion method, and minor style improvements for regular expression definitions.

Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py
@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 9, 2026

/gemini review

@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a powerful and important feature for sanitizing spec file content by replacing potentially unsafe shell expansions and Lua macros. However, the implementation of the Sanitizer class contains critical security vulnerabilities that allow for Command Injection and Remote Code Execution (RCE). Specifically, sanitize_shell_expansion fails to validate several input components before embedding them into generated Lua code, leading to recursive macro expansion vulnerabilities, and is_lua_safe can be bypassed. Additionally, there is a suggestion for improvement in specfile/sanitizer.py to extend the support for the cut command's field selection syntax.

Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new feature to sanitize spec files, converting potentially unsafe constructs like shell expansions and impure Lua macros into safer equivalents. However, a critical vulnerability exists in the Lua sandbox implementation within is_lua_safe. This can be bypassed using string concatenation and aliasing, potentially leading to Remote Code Execution (RCE) when processing malicious spec files. Specifically, dangerous RPM macros can be constructed at runtime and expanded by the RPM engine. Apart from this issue, the implementation appears robust and well-executed.

Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py
@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant security enhancement by adding a sanitization feature for spec file content, utilizing a new Sanitizer class to convert potentially unsafe shell expansions and Lua macros. While the security review could not provide specific vulnerability feedback due to malformed input, the code review identified a minor bug in a regular expression and suggested refactoring for a large function to improve maintainability. The implementation is well-supported by a thorough test suite.

Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py
@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 9, 2026

/gemini review

@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a robust sanitizer for RPM spec files, aiming to prevent arbitrary code execution through shell expansions and unsafe Lua macros by converting common shell command patterns into safer RPM macros or Lua expressions. However, a critical Command Injection vulnerability has been identified in the Lua safety validation logic. The sanitizer strips Lua comments before identifying string literals, which allows an attacker to hide malicious code that will still be executed by the Lua interpreter, leading to arbitrary code execution. Additionally, there are limitations in the regex-based parsing of shell commands (sed, awk, tr) that could be improved, and one check in the Lua safety validation might be overly aggressive. It is recommended to address the Command Injection vulnerability by using a proper Lua lexer or ensuring string literals are handled before comment stripping.

Comment thread specfile/sanitizer.py Outdated
Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py
Comment thread specfile/sanitizer.py
nforro and others added 2 commits March 9, 2026 22:11
Assisted-by: Claude Opus 4.6 via Cursor
Signed-off-by: Nikola Forró <nforro@redhat.com>
Co-authored-by: Laura Barcziová <49026743+lbarcziova@users.noreply.github.com>
Signed-off-by: Nikola Forró <nforro@redhat.com>
@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 9, 2026

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a sanitizer to process spec files and remove potentially unsafe constructs like shell expansions and impure Lua macros. This is a significant security enhancement. The implementation is comprehensive, covering a wide range of shell patterns and Lua safety checks. The accompanying tests are thorough and cover many edge cases and bypass attempts. I have a couple of suggestions to improve the maintainability of the new sanitizer module.

Comment thread specfile/sanitizer.py
Comment on lines +54 to +58
_RE_BASH_ARRAY_FIELD = re.compile(
r"^(\w+)\s*=\s*[\"']?(%\{?\w+\}?)[\"']?\s*;\s*"
r"(\w+)=\(\$\{\1//(.)/ \}\)\s*;\s*"
r"echo\s+\$\{\3\[(\d+)\]\}\s*$"
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The regular expressions in this file are quite complex. To improve readability and maintainability, especially for security-critical code like this, consider using the re.VERBOSE flag. This would allow you to add comments to explain the different parts of the regex. For example, this regex could be rewritten as:

_RE_BASH_ARRAY_FIELD = re.compile(
    r"""
    ^
    (\w+)\s*=\s*["']?(%\{?\w+\}?)["']?\s*;\s*  # var1=MACRO;
    (\w+)=\(\$\{\1//(.)/ \}\)\s*;\s*      # var2=(${var1//./ })
    echo\s+\$\{\3\[(\d+)\]\}\s*                  # echo ${var2[idx]}
    $
    """,
    re.VERBOSE,
)

This would make it much easier to understand what the regex is matching at a glance. This can be applied to other complex regexes in this file as well.

Comment thread specfile/sanitizer.py

class Sanitizer:
@staticmethod
def sanitize_shell_expansion(body: str) -> str:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The sanitize_shell_expansion method is very long (around 400 lines) and contains many nested helper functions. To improve readability and maintainability, consider refactoring it. The nested helper functions like strip_quotes, normalize_macro, sed_pattern_to_lua, parse_sed_substs, parse_cut, etc., could be moved out of sanitize_shell_expansion and become top-level private functions within the module (e.g., _parse_cut). This would make sanitize_shell_expansion shorter and easier to follow, as it would primarily be a dispatcher that calls these helper functions based on which regex matches.

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 9, 2026

/packit test

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 9, 2026

/packit-stg test

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 9, 2026

/packit retest-failed

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 9, 2026

/packit-stg retest-failed

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 9, 2026

/packit retest-failed

@nforro nforro added the mergeit Merge via Zuul label Mar 9, 2026
@nforro nforro moved this from New to In review in Packit pull requests Mar 9, 2026
@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 10, 2026

/packit-stg test

@nforro
Copy link
Copy Markdown
Member Author

nforro commented Mar 10, 2026

regate

@centosinfra-prod-github-app
Copy link
Copy Markdown
Contributor

@centosinfra-prod-github-app centosinfra-prod-github-app Bot merged commit 8c3d7fe into main Mar 10, 2026
50 of 51 checks passed
@github-project-automation github-project-automation Bot moved this from In review to Done in Packit pull requests Mar 10, 2026
@nforro nforro deleted the sanitization branch March 10, 2026 06:36
nforro added a commit to packit/packit that referenced this pull request Mar 11, 2026
centosinfra-prod-github-app Bot added a commit to packit/packit-service that referenced this pull request Mar 11, 2026
Sanitize spec files

Merge after packit/specfile#519.

Reviewed-by: Laura Barcziová
Reviewed-by: gemini-code-assist[bot]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mergeit Merge via Zuul

Projects

Development

Successfully merging this pull request may close these issues.

3 participants