Harden data-update-partners workflow#270
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughUpdated a GitHub Actions workflow and added a new CLI script. Workflow changes: tightened shell options ( Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.github/workflows/data-update-partners-data.yml (1)
119-122: Consider whether maintainer fetch failure should be handled gracefully.With
set -euo pipefail, ifcurl --failfails here, the script exits immediately without reaching the validation at lines 163-168. If a temporary API issue occurs, the workflow fails hard rather than potentially using cached data or providing a clear error.If the intent is hard failure on maintainer fetch issues (reasonable), this is fine. If graceful handling is preferred, wrap in a similar pattern as suggested for the partner fetch.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/data-update-partners-data.yml around lines 119 - 122, The maintainer fetch using the curl + jq pipeline (the Contacts search call) will cause an immediate hard exit under set -euo pipefail if curl --fail fails; either decide to keep hard failure or make it graceful like the partner fetch: run the curl command in a guarded block that captures stdout/stderr (or return code) from the Contacts search call, check its exit status before piping to jq, and on failure log a clear warning and fall back to cached maintainer data (or skip/continue) instead of letting the script exit; update the block that invokes curl and jq for Contacts (the maintainer fetch) to mirror the partner-fetch error handling pattern so failures are handled deterministically.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/data-update-partners-data.yml:
- Around line 57-64: The script uses set -euo pipefail so a failing curl exits
the shell before the subsequent "if [ $? -ne 0 ]" can run; replace the current
pattern where you capture curl into response with a conditional capture (e.g.,
run curl as part of an if/else or use "if ! response=$(curl ...); then ...; fi")
so failures are handled, logging the warning and continuing rather than letting
set -e terminate the job; update the block that contains "response=$(curl
--silent --fail --request GET ...)" and the following "if [ $? -ne 0 ]" to this
conditional-style error handling and keep the existing echo "::warning::Failed
to fetch ${partner_type} partners" and continue behavior.
---
Nitpick comments:
In @.github/workflows/data-update-partners-data.yml:
- Around line 119-122: The maintainer fetch using the curl + jq pipeline (the
Contacts search call) will cause an immediate hard exit under set -euo pipefail
if curl --fail fails; either decide to keep hard failure or make it graceful
like the partner fetch: run the curl command in a guarded block that captures
stdout/stderr (or return code) from the Contacts search call, check its exit
status before piping to jq, and on failure log a clear warning and fall back to
cached maintainer data (or skip/continue) instead of letting the script exit;
update the block that invokes curl and jq for Contacts (the maintainer fetch) to
mirror the partner-fetch error handling pattern so failures are handled
deterministically.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2e706e75-420a-4e9c-865e-13b5dfa7ae3f
⛔ Files ignored due to path filters (1)
board-images/nanopct6ltsplus.pngis excluded by!**/*.png
📒 Files selected for processing (1)
.github/workflows/data-update-partners-data.yml
cb386a5 to
2d449b3
Compare
Add error handling, input validation, and safety checks to prevent empty data from overwriting good data. Add Last_Name field support for maintainers, falling back to Last_Name when First_Name is empty.
Use conditional capture (if ! response=$(curl ...)) instead of $? check, which was unreachable under set -e.
2d449b3 to
cb8765f
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
.github/workflows/data-update-partners-data.yml (1)
127-133:⚠️ Potential issue | 🟠 MajorMake avatar lookup genuinely best-effort.
This block has the same strict-mode trap as the old partner fetch. If the
curl | jqassignment returns non-zero, Bash exits the shell before Lines 130-133 can blank the avatar, so one transient GitHub API failure aborts the whole maintainer sync instead of warning and continuing. (gnu.org)Proposed fix
- avatar_url=$(curl -s -H "Authorization: token $GH_TOKEN" "https://api.github.com/users/$username" | jq -r '.avatar_url') - - # Fall back to empty string if avatar fetch failed - if [ "$avatar_url" = "null" ] || [ -z "$avatar_url" ]; then + if avatar_url=$(curl --silent --show-error --fail \ + -H "Authorization: token $GH_TOKEN" \ + "https://api.github.com/users/$username" \ + | jq -er '.avatar_url // empty'); then + : + else echo "::warning::Could not fetch avatar for $username" avatar_url="" fi🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/data-update-partners-data.yml around lines 127 - 133, The avatar lookup can cause the script to exit under set -e if the curl|jq pipeline fails; change the assignment to make the lookup genuinely best-effort by swallowing non-zero exit from the pipeline—for example update the assignment for avatar_url to run the pipeline with a trailing "|| true" inside the command substitution (e.g., avatar_url=$(curl -s -H "Authorization: token $GH_TOKEN" "https://api.github.com/users/$username" | jq -r '.avatar_url' || true)) and keep the existing null/empty check and fallback that sets avatar_url="" and emits the warning; reference variable avatar_url and the curl|jq pipeline in your change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In @.github/workflows/data-update-partners-data.yml:
- Around line 127-133: The avatar lookup can cause the script to exit under set
-e if the curl|jq pipeline fails; change the assignment to make the lookup
genuinely best-effort by swallowing non-zero exit from the pipeline—for example
update the assignment for avatar_url to run the pipeline with a trailing "||
true" inside the command substitution (e.g., avatar_url=$(curl -s -H
"Authorization: token $GH_TOKEN" "https://api.github.com/users/$username" | jq
-r '.avatar_url' || true)) and keep the existing null/empty check and fallback
that sets avatar_url="" and emits the warning; reference variable avatar_url and
the curl|jq pipeline in your change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ae65d4d2-8c1d-4fe1-be78-c7ec67a0cb68
📒 Files selected for processing (1)
.github/workflows/data-update-partners-data.yml
Call days_since_last_commit.py per maintainer to enrich the JSON with a Days_since_last_commit field. Add --days-only flag to the script for machine-readable output.
Use GitHub GraphQL API to fetch only pinned repositories instead of all org repos, reducing API calls and avoiding errors from large or problematic repositories.
Add exponential backoff retry (up to 3 attempts) for 500+ errors when fetching commit dates, silently returning None if all retries are exhausted.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
scripts/days_since_last_commit.py (2)
88-96: Token passed via CLI argument is visible in process listings.Passing the token as a positional argument exposes it in
psoutput. In GitHub Actions this is generally acceptable since the environment is ephemeral, but for local/reusable scenarios, reading from an environment variable (e.g.,GITHUB_TOKEN) would be more secure.♻️ Optional: support environment variable fallback
+import os + def main(): days_only = "--days-only" in sys.argv args = [a for a in sys.argv[1:] if a != "--days-only"] - if len(args) != 3: - print("Usage: script.py [--days-only] ORG USER TOKEN") + if len(args) < 2 or len(args) > 3: + print("Usage: script.py [--days-only] ORG USER [TOKEN]") + print(" TOKEN defaults to $GITHUB_TOKEN if not provided") sys.exit(1) - org, user, token = args[0], args[1], args[2] + org, user = args[0], args[1] + token = args[2] if len(args) == 3 else os.environ.get("GITHUB_TOKEN", "") + if not token: + print("Error: TOKEN required (argument or $GITHUB_TOKEN)") + sys.exit(1) delta_days, latest_repo = days_since_last_commit(org, user, token)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/days_since_last_commit.py` around lines 88 - 96, The CLI currently reads the token as a positional argument in main(), exposing it in process listings; change parsing so the GitHub token is read from an environment variable (e.g., GITHUB_TOKEN) instead of a positional arg: update main() to accept only ORG and USER as required positional args (or make TOKEN optional) and fetch token = os.environ.get("GITHUB_TOKEN") when not provided on the CLI, and if token is still missing print the usage indicating to set GITHUB_TOKEN and exit; update the usage message and any references to the token variable to reflect the env fallback to avoid passing secrets via process args.
78-79: Narrow the exception type for clearer error handling.Catching a bare
Exceptionsuppresses unexpected errors (e.g., programming bugs). Since you're only expecting HTTP or JSON issues here, narrow the scope.♻️ Proposed refinement
- except Exception as e: + except (requests.RequestException, KeyError, ValueError) as e: print(f"[WARN] {name}: {e}", file=sys.stderr)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/days_since_last_commit.py` around lines 78 - 79, Replace the broad "except Exception as e" in the block that prints "[WARN] {name}: {e}" with a narrowed exception tuple that targets expected failures (e.g., requests.exceptions.RequestException and json.JSONDecodeError) so only HTTP/JSON errors are caught; update the except clause to "except (requests.exceptions.RequestException, json.JSONDecodeError) as e" (and add the necessary imports if missing) and leave other unexpected exceptions to propagate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/data-update-partners-data.yml:
- Around line 120-124: The maintainers fetch curl command that calls
"https://www.zohoapis.eu/bigin/v2/Contacts/search?..." using Authorization:
Zoho-oauthtoken $ACCESS_TOKEN currently runs under set -euo pipefail and will
cause an immediate hard failure on network/HTTP errors; either make its behavior
match the partners fetch by capturing the curl/jq exit status and emit a warning
+ continue (e.g., check curl exit code, log a warning and skip processing the
maintainers loop) or intentionally document the hard-fail design with an inline
comment above the curl explaining why maintainers must fail fast; update the
block that pipes to jq and the while-read loop (the "maintainers fetch"
pipeline) to implement the chosen behavior.
In `@scripts/days_since_last_commit.py`:
- Around line 44-47: The code unconditionally indexes into the GraphQL response
(data["data"]["organization"]["pinnedItems"]["nodes"]) which can raise KeyError
if the API returns errors or the organization is missing; after data = r.json()
(in the function handling this request), validate the response by checking for
an "errors" key and that data.get("data") and data["data"].get("organization")
and the nested "pinnedItems" and "nodes" keys exist and are not None, and if not
raise or log a clear exception/message that includes the raw response (or error
details) instead of letting a KeyError propagate; update the nodes extraction
and the return path to handle an empty/missing nodes list gracefully.
---
Nitpick comments:
In `@scripts/days_since_last_commit.py`:
- Around line 88-96: The CLI currently reads the token as a positional argument
in main(), exposing it in process listings; change parsing so the GitHub token
is read from an environment variable (e.g., GITHUB_TOKEN) instead of a
positional arg: update main() to accept only ORG and USER as required positional
args (or make TOKEN optional) and fetch token = os.environ.get("GITHUB_TOKEN")
when not provided on the CLI, and if token is still missing print the usage
indicating to set GITHUB_TOKEN and exit; update the usage message and any
references to the token variable to reflect the env fallback to avoid passing
secrets via process args.
- Around line 78-79: Replace the broad "except Exception as e" in the block that
prints "[WARN] {name}: {e}" with a narrowed exception tuple that targets
expected failures (e.g., requests.exceptions.RequestException and
json.JSONDecodeError) so only HTTP/JSON errors are caught; update the except
clause to "except (requests.exceptions.RequestException, json.JSONDecodeError)
as e" (and add the necessary imports if missing) and leave other unexpected
exceptions to propagate.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: edbba13b-95cf-45f8-a04b-09fe278af8c0
📒 Files selected for processing (2)
.github/workflows/data-update-partners-data.ymlscripts/days_since_last_commit.py
Replace per-repo iteration with a single search API call per
maintainer (author:{user} org:{org}), drastically reducing API
usage and eliminating server error warnings.
Fix Accept header, add retry with Retry-After for 403 responses, handle 422 validation errors, and return None instead of crashing when all retries are exhausted.
Display First_Name and Last_Name alongside username in processing logs. Skip entries with null/empty GitHub field with a warning showing the maintainer's name.
Increase retry attempts to 5 with longer backoff (10s, 20s, 40s...), and add 2s delay between maintainer lookups to avoid triggering GitHub search API secondary rate limits.
Search across commits, opened issues/PRs, and comments to find the most recent activity date per maintainer. Rename JSON field to Days_since_last_activity. Add 1s delay between search API calls.
The commenter search used issue updated_at which reflects when anyone last touched the issue, not when the user commented, giving false recent activity. Only track commits and opened issues/PRs which have reliable dates.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/data-update-partners-data.yml:
- Around line 147-158: The workflow currently treats days_since == "-1" (from
armbian.github.io/scripts/days_since_last_commit.py) as "no activity" but that
value also signals lookup failures; update the helper
(days_since_last_commit.py) to return a distinct error signal (e.g., exit
non-zero or print a different sentinel like "ERROR") on failure, and then change
the workflow logic that reads the days_since variable to check for that error
signal first (and fail or skip the update) before mapping "-1" to "no activity";
ensure the jq enrichment step (. + {Avatar: $avatar, Days_since_last_activity:
($days | tonumber)}) only receives numeric values or is skipped when an error
sentinel is present.
- Around line 56-62: The workflow currently treats curl failures in the for loop
over partner_type (Platinum Gold Silver) as warnings and continues, but the
subsequent `length > 0` check can still pass and cause a partial `partners.json`
to be published; change the logic in the loop that sets `response` (where the
curl call is made for each partner_type) to fail the job on any curl error
(e.g., exit non-zero or set a flag and break) so that a transient fetch error
prevents publishing, or alternatively ensure `response` is explicitly
unset/empty on failure so the later `length > 0` checks correctly detect missing
data; apply the same fix to the duplicate block at lines handling the other
partner type fetch (the second curl loop referenced in the comment).
- Around line 138-145: The avatar lookup can cause the whole step to fail under
set -euo pipefail; make the lookup best-effort by running the curl|jq pipeline
in a way that its failures don't trigger exit (e.g., append || true or capture
output via a subshell that returns success) and then validate avatar_url as
before; update the use of avatar_url (the variable produced from curl -H
"Authorization: token $GH_TOKEN" ... | jq -r '.avatar_url') so any curl/jq error
yields an empty string rather than a non-zero exit status, preserving the
existing null/empty checks and warning log.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 723acd88-76d2-462d-97e8-eae7971ef57b
📒 Files selected for processing (2)
.github/workflows/data-update-partners-data.ymlscripts/days_since_last_commit.py
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/days_since_last_commit.py
Return ERROR sentinel and exit 1 on API failures (rate limit exhaustion, HTTP errors, network errors). Reserve -1 for genuine no-activity. Workflow catches ERROR and falls back to -1 with a warning instead of crashing.
Summary
databranchLast_Namefield support for maintainers — falls back toLast_NamewhenFirst_Nameis empty--failto curl callsfetch-depthto 1-bordersuffix from partner logo URLsTest plan