feat: add support for Co-authored-by commit trailers with username correlation #373
Co-authored-by: zkoppert <6935431+zkoppert@users.noreply.github.com>
@copilot Instead of having ACKNOWLEDGE_COAUTHORS default to False, let's change it to default to True. I think users will want that to work out of the box. Make sure to update the code and the documentation as well.
Co-authored-by: zkoppert <6935431+zkoppert@users.noreply.github.com>
Changed default to True in commit 9798816. Updated:
@copilot Instead of only allowing specific email domains for co-authors, please change the code and documentation so that any email domain is recognized as a co-author. For example, john@example.com is currently not counted as a co-author, but I'd like that changed so it is included.
Co-authored-by: zkoppert <6935431+zkoppert@users.noreply.github.com>
Changed in commit 211543e. Now any email domain is recognized as a co-author:
Testing locally today
Notes from testing:
@copilot Can you figure out how to make sure that the email specified for a co-author is correlated with its GitHub username and added to that user's stats in the table?
Implemented in commit ee109e5. The solution:
This addresses all three issues from testing:
- Add support for extracting usernames from @github.com email addresses
- Integrate GitHub Search Users API to resolve arbitrary emails to usernames
- Add fallback behavior when API calls fail or return no results
- Thread github_connection parameter through co-author parsing functions
- Remove unused existing_usernames parameter from get_coauthor_contributors
- Update all function signatures and test cases with new parameters
- Improve co-author identification from noreply, GitHub, and external emails

This significantly improves co-author recognition by automatically resolving multiple email formats to GitHub usernames, providing more accurate contributor attribution and better coverage of collaborative contributions.

Signed-off-by: Zack Koppert <zkoppert@github.com>
ee109e5 to 1fabe84
Signed-off-by: Zack Koppert <zkoppert@github.com>
Pull request overview
This pull request adds support for acknowledging co-authors from commit messages as contributors via a new ACKNOWLEDGE_COAUTHORS configuration option (default: True). The implementation parses Co-authored-by: trailers from commit messages and includes those users in the contributor list, with intelligent handling for GitHub noreply emails, @github.com emails, and a fallback to the GitHub Search Users API for other email addresses.
Key Changes
- Added `ACKNOWLEDGE_COAUTHORS` environment variable with default value of `True`
- Implemented `get_coauthors_from_message()` to parse co-author trailers and extract usernames from various email formats
- Added `get_coauthor_contributors()` to aggregate co-authors across commits and create contributor stats
- Updated all relevant functions to pass the new configuration option through the call chain
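The parsing step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the pull request: the function name `parse_coauthor_identifiers`, the regular expression, and the `lookup_username` hook (a stand-in for the Search Users API fallback) are all hypothetical.

```python
import re

# Matches trailers such as "Co-authored-by: Jane Doe <jane@example.com>".
# The pattern is an assumption for illustration, not the PR's actual regex.
COAUTHOR_RE = re.compile(
    r"^[ \t]*Co-authored-by:[ \t]*(?P<name>[^<\n]*?)[ \t]*<(?P<email>[^>\n]+)>[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_coauthor_identifiers(message, lookup_username=None):
    """Return one identifier (username or email) per co-author trailer."""
    identifiers = []
    for match in COAUTHOR_RE.finditer(message or ""):
        email = match.group("email").strip()
        local_part = email.split("@")[0]
        if email.endswith("@users.noreply.github.com"):
            # Noreply addresses look like "12345+login@..." or "login@...".
            identifiers.append(local_part.split("+")[-1])
        elif email.endswith("@github.com"):
            # Treat the part before "@" as the username.
            identifiers.append(local_part)
        else:
            # Hypothetical hook standing in for the Search Users API lookup.
            username = lookup_username(email) if lookup_username else None
            identifiers.append(username or email)
    return identifiers
```

Passing the lookup as a callable keeps the parsing logic testable without a live GitHub connection; when no lookup is supplied, external addresses simply fall back to the email itself.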
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| `.env-example` | Added ACKNOWLEDGE_COAUTHORS configuration option with default value |
| `env.py` | Added environment variable reading for ACKNOWLEDGE_COAUTHORS with default True and updated return tuple |
| `contributors.py` | Implemented core co-author extraction and integration logic including regex parsing, email-to-username conversion, and GitHub API lookup fallback |
| `test_env.py` | Updated tests to handle the new acknowledge_coauthors parameter in return tuples |
| `test_contributors.py` | Added comprehensive test suite for co-author functions covering various email formats, edge cases, and integration scenarios |
| `README.md` | Added documentation for the new ACKNOWLEDGE_COAUTHORS option and co-author behavior including performance implications |
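For readers unfamiliar with default-on flags, reading a boolean such as `ACKNOWLEDGE_COAUTHORS` from the environment with a default of `True` might look like the sketch below. The helper name `get_bool_env` and its exact parsing rules are assumptions for illustration; the actual `env.py` may differ.

```python
import os


def get_bool_env(name, default):
    """Read a boolean flag from the environment, falling back to `default`.

    Hypothetical helper: an unset or empty variable yields the default,
    and any value other than "true" (case-insensitive) is treated as False.
    """
    raw = os.environ.get(name, "").strip().lower()
    if raw == "":
        return default
    return raw == "true"


# With the variable unset, the feature is on out of the box.
acknowledge_coauthors = get_bool_env("ACKNOWLEDGE_COAUTHORS", True)
```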
    all_repo_contributors = repo.contributors()
    contributors = []
    # Track usernames already added as contributors
    contributor_usernames = set()

The contributor_usernames set is created and populated at line 280 but never actually used. If the intention was to prevent duplicate contributors when a user is both a regular contributor and a co-author in the same repository, the set should be used to filter co-authors. Otherwise, this variable should be removed as it serves no purpose.
        ghe,
        github_connection,
    )
    contributors.extend(coauthor_contributors)

When acknowledge_coauthors is True and a user is both a regular contributor and a co-author on some commits in the same repository, they will appear twice in the results list. While the merge_contributors function will later merge these duplicates, this creates unnecessary processing. Consider checking if a username already exists in contributor_usernames before adding co-authors to avoid creating duplicates within a single repository.
Suggested change:

    - contributors.extend(coauthor_contributors)
    + # Avoid adding duplicate contributors for the same username within this repository
    + filtered_coauthors = []
    + for coauthor in coauthor_contributors:
    +     username = getattr(coauthor, "username", None) or getattr(
    +         coauthor, "login", None
    +     )
    +     if username and username not in contributor_usernames:
    +         filtered_coauthors.append(coauthor)
    +         contributor_usernames.add(username)
    + contributors.extend(filtered_coauthors)

    # For other emails, try to find GitHub username using Search Users API
    if github_connection:
        try:
            # Search for users by email
            search_result = github_connection.search_users(f"email:{email}")
            if search_result.totalCount > 0:
                # Use the first matching user's login
                identifiers.append(search_result[0].login)
            else:
                # If no user found, fall back to email address
                identifiers.append(email)
        except Exception:
            # If API call fails, fall back to email address
            identifiers.append(email)

The GitHub Users Search API call at line 204 may experience rate limiting when processing repositories with many co-authors using non-GitHub email addresses. Each unique email requires an API call, which could significantly impact performance and potentially exhaust API rate limits. Consider adding rate limit handling or caching of email-to-username mappings to mitigate this issue.
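One way to act on the caching suggestion is to memoize email-to-username lookups so each distinct address triggers at most one successful search. The sketch below is an assumption-laden illustration: `make_cached_lookup` and `search_fn` are hypothetical names, and `search_fn` stands in for whatever function performs the actual Search Users API call.

```python
def make_cached_lookup(search_fn):
    """Wrap `search_fn(email) -> username or None` with an in-memory cache."""
    cache = {}

    def lookup(email):
        key = email.lower()
        if key in cache:
            return cache[key]
        try:
            username = search_fn(email)
        except Exception:
            # Do not cache failures so a transient error can be retried later.
            return None
        # Cache both hits and "no match" results to avoid repeat API calls.
        cache[key] = username
        return username

    return lookup
```

Caching "no match" results matters as much as caching hits, since external addresses that never resolve would otherwise cost one API call per commit they appear in. Explicit rate-limit handling would still need to be layered on top.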
    elif email.endswith("@github.com"):
        # For @github.com emails, extract the username (part before @)
        username = email.split("@")[0]
        identifiers.append(username)

Test coverage is missing for the @github.com email handling path. Add a test case that verifies co-authors with @github.com email addresses are correctly parsed to extract the username before the @ symbol.
    # For other emails, try to find GitHub username using Search Users API
    if github_connection:
        try:
            # Search for users by email
            search_result = github_connection.search_users(f"email:{email}")
            if search_result.totalCount > 0:
                # Use the first matching user's login
                identifiers.append(search_result[0].login)
            else:
                # If no user found, fall back to email address
                identifiers.append(email)
        except Exception:
            # If API call fails, fall back to email address
            identifiers.append(email)

Test coverage is missing for the GitHub Search Users API fallback path (lines 200-213). Add test cases that verify: 1) successful username lookup via the search API when a matching user is found, 2) fallback to email address when no user is found (totalCount = 0), and 3) fallback to email address when the API call raises an exception.
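The three requested cases could be covered along the lines of the sketch below. It tests a simplified stand-in, `resolve_email`, that mirrors the lookup logic in the snippet above rather than the real `get_coauthors_from_message`; `resolve_email`, `FakeResults`, and the test names are hypothetical, and the connection is a `unittest.mock.MagicMock`.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def resolve_email(email, github_connection):
    """Stand-in mirroring the lookup logic in the snippet under review."""
    try:
        result = github_connection.search_users(f"email:{email}")
        if result.totalCount > 0:
            return result[0].login
    except Exception:
        pass
    return email


class FakeResults(list):
    """Minimal fake of a search result: indexable, with a totalCount."""

    @property
    def totalCount(self):
        return len(self)


def test_lookup_found():
    conn = MagicMock()
    conn.search_users.return_value = FakeResults([SimpleNamespace(login="johndoe")])
    assert resolve_email("john@example.com", conn) == "johndoe"
    conn.search_users.assert_called_once_with("email:john@example.com")


def test_lookup_no_match():
    conn = MagicMock()
    conn.search_users.return_value = FakeResults([])
    assert resolve_email("john@example.com", conn) == "john@example.com"


def test_lookup_api_error():
    conn = MagicMock()
    conn.search_users.side_effect = RuntimeError("rate limited")
    assert resolve_email("john@example.com", conn) == "john@example.com"
```

In the real test suite the same mocks would be passed as `github_connection` to the function under test, with a commit message containing a `Co-authored-by:` trailer as input.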
| `ACKNOWLEDGE_COAUTHORS` | False | True | If you want to include co-authors from commit messages as contributors. Co-authors are identified via the `Co-authored-by:` trailer in commit messages. The action will extract GitHub usernames from GitHub noreply emails (e.g., `username@users.noreply.github.com`) or use the full email address for other email domains. This will impact action performance as it requires scanning all commits. ie. ACKNOWLEDGE_COAUTHORS = "True" or ACKNOWLEDGE_COAUTHORS = "False" |

**Note**: If `start_date` and `end_date` are specified then the action will determine if the contributor is new. A new contributor is one that has contributed in the date range specified but not before the start date.

**Performance Note:** Using start and end dates will reduce speed of the action by approximately 63X. ie without dates if the action takes 1.7 seconds, it will take 1 minute and 47 seconds.

**Co-authors Note:** When `ACKNOWLEDGE_COAUTHORS` is enabled, the action will scan commit messages for `Co-authored-by:` trailers and include those users as contributors. For GitHub noreply email addresses (e.g., `username@users.noreply.github.com`), the username will be extracted. For other email addresses (e.g., `john@example.com`), the full email address will be used as the contributor identifier. See [GitHub's documentation on creating commits with multiple authors](https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors).

The documentation is incomplete. It mentions that GitHub noreply emails extract the username and other emails use the full email address, but it doesn't document that @github.com email addresses also extract the username (part before @), or that the action attempts to use the GitHub Search Users API to find usernames for other email addresses before falling back to the email address.
Suggested change:

    - | `ACKNOWLEDGE_COAUTHORS` | False | True | If you want to include co-authors from commit messages as contributors. Co-authors are identified via the `Co-authored-by:` trailer in commit messages. The action will extract GitHub usernames from GitHub noreply emails (e.g., `username@users.noreply.github.com`) or use the full email address for other email domains. This will impact action performance as it requires scanning all commits. ie. ACKNOWLEDGE_COAUTHORS = "True" or ACKNOWLEDGE_COAUTHORS = "False" |
    - **Note**: If `start_date` and `end_date` are specified then the action will determine if the contributor is new. A new contributor is one that has contributed in the date range specified but not before the start date.
    - **Performance Note:** Using start and end dates will reduce speed of the action by approximately 63X. ie without dates if the action takes 1.7 seconds, it will take 1 minute and 47 seconds.
    - **Co-authors Note:** When `ACKNOWLEDGE_COAUTHORS` is enabled, the action will scan commit messages for `Co-authored-by:` trailers and include those users as contributors. For GitHub noreply email addresses (e.g., `username@users.noreply.github.com`), the username will be extracted. For other email addresses (e.g., `john@example.com`), the full email address will be used as the contributor identifier. See [GitHub's documentation on creating commits with multiple authors](https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors).
    + | `ACKNOWLEDGE_COAUTHORS` | False | True | If you want to include co-authors from commit messages as contributors. Co-authors are identified via the `Co-authored-by:` trailer in commit messages. The action will extract GitHub usernames from GitHub noreply emails (e.g., `username@users.noreply.github.com`) and from `@github.com` email addresses (using the part before `@`), and for other email domains it will first attempt to resolve the email to a GitHub username via the GitHub Search Users API before falling back to using the full email address. This will impact action performance as it requires scanning all commits. ie. ACKNOWLEDGE_COAUTHORS = "True" or ACKNOWLEDGE_COAUTHORS = "False" |
    + **Note**: If `start_date` and `end_date` are specified then the action will determine if the contributor is new. A new contributor is one that has contributed in the date range specified but not before the start date.
    + **Performance Note:** Using start and end dates will reduce speed of the action by approximately 63X. ie without dates if the action takes 1.7 seconds, it will take 1 minute and 47 seconds.
    + **Co-authors Note:** When `ACKNOWLEDGE_COAUTHORS` is enabled, the action will scan commit messages for `Co-authored-by:` trailers and include those users as contributors. For GitHub noreply email addresses (e.g., `username@users.noreply.github.com`), the username will be extracted. For `@github.com` email addresses (e.g., `username@github.com`), the part before `@` will be treated as the GitHub username. For other email addresses (e.g., `john@example.com`), the action will first attempt to resolve the email to a GitHub username using the GitHub Search Users API and, if no matching user is found, will fall back to using the full email address as the contributor identifier. See [GitHub's documentation on creating commits with multiple authors](https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors).

    # Extract co-authors from commit message
    coauthors = get_coauthors_from_message(commit_message, github_connection)
    for username in coauthors:

Bot accounts listed as co-authors via Co-authored-by: trailers will be included in the contributor list, while bot accounts that are regular contributors are filtered out (line 250). Consider applying the same bot filtering logic to co-authors for consistency by checking if "[bot]" is in the username before adding them to coauthor_counts.
Suggested change:

    - for username in coauthors:
    + for username in coauthors:
    +     # Skip bot accounts for consistency with regular contributor filtering
    +     if "[bot]" in username.lower():
    +         continue

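To show where the bot check would sit in the aggregation, here is a small self-contained sketch. It assumes co-author identifiers have already been extracted per commit; the function name `count_coauthors` and its input shape are hypothetical, and only the `"[bot]"` substring check comes from the suggestion above.

```python
from collections import Counter


def count_coauthors(coauthors_per_commit):
    """Count co-author appearances across commits, skipping bot accounts.

    `coauthors_per_commit` is assumed to be an iterable of lists of
    identifiers, one list per commit.
    """
    counts = Counter()
    for coauthors in coauthors_per_commit:
        for username in coauthors:
            # Skip bot accounts, matching the regular contributor filtering.
            if "[bot]" in username.lower():
                continue
            counts[username] += 1
    return counts
```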
Pull Request
This pull request introduces support for acknowledging co-authors from commit messages as contributors, controlled by a new `ACKNOWLEDGE_COAUTHORS` configuration option. It updates the main contributor-gathering logic, environment variable handling, documentation, and tests to support this feature. The changes ensure that users listed as co-authors via `Co-authored-by:` trailers in commit messages are included in the contributor list, with special handling for converting email addresses (GitHub noreply and other email formats) into usernames.

Detailed breakdown of changes:

Feature: Acknowledge Co-authors as Contributors
- Added `ACKNOWLEDGE_COAUTHORS` (default `True`) to `.env-example` and included documentation in `README.md` describing its purpose and usage.
- Updated `env.py` to read the `ACKNOWLEDGE_COAUTHORS` environment variable and pass it through the codebase.

Implementation: Co-author Extraction and Integration
- Added `get_coauthors_from_message` to parse commit messages for `Co-authored-by:` trailers, extracting GitHub usernames or email addresses as contributor identifiers.
- Added `get_coauthor_contributors` to aggregate co-authors from commit messages and create `ContributorStats` objects for them. Integrated this logic into the main contributor-gathering workflow via `get_contributors` and `get_all_contributors`.

Testing: Unit Test Updates
- Updated `test_contributors.py` to cover the new co-author acknowledgment logic and ensure correct integration with existing contributor gathering.

Readiness Checklist

Author/Contributor
- Run `make lint` and fix any issues that you have introduced
- Run `make test` and ensure you have test coverage for the lines you are introducing
- `@jeffrey-luszcz`

Reviewer
- `bug`, `documentation`, `enhancement`, `infrastructure`, `maintenance` or `breaking`

Original prompt
`Co-authored-by:` for this](https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors). #372