Merged
2 changes: 1 addition & 1 deletion scanners/amass/README.md
@@ -3,7 +3,7 @@ title: "Amass"
category: "scanner"
type: "Network"
state: "released"
appVersion: "v3.14"
appVersion: "v3.14.1"
usecase: "Subdomain Enumeration Scanner"
---

6 changes: 4 additions & 2 deletions scanners/git-repo-scanner/.helm-docs.gotmpl
@@ -46,21 +46,22 @@ or
For type GitHub you can use the following options:
- `--organization`: The name of the GitHub organization you want to scan.
- `--url`: The url of the api for a GitHub enterprise server. Skip this option for repos on <https://github.com>.
- `--access-token`: Your personal GitHub access token.
- `--access-token`: Your personal GitHub access token (needs full `repo` rights if you also want to find private repositories; otherwise `repo:status` and `public_repo` are sufficient).
- `--ignore-repos`: A list of GitHub repository ids you want to ignore
- `--obey-rate-limit`: True to obey the rate limit of the GitHub server (default), otherwise False
- `--activity-since-duration`: Return git repo findings with repo activity (e.g. commits) more recent than a specific date expressed by a duration (now + duration). A duration string is a possibly signed sequence of decimal numbers, each
with optional fraction and a unit suffix, such as '1h' or '2h45m'. Valid time units are 'm', 'h', 'd', 'w'.
- `--activity-until-duration`: Return git repo findings with repo activity (e.g. commits) older than a specific date expressed by a duration (now + duration). A duration string is a possibly signed sequence of decimal numbers, each with
optional fraction and a unit suffix, such as '1h' or '2h45m'. Valid time units are 'm', 'h', 'd', 'w'.
- `--annotate-latest-commit-id`: Set to True to annotate the results with the SHA1 of the latest commit on the main branch. Causes an extra API hit per repository. False by default.

For now only organizations are supported, so the `--organization` option is mandatory. We **strongly recommend** providing an access token
for authentication. Without one, rate limiting kicks in after about 30 scanned repositories.
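
The duration format described for `--activity-since-duration` and `--activity-until-duration` can be sketched as a small parser. This is a minimal illustration, not the scanner's actual implementation: the name `parse_duration`, the regular expression, and the choice to apply a leading sign to the whole string are assumptions, and the scanner may rely on a library instead.

```python
import re
from datetime import timedelta

# Hypothetical helper; the unit table mirrors the units listed above.
_UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)([mhdw])")


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '1h', '2h45m' or '-1.5d' into a timedelta."""
    sign = 1
    body = text.strip()
    # Assumption: an optional leading sign applies to the whole duration.
    if body.startswith(("+", "-")):
        sign = -1 if body[0] == "-" else 1
        body = body[1:]

    total = timedelta()
    pos = 0
    for match in _TOKEN.finditer(body):
        # Reject anything that sits between two number/unit pairs.
        if match.start() != pos:
            raise ValueError(f"invalid duration: {text!r}")
        total += timedelta(**{_UNITS[match.group(2)]: float(match.group(1))})
        pos = match.end()

    # Reject empty input and trailing characters.
    if pos == 0 or pos != len(body):
        raise ValueError(f"invalid duration: {text!r}")
    return sign * total
```

With this sketch, `parse_duration("2h45m")` yields `timedelta(hours=2, minutes=45)`, and malformed input such as `"abc"` raises `ValueError`.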

#### GitLab
For type GitLab you can use the following options:
- `--url`: The url of the GitLab server.
- `--access-token`: Your personal GitLab access token.
- `--access-token`: Your personal GitLab access token (needs at least `read_api` and `read_repository` scopes).
- `--group`: A specific GitLab group id you want to scan, including subgroups.
- `--ignore-groups`: A list of GitLab group ids you want to ignore
- `--ignore-repos`: A list of GitLab project ids you want to ignore
@@ -69,6 +70,7 @@ For type GitLab you can use the following options:
with optional fraction and a unit suffix, such as '1h' or '2h45m'. Valid time units are 'm', 'h', 'd', 'w'.
- `--activity-until-duration`: Return git repo findings with repo activity (e.g. commits) older than a specific date expressed by a duration (now + duration). A duration string is a possibly signed sequence of decimal numbers, each with
optional fraction and a unit suffix, such as '1h' or '2h45m'. Valid time units are 'm', 'h', 'd', 'w'.
- `--annotate-latest-commit-id`: Set to True to annotate the results with the SHA1 of the latest commit on the main branch. Causes an extra API hit per repository. False by default.


For GitLab, the URL and the access token are mandatory. If you don't provide a specific group id, all projects
2 changes: 1 addition & 1 deletion scanners/git-repo-scanner/Chart.yaml
@@ -9,7 +9,7 @@ description: A Helm chart for the git-repo-scanner that integrates with the secu
type: application
# version - gets automatically set to the secureCodeBox release version when the helm charts gets published
version: v3.1.0-alpha1
appVersion: "1.0"
appVersion: "1.1"
kubeVersion: ">=v1.11.0-0"

keywords:
8 changes: 5 additions & 3 deletions scanners/git-repo-scanner/README.md
@@ -3,7 +3,7 @@ title: "Git Repo Scanner"
category: "scanner"
type: "Repository"
state: "released"
appVersion: "1.0"
appVersion: "1.1"
usecase: "Discover Git repositories"
---

@@ -62,21 +62,22 @@ or
For type GitHub you can use the following options:
- `--organization`: The name of the GitHub organization you want to scan.
- `--url`: The url of the api for a GitHub enterprise server. Skip this option for repos on <https://github.com>.
- `--access-token`: Your personal GitHub access token.
- `--access-token`: Your personal GitHub access token (needs full `repo` rights if you also want to find private repositories; otherwise `repo:status` and `public_repo` are sufficient).
- `--ignore-repos`: A list of GitHub repository ids you want to ignore
- `--obey-rate-limit`: True to obey the rate limit of the GitHub server (default), otherwise False
- `--activity-since-duration`: Return git repo findings with repo activity (e.g. commits) more recent than a specific date expressed by a duration (now + duration). A duration string is a possibly signed sequence of decimal numbers, each
with optional fraction and a unit suffix, such as '1h' or '2h45m'. Valid time units are 'm', 'h', 'd', 'w'.
- `--activity-until-duration`: Return git repo findings with repo activity (e.g. commits) older than a specific date expressed by a duration (now + duration). A duration string is a possibly signed sequence of decimal numbers, each with
optional fraction and a unit suffix, such as '1h' or '2h45m'. Valid time units are 'm', 'h', 'd', 'w'.
- `--annotate-latest-commit-id`: Set to True to annotate the results with the SHA1 of the latest commit on the main branch. Causes an extra API hit per repository. False by default.

For now only organizations are supported, so the `--organization` option is mandatory. We **strongly recommend** providing an access token
for authentication. Without one, rate limiting kicks in after about 30 scanned repositories.

#### GitLab
For type GitLab you can use the following options:
- `--url`: The url of the GitLab server.
- `--access-token`: Your personal GitLab access token.
- `--access-token`: Your personal GitLab access token (needs at least `read_api` and `read_repository` scopes).
- `--group`: A specific GitLab group id you want to scan, including subgroups.
- `--ignore-groups`: A list of GitLab group ids you want to ignore
- `--ignore-repos`: A list of GitLab project ids you want to ignore
@@ -85,6 +86,7 @@ For type GitLab you can use the following options:
with optional fraction and a unit suffix, such as '1h' or '2h45m'. Valid time units are 'm', 'h', 'd', 'w'.
- `--activity-until-duration`: Return git repo findings with repo activity (e.g. commits) older than a specific date expressed by a duration (now + duration). A duration string is a possibly signed sequence of decimal numbers, each with
optional fraction and a unit suffix, such as '1h' or '2h45m'. Valid time units are 'm', 'h', 'd', 'w'.
- `--annotate-latest-commit-id`: Set to True to annotate the results with the SHA1 of the latest commit on the main branch. Causes an extra API hit per repository. False by default.

For GitLab, the URL and the access token are mandatory. If you don't provide a specific group id, all projects
on the GitLab server are going to be discovered.
12 changes: 10 additions & 2 deletions scanners/git-repo-scanner/scanner/git_repo_scanner/__main__.py
@@ -52,15 +52,17 @@ def process(args):
group=args.group,
ignored_groups=args.ignore_groups,
ignore_repos=args.ignore_repos,
obey_rate_limit=args.obey_rate_limit
obey_rate_limit=args.obey_rate_limit,
annotate_latest_commit_id=args.annotate_latest_commit_id
)
elif args.git_type == 'github':
scanner = GitHubScanner(
url=args.url,
access_token=args.access_token,
organization=args.organization,
ignore_repos=args.ignore_repos,
obey_rate_limit=args.obey_rate_limit
obey_rate_limit=args.obey_rate_limit,
annotate_latest_commit_id=args.annotate_latest_commit_id
)
else:
logger.info('Argument error: Unknown git type')
@@ -146,6 +148,12 @@ def get_parser_args(args=None):
type=bool,
default=True,
required=False)
parser.add_argument('--annotate-latest-commit-id',
help="Annotate the results with the latest commit hash of the main branch of the repository. "
"Will result in up to two extra API hits per repository",
type=bool,
default=False,
required=False)
parser.add_argument('--activity-since-duration',
help='Return git repo findings with repo activity (e.g. commits) more recent than a specific '
'date expressed by a duration (now - duration)',
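
One caveat with `type=bool` in the argument definitions in this hunk: argparse calls `bool()` on the raw string, so any non-empty value, including `False`, parses as `True`. The sketch below demonstrates the effect and shows one common workaround. `str2bool` is a hypothetical helper and not part of the scanner.

```python
import argparse


def str2bool(value: str) -> bool:
    """Hypothetical converter that maps common spellings to a real boolean."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


# type=bool: bool("False") is True because the string is non-empty.
naive = argparse.ArgumentParser()
naive.add_argument("--annotate-latest-commit-id", type=bool, default=False)
naive_args = naive.parse_args(["--annotate-latest-commit-id", "False"])

# type=str2bool: the string is interpreted, so "False" becomes False.
strict = argparse.ArgumentParser()
strict.add_argument("--annotate-latest-commit-id", type=str2bool, default=False)
strict_args = strict.parse_args(["--annotate-latest-commit-id", "False"])

print(naive_args.annotate_latest_commit_id)   # True
print(strict_args.annotate_latest_commit_id)  # False
```

Under this assumption, passing `--annotate-latest-commit-id False` to the parser as written in the diff would still enable the annotation; omitting the flag is the reliable way to keep it off.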
@@ -21,8 +21,9 @@ def process(self, start_time: Optional[datetime] = None, end_time: Optional[date
raise NotImplementedError()

def _create_finding(self, repo_id: str, web_url: str, full_name: str, owner_type: str, owner_id: str,
owner_name: str, created_at: str, last_activity_at: str, visibility: str) -> FINDING:
return {
owner_name: str, created_at: str, last_activity_at: str, visibility: str,
last_commit_id: str = None) -> FINDING:
finding = {
'name': f'{self.git_type} Repo',
'description': f'A {self.git_type} repository',
'category': 'Git Repository',
@@ -40,3 +41,6 @@ def _create_finding(self, repo_id: str, web_url: str, full_name: str, owner_type
'visibility': visibility
}
}
if last_commit_id is not None:
finding["attributes"]["last_commit_id"] = last_commit_id
return finding
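
The effect of the new optional parameter can be sketched in isolation. This is a reduced illustration: the free function `create_finding` is a stand-in for the method in this hunk, and the `id` attribute key and the hard-coded name are assumptions made to keep the example short.

```python
from typing import Any, Dict, Optional


def create_finding(repo_id: str, visibility: str,
                   last_commit_id: Optional[str] = None) -> Dict[str, Any]:
    # Reduced finding with only a few illustrative fields.
    finding: Dict[str, Any] = {
        "name": "GitHub Repo",
        "category": "Git Repository",
        "attributes": {"id": repo_id, "visibility": visibility},
    }
    # Only annotate when a value was passed; an empty string still counts,
    # which is what the scanners pass for repositories without commits.
    if last_commit_id is not None:
        finding["attributes"]["last_commit_id"] = last_commit_id
    return finding
```

Findings created without the annotation keep their previous shape, so consumers that do not know about `last_commit_id` should be unaffected.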
@@ -21,7 +21,7 @@ class GitHubScanner(AbstractScanner):
LOGGER = logging.getLogger('git_repo_scanner')

def __init__(self, url: Optional[str], access_token: Optional[str], organization: str, ignore_repos: List[int],
obey_rate_limit: bool = True) -> None:
obey_rate_limit: bool = True, annotate_latest_commit_id: bool = False) -> None:
super().__init__()
if not organization:
raise argparse.ArgumentError(None, 'Organization required for GitHub connection.')
@@ -33,6 +33,7 @@ def __init__(self, url: Optional[str], access_token: Optional[str], organization
self._organization = organization
self._ignore_repos = ignore_repos
self._obey_rate_limit = obey_rate_limit
self._annotate_latest_commit_id = annotate_latest_commit_id
self._gh: Optional[github.Github] = None

@property
@@ -125,6 +126,13 @@ def _setup_with_url(self):
raise argparse.ArgumentError(None, 'Access token required for github enterprise authentication.')

def _create_finding_from_repo(self, repo: Repository) -> FINDING:
latest_commit: str = None
if self._annotate_latest_commit_id:
try:
latest_commit = repo.get_commits()[0].sha
except Exception:
self.LOGGER.warning("Could not identify the latest commit ID - repository without commits?")
latest_commit = ""
return super()._create_finding(
str(repo.id),
repo.html_url,
@@ -134,5 +142,6 @@ def _create_finding_from_repo(self, repo: Repository) -> FINDING:
repo.owner.name,
repo.created_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
repo.updated_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
'private' if repo.private else 'public'
'private' if repo.private else 'public',
latest_commit
)
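
The three possible outcomes of the lookup (flag off, commit found, lookup failed) can be sketched without network access. `FakeRepo` and `FakeCommit` are hypothetical stand-ins for the PyGithub objects; only the `get_commits()[0].sha` access pattern is taken from the diff, and a real empty repository may fail with a different exception than the `IndexError` raised here.

```python
import logging
from dataclasses import dataclass, field
from typing import List, Optional

logger = logging.getLogger("git_repo_scanner")


@dataclass
class FakeCommit:
    sha: str


@dataclass
class FakeRepo:
    commits: List[FakeCommit] = field(default_factory=list)

    def get_commits(self) -> List[FakeCommit]:
        return self.commits


def latest_commit_sha(repo: FakeRepo, annotate: bool) -> Optional[str]:
    if not annotate:
        # Flag off: None means the finding gets no last_commit_id attribute.
        return None
    try:
        return repo.get_commits()[0].sha
    except Exception:
        logger.warning("Could not identify the latest commit ID - repository without commits?")
        # Flag on but lookup failed: an empty string is recorded instead.
        return ""
```

Returning `""` rather than `None` on failure lets a consumer distinguish "annotation not requested" from "annotation requested, but no commit found".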
@@ -23,7 +23,8 @@ def __init__(self, url: str,
group: Optional[int],
ignored_groups: List[int],
ignore_repos: List[int],
obey_rate_limit: bool = True) -> None:
obey_rate_limit: bool = True,
annotate_latest_commit_id: bool = False) -> None:
super().__init__()
if not url:
raise argparse.ArgumentError(None, 'URL required for GitLab connection.')
@@ -36,6 +37,7 @@ def __init__(self, url: str,
self._ignored_groups = ignored_groups
self._ignore_repos = ignore_repos
self._obey_rate_limit = obey_rate_limit
self._annotate_latest_commit_id = annotate_latest_commit_id
self._gl: Optional[gitlab.Gitlab] = None

@property
@@ -47,6 +49,12 @@ def process(self, start_time: Optional[datetime] = None, end_time: Optional[date

projects: List[Project] = self._get_projects(start_time, end_time)
return self._process_projects(projects)

def _group_project_to_project(self, group_project):
# The GitLab API library gives us a GroupProject object, which has limited functionality.
# This function turns the GroupProject into a "real" project, which allows us to get the
# list of commits and include the SHA1 of the latest commit in the output later
return self._gl.projects.get(group_project.id, lazy=True)

def _get_projects(self, start_time: Optional[datetime], end_time: Optional[datetime]):
logger.info(f'Get GitLab repositories with last activity between {start_time} and {end_time}.')
@@ -103,6 +111,15 @@ def _create_finding_from_project(self, project: Project, index: int, total: int)
logger.info(
f'({index + 1}/{total}) Add finding for repo {project.name} with last activity at '
f'{datetime.fromisoformat(project.last_activity_at)}')

# Retrieve the latest commit ID
latest_commit_id: str = None
if self._annotate_latest_commit_id:
try:
latest_commit_id = self._group_project_to_project(project).commits.list()[0].id
except Exception:
logger.warning("Could not identify the latest commit ID - repository without commits?")
latest_commit_id = ""
return super()._create_finding(
project.id,
project.web_url,
@@ -112,5 +129,6 @@ def _create_finding_from_project(self, project: Project, index: int, total: int)
project.namespace['name'],
project.created_at,
project.last_activity_at,
project.visibility
project.visibility,
latest_commit_id
)