Utility functions for string manipulation, sanitization, identifier normalization, ANSI stripping, URL parsing, and GitHub PAT validation.
The `stringutil` package provides utility functions for working with strings.
It is organized into focused sub-files:

| Sub-file | Purpose |
|---|---|
| `stringutil.go` | General string helpers |
| `ansi.go` | ANSI escape-code stripping |
| `identifiers.go` | Workflow name and path normalization |
| `sanitize.go` | Security-sensitive string sanitization |
| `urls.go` | URL normalization and domain extraction |
| `pat_validation.go` | GitHub PAT classification and validation |
| `fuzzy_match.go` | Fuzzy string matching for "Did you mean?" suggestions |
| Type | Kind | Description |
|---|---|---|
| `SanitizeOptions` | struct | Options for `SanitizeName` (preserved characters, hyphen trimming, and default value) |
| `PATType` | string type | Type of GitHub Personal Access Token (fine-grained, classic, oauth, unknown) with methods `String()`, `IsFineGrained()`, `IsValid()` |
| Constant | Type | Value | Description |
|---|---|---|---|
| `PATTypeFineGrained` | `PATType` | `"fine-grained"` | Fine-grained PAT (prefix `github_pat_`) |
| `PATTypeClassic` | `PATType` | `"classic"` | Classic PAT (prefix `ghp_`) |
| `PATTypeOAuth` | `PATType` | `"oauth"` | OAuth token (prefix `gho_`) |
| `PATTypeUnknown` | `PATType` | `"unknown"` | Unrecognized token format |
Truncates s to at most maxLen characters, appending "..." when truncation occurs. For maxLen ≤ 3 the string is truncated without ellipsis.
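The truncation rule, including the short-`maxLen` edge case, can be sketched as follows. This is an illustrative stand-in rather than the package's code: the helper name `truncate`, the choice to count runes instead of bytes, and the clamping of negative lengths are all assumptions.

```go
package main

import "fmt"

// truncate is a hypothetical stand-in for the documented behavior.
// Assumptions: "characters" means runes; negative lengths are clamped to 0.
func truncate(s string, maxLen int) string {
	if maxLen < 0 {
		maxLen = 0
	}
	runes := []rune(s)
	if len(runes) <= maxLen {
		return s // short enough, returned unchanged
	}
	if maxLen <= 3 {
		return string(runes[:maxLen]) // no room for an ellipsis
	}
	return string(runes[:maxLen-3]) + "..."
}

func main() {
	fmt.Println(truncate("hello world", 8)) // hello...
	fmt.Println(truncate("hello", 3))       // hel
}
```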
    stringutil.Truncate("hello world", 8) // "hello..."
    stringutil.Truncate("hi", 8) // "hi"

Normalizes trailing whitespace in multi-line content. Trims trailing spaces and tabs from every line, then ensures the content ends with exactly one newline (or is empty). This reduces spurious diffs caused by trailing-whitespace differences.
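One way to get this behavior is sketched below. The helper name `normalizeTrailingWhitespace` is hypothetical, and the sketch only handles `\n` line endings; it illustrates the described steps and is not the package's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeTrailingWhitespace is a hypothetical helper, not the package's code.
func normalizeTrailingWhitespace(content string) string {
	lines := strings.Split(content, "\n")
	for i, line := range lines {
		lines[i] = strings.TrimRight(line, " \t") // strip trailing spaces and tabs
	}
	// Drop all trailing newlines, then add back exactly one.
	trimmed := strings.TrimRight(strings.Join(lines, "\n"), "\n")
	if trimmed == "" {
		return "" // whitespace-only input collapses to empty
	}
	return trimmed + "\n"
}

func main() {
	fmt.Printf("%q\n", normalizeTrailingWhitespace("a  \nb\t\n\n\n")) // "a\nb\n"
}
```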
Removes shared leading indentation from non-empty lines in a multi-line string. This is useful for normalizing heredoc-like blocks while preserving relative indentation.
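A minimal sketch of this kind of dedenting is shown below, assuming that indentation is measured as a count of leading spaces and tabs and that whitespace-only lines are left as they are. The name `dedent` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// dedent is a hypothetical helper illustrating shared-indent removal.
func dedent(s string) string {
	lines := strings.Split(s, "\n")
	minIndent := -1
	for _, line := range lines {
		if strings.TrimSpace(line) == "" {
			continue // blank lines do not affect the shared indent
		}
		indent := len(line) - len(strings.TrimLeft(line, " \t"))
		if minIndent < 0 || indent < minIndent {
			minIndent = indent
		}
	}
	if minIndent <= 0 {
		return s // nothing to remove
	}
	for i, line := range lines {
		if strings.TrimSpace(line) != "" {
			lines[i] = line[minIndent:] // relative indentation is preserved
		}
	}
	return strings.Join(lines, "\n")
}

func main() {
	fmt.Printf("%q\n", dedent("    a\n      b\n\n    c")) // "a\n  b\n\nc"
}
```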
Converts an `any`-typed version value (typically from YAML parsing, which may produce `int`, `float64`, or `string`) into a string. Returns an empty string for `nil`.
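A type switch is a natural fit for this kind of conversion. The sketch below uses a hypothetical name (`versionToString`), and its `default` branch is an assumption, since the handling of other types is not specified here.

```go
package main

import (
	"fmt"
	"strconv"
)

// versionToString is a hypothetical sketch of the conversion described above.
func versionToString(v any) string {
	switch val := v.(type) {
	case nil:
		return ""
	case string:
		return val
	case int:
		return strconv.Itoa(val)
	case float64:
		// 'f' with precision -1 prints the shortest exact form, so 20.0 becomes "20".
		return strconv.FormatFloat(val, 'f', -1, 64)
	default:
		return fmt.Sprint(val) // assumption: fall back to default formatting
	}
}

func main() {
	fmt.Println(versionToString(20.0)) // 20
	fmt.Println(versionToString(3.11)) // 3.11
}
```

Note that a YAML value written as `3.10` is parsed as the float `3.1`, so quoting such versions in the source YAML is the safer choice.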
    stringutil.ParseVersionValue("20") // "20"
    stringutil.ParseVersionValue(20) // "20"
    stringutil.ParseVersionValue(20.0) // "20"

Formats a slice of strings as a natural-language list with an Oxford comma.
    stringutil.FormatList([]string{"a", "b", "c"}) // "a, b, and c"

Returns true if and only if `s` is a decimal integer that is strictly greater than zero, has no leading zeros, and contains no non-digit characters. Returns false for `""`, `"0"`, negative strings (e.g. `"-5"`), strings with leading zeros (e.g. `"007"`), and non-numeric strings.
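These rules can be checked without parsing the number at all, as in the sketch below. The name `isPositiveInteger` is hypothetical, and the sketch does not bound the number of digits.

```go
package main

import "fmt"

// isPositiveInteger is a hypothetical sketch of the check described above.
func isPositiveInteger(s string) bool {
	if s == "" || s[0] == '0' {
		return false // rejects "", "0", and leading zeros such as "007"
	}
	for i := 0; i < len(s); i++ {
		if s[i] < '0' || s[i] > '9' {
			return false // rejects signs, spaces, and any other non-digit
		}
	}
	return true
}

func main() {
	fmt.Println(isPositiveInteger("42"))  // true
	fmt.Println(isPositiveInteger("007")) // false
	fmt.Println(isPositiveInteger("-5"))  // false
}
```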
Removes all ANSI/VT100 escape sequences from s. Handles CSI sequences (e.g. \x1b[31m for colors) and other ESC-prefixed sequences. This function is used before writing text into YAML files to prevent invisible characters from corrupting workflow output.
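A regular expression is one common way to do this. The sketch below is a simplified approximation, not the package's implementation: the pattern and the name `stripANSI` are assumptions, and it covers CSI sequences plus two-character ESC sequences only (for example, it does not consume the payload of OSC sequences).

```go
package main

import (
	"fmt"
	"regexp"
)

// ansiPattern matches ESC followed by either a CSI sequence
// ("[" + parameter bytes + intermediate bytes + final byte)
// or a single byte in the range @ to _ (excluding "[").
var ansiPattern = regexp.MustCompile(`\x1b(?:\[[0-?]*[ -/]*[@-~]|[@-Z\\-_])`)

// stripANSI is a hypothetical helper that removes the matched sequences.
func stripANSI(s string) string {
	return ansiPattern.ReplaceAllString(s, "")
}

func main() {
	fmt.Println(stripANSI("\x1b[1;31mError\x1b[0m: build failed")) // Error: build failed
}
```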
    colored := "\x1b[32mSuccess\x1b[0m"
    plain := stringutil.StripANSI(colored) // "Success"

Removes `.md` and `.lock.yml` extensions from workflow names, returning the bare workflow identifier.
    stringutil.NormalizeWorkflowName("weekly-research.md") // "weekly-research"
    stringutil.NormalizeWorkflowName("weekly-research.lock.yml") // "weekly-research"
    stringutil.NormalizeWorkflowName("weekly-research") // "weekly-research"

Converts dashes and periods to underscores in safe-output identifiers, normalizing user-facing dash-separated and dot-separated formats to the internal underscore_separated format required by MCP tool names (which must match `^[a-zA-Z0-9_-]+$`).
    stringutil.NormalizeSafeOutputIdentifier("create-issue") // "create_issue"
    stringutil.NormalizeSafeOutputIdentifier("executor-workflow.agent") // "executor_workflow_agent"

Converts a workflow markdown path (`.md`) to its compiled lock file path (`.lock.yml`). Returns the path unchanged if it already ends with `.lock.yml`.
    stringutil.MarkdownToLockFile(".github/workflows/test.md")
    // → ".github/workflows/test.lock.yml"

Converts a compiled lock file path (`.lock.yml`) back to its markdown source path (`.md`). Returns the path unchanged if it already ends with `.md`.
    stringutil.LockFileToMarkdown(".github/workflows/test.lock.yml")
    // → ".github/workflows/test.md"

These functions remove sensitive information to prevent accidental leakage in logs or error messages.
Sanitizes a name for identifiers and filenames using configurable behavior (preserved special characters, optional hyphen trimming, and fallback default value).
Redacts potential secret key names from error messages. Matches uppercase SNAKE_CASE identifiers (e.g. MY_SECRET_KEY, API_TOKEN) and PascalCase identifiers ending with security-related suffixes (e.g. GitHubToken, ApiKey). Common GitHub Actions workflow keywords (GITHUB, RUNNER, WORKFLOW, etc.) are excluded from redaction.
    stringutil.SanitizeErrorMessage("Error: MY_SECRET_TOKEN is invalid")
    // → "Error: [REDACTED] is invalid"

Sanitizes a string for use as a programming-language identifier by replacing invalid characters with underscores and prefixing `_` when the identifier starts with a digit. `extraAllowed` can be used to permit additional runes beyond the normal identifier rules; if `extraAllowed` is nil, no extra characters are allowed.
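The replace-then-prefix approach might look like the sketch below. Several details are assumptions made for illustration: the name `sanitizeIdentifier`, the `func(rune) bool` type for `extraAllowed`, and the restriction of "normal identifier rules" to ASCII letters, digits, and underscores.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeIdentifier is a hypothetical sketch; the real signature may differ.
func sanitizeIdentifier(s string, extraAllowed func(rune) bool) string {
	var b strings.Builder
	for _, r := range s {
		isAlnum := (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') || (r >= '0' && r <= '9')
		if isAlnum || r == '_' || (extraAllowed != nil && extraAllowed(r)) {
			b.WriteRune(r)
		} else {
			b.WriteRune('_') // every other rune becomes an underscore
		}
	}
	out := b.String()
	if out != "" && out[0] >= '0' && out[0] <= '9' {
		out = "_" + out // identifiers may not start with a digit
	}
	return out
}

func main() {
	fmt.Println(sanitizeIdentifier("my-tool.v2", nil)) // my_tool_v2
	fmt.Println(sanitizeIdentifier("2fast", nil))      // _2fast
}
```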
Sanitizes a parameter name for use as a GitHub Actions output or environment variable name. Preserves letters, digits, $, and _, and replaces all other characters with underscores.
Sanitizes a string for use as a Python variable name. Similar to SanitizeParameterName but follows Python identifier rules.
Sanitizes a tool identifier for safe use in generated code. Replaces characters that are not valid in identifiers with underscores.
Converts a string into a filesystem-safe filename by lowercasing and replacing non-alphanumeric characters with hyphens.
Normalizes a GitHub host URL by ensuring it has an https:// scheme and no trailing slash. Accepts bare hostnames, URLs with or without a scheme, and URLs with trailing slashes.
    stringutil.NormalizeGitHubHostURL("github.example.com") // "https://github.example.com"
    stringutil.NormalizeGitHubHostURL("https://github.com/") // "https://github.com"

Extracts the hostname (without port) from a URL string. Falls back to simple string parsing when `url.Parse` cannot handle the input.
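The parse-then-fall-back pattern can be sketched with the standard library as shown below. The name `extractDomain` and the exact fallback steps are assumptions; in particular, the manual fallback does not handle bracketed IPv6 hosts.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// extractDomain is a hypothetical sketch of hostname extraction with a fallback.
func extractDomain(raw string) string {
	// Preferred path: a full URL that net/url parses into a non-empty host.
	if u, err := url.Parse(raw); err == nil && u.Hostname() != "" {
		return u.Hostname()
	}
	// Fallback: strip the scheme, path, and port by hand.
	rest := raw
	if i := strings.Index(rest, "://"); i >= 0 {
		rest = rest[i+3:]
	}
	if i := strings.IndexAny(rest, "/?#"); i >= 0 {
		rest = rest[:i]
	}
	if i := strings.LastIndex(rest, ":"); i >= 0 {
		rest = rest[:i]
	}
	return rest
}

func main() {
	fmt.Println(extractDomain("https://api.github.com/repos")) // api.github.com
	fmt.Println(extractDomain("github.example.com:8443/path")) // github.example.com
}
```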
    stringutil.ExtractDomainFromURL("https://api.github.com/repos") // "api.github.com"

A string type representing the category of a GitHub Personal Access Token.
| Constant | Value | Prefix |
|---|---|---|
| `PATTypeFineGrained` | `"fine-grained"` | `github_pat_` |
| `PATTypeClassic` | `"classic"` | `ghp_` |
| `PATTypeOAuth` | `"oauth"` | `gho_` |
| `PATTypeUnknown` | `"unknown"` | (other) |
Methods: String() string, IsFineGrained() bool, IsValid() bool
Determines the token type from its prefix.
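Prefix-based classification, using the prefixes listed in the table above, can be sketched as follows. The lowercase names (`patType`, `classifyToken`, and the constants) are hypothetical stand-ins so that the example is self-contained, and the token strings are placeholders rather than real credentials.

```go
package main

import (
	"fmt"
	"strings"
)

// patType mirrors the documented PATType values for illustration only.
type patType string

const (
	fineGrained patType = "fine-grained"
	classic     patType = "classic"
	oauth       patType = "oauth"
	unknown     patType = "unknown"
)

// classifyToken is a hypothetical sketch of prefix-based classification.
func classifyToken(token string) patType {
	switch {
	case strings.HasPrefix(token, "github_pat_"):
		return fineGrained
	case strings.HasPrefix(token, "ghp_"):
		return classic
	case strings.HasPrefix(token, "gho_"):
		return oauth
	default:
		return unknown
	}
}

func main() {
	fmt.Println(classifyToken("github_pat_EXAMPLE")) // fine-grained
	fmt.Println(classifyToken("ghp_EXAMPLE"))        // classic
}
```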
Returns nil if the token is a fine-grained PAT; returns an actionable error message with a link to create the correct token type otherwise.
    if err := stringutil.ValidateCopilotPAT(token); err != nil {
        fmt.Fprintln(os.Stderr, console.FormatErrorMessage(err.Error()))
    }

Returns a human-readable description of the token type (e.g. "fine-grained personal access token").
Finds the closest matching strings using Levenshtein distance. Returns up to maxResults matches that have a distance of 3 or less. Results are sorted by distance (closest first), then alphabetically for ties. Case-insensitive matching. Exact matches are excluded.
This function is useful for "Did you mean?" suggestions when a user provides an unrecognized value (e.g., a typo in an engine name or event type).
    engines := []string{"copilot", "claude", "codex", "custom"}
    matches := stringutil.FindClosestMatches("copiliot", engines, 3)
    // → ["copilot"]

Computes the Levenshtein distance between two strings: the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one string into the other. Uses dynamic programming with space optimization (only the previous row is stored).
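The row-based dynamic program and the filtering and sorting rules for closest matches can be sketched together as shown below. This is an illustration under assumptions, not the package's code: the names `levenshtein` and `closestMatches` are hypothetical, the distance is computed over runes, and the sketch keeps one previous row plus the row being filled.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein keeps only the previous row and the row currently being filled.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// closestMatches applies the documented rules: case-insensitive, distance 1 to 3,
// sorted by distance and then alphabetically, capped at maxResults.
func closestMatches(target string, candidates []string, maxResults int) []string {
	type scored struct {
		value string
		dist  int
	}
	var hits []scored
	for _, c := range candidates {
		d := levenshtein(strings.ToLower(target), strings.ToLower(c))
		if d >= 1 && d <= 3 {
			hits = append(hits, scored{c, d})
		}
	}
	sort.Slice(hits, func(i, j int) bool {
		if hits[i].dist != hits[j].dist {
			return hits[i].dist < hits[j].dist
		}
		return hits[i].value < hits[j].value
	})
	out := []string{}
	for i := 0; i < len(hits) && i < maxResults; i++ {
		out = append(out, hits[i].value)
	}
	return out
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting"))                                // 3
	fmt.Println(closestMatches("copiliot", []string{"copilot", "claude"}, 3)) // [copilot]
}
```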
    import "github.com/github/gh-aw/pkg/stringutil"

    // Truncate a long string for display
    stringutil.Truncate("hello world", 8) // "hello..."

    // Strip ANSI color codes from terminal output
    plain := stringutil.StripANSI("\x1b[32mSuccess\x1b[0m") // "Success"

    // Normalize workflow names
    stringutil.NormalizeWorkflowName("weekly-research.md") // "weekly-research"
    stringutil.NormalizeWorkflowName("weekly-research.lock.yml") // "weekly-research"

    // Convert markdown path to lock file and back
    stringutil.MarkdownToLockFile(".github/workflows/test.md") // ".github/workflows/test.lock.yml"
    stringutil.LockFileToMarkdown(".github/workflows/test.lock.yml") // ".github/workflows/test.md"

    // Redact secrets from error messages
    stringutil.SanitizeErrorMessage("Error: MY_SECRET_TOKEN is invalid")
    // → "Error: [REDACTED] is invalid"

    // Normalize a GitHub host URL
    stringutil.NormalizeGitHubHostURL("github.example.com") // "https://github.example.com"

    // Validate a Copilot PAT
    if err := stringutil.ValidateCopilotPAT(token); err != nil {
        fmt.Fprintln(os.Stderr, console.FormatErrorMessage(err.Error()))
    }

    // Find closest matches for "Did you mean?" suggestions
    engines := []string{"copilot", "claude", "codex", "custom"}
    matches := stringutil.FindClosestMatches("copiliot", engines, 3)
    // → ["copilot"]

    // Compute Levenshtein distance
    distance := stringutil.LevenshteinDistance("copiliot", "copilot")
    // → 1

Internal: `github.com/github/gh-aw/pkg/logger` (debug logging)

- All debug output uses namespace-prefixed loggers (`stringutil:identifiers`, `stringutil:sanitize`, `stringutil:urls`, `stringutil:pat_validation`) and is only emitted when `DEBUG=stringutil:*`.
- `SanitizeErrorMessage` is intentionally conservative: it excludes common GitHub Actions keywords to avoid over-redacting legitimate error messages.
- `StripANSI` handles both CSI sequences (`ESC[`) and other ESC-prefixed sequences to cover the full range of ANSI escape codes found in terminal output.
All functions in this package are stateless pure functions operating on immutable string inputs. They are safe to call concurrently from multiple goroutines without synchronization.
This specification is automatically maintained by the spec-extractor workflow.