Commit b16cf6e

Improved show-external links script to actually test the links and see if they 404
1 parent 77dc863 commit b16cf6e

2 files changed: +128 -5 lines changed

TESTING.md

Lines changed: 31 additions & 1 deletion
@@ -73,6 +73,23 @@ You can verify that no external links have been accidentally added to the docume
 
 This script searches all `.md` files for external links (http:// or https://) in markdown-style links, HTML anchor tags, and image tags. The output will show which files contain external links and what they are. This is useful for quality assurance before committing changes.
 
+**Validating External Links:**
+
+You can also validate the HTTP status codes of all external links to catch broken links:
+
+```bash
+# Check external links with HTTP status validation
+./show-external-links.sh --check
+
+# With custom timeout (default is 5 seconds)
+./show-external-links.sh --check --timeout 10
+```
+
+Status codes:
+- 🟢 `` = Valid (HTTP 200)
+- 🟡 `` = Warning (HTTP 3xx redirects)
+- 🔴 `` = Error (HTTP 4xx/5xx or connection failed)
+
 > [!Important]
 > Pay attention to your use of external links and consider any complexities around linking to forks of this documentation repository. Wherever possible content should be local and forks can then modify content as required.
 
@@ -126,4 +143,17 @@ The script is also integrated into CI/CD to prevent deployment with improperly l
 - Always build locally before pushing changes
 - The navigation sidebar must not contain external links (validation enforced by CI)
 - Use relative links for internal documentation pages
-- Orphaned pages must be in the `Orphans/` directory (validation enforced by CI)
+- Orphaned pages must be in the `Orphans/` directory (validation enforced by CI)
+- External link HTTP status checking is available locally with `./show-external-links.sh --check`
+
+## CI/CD Validation
+
+The following checks are automatically run on every push to the main branch via GitHub Actions:
+
+1. **Navigation Link Validation** - Ensures no external links exist in `Sidebar.md`
+2. **Internal Link Checking** - Validates all internal `.md` links point to existing files
+3. **Orphan Page Validation** - Ensures orphaned pages are only in the `Orphans/` directory
+4. **Jekyll Build** - Builds the site to ensure no build errors
+5. **Deployment** - Deploys to GitHub Pages only if all checks pass
+
+If any validation fails, deployment is prevented and the PR cannot be merged until issues are resolved.
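
Review note: the first CI check listed above (no external links in `Sidebar.md`) could be approximated locally along the lines below. The workflow's actual implementation is not part of this commit, so the function names `has_external_links` and `check_sidebar` and the matching pattern are assumptions made for illustration.

```bash
#!/bin/bash
# Hypothetical helper (not in the commit): succeeds when the given file
# contains at least one http:// or https:// URL.
has_external_links() {
    grep -qE 'https?://' "$1"
}

# Hypothetical gate: returns 1 if the sidebar file has external links,
# 2 if the file is missing, 0 otherwise. The default path is illustrative.
check_sidebar() {
    local sidebar=${1:-Sidebar.md}
    if [[ ! -f $sidebar ]]; then
        echo "error: $sidebar not found" >&2
        return 2
    fi
    if has_external_links "$sidebar"; then
        echo "error: $sidebar contains external links" >&2
        return 1
    fi
}
```

Calling `check_sidebar path/to/Sidebar.md` before pushing would surface the same class of failure the CI step is described as catching.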

show-external-links.sh

Lines changed: 97 additions & 4 deletions
@@ -1,11 +1,49 @@
 #!/bin/bash
 
 # Script to find and list all external links in markdown files
+# Also validates HTTP status codes for external links
 # Searches for both markdown-style links [text](http://...) and HTML <a href="http://..."> tags
 
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m' # No Color
+
+# Options
+CHECK_LINKS=false
+TIMEOUT=5
+
+# Parse command line arguments
+while [[ $# -gt 0 ]]; do
+    case $1 in
+        --check)
+            CHECK_LINKS=true
+            shift
+            ;;
+        --timeout)
+            TIMEOUT="$2"
+            shift 2
+            ;;
+        *)
+            shift
+            ;;
+    esac
+done
+
 echo "Searching for external links in .md files..."
+if [[ "$CHECK_LINKS" == true ]]; then
+    echo "HTTP status checking enabled (timeout: ${TIMEOUT}s)"
+fi
 echo ""
 
+# Track results
+total_links=0
+checked_links=0
+broken_links=0
+error_links=()
+
 # Find all .md files and search for external links
 find wiki-default -name "*.md" -type f | while read -r file; do
     # Search for markdown-style links with http:// or https://
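
Review note on the argument parsing in this hunk: when `--timeout` is passed as the last argument with no value, `shift 2` cannot shift and leaves the positional parameters unchanged, so the `while` loop can spin forever; unknown options are also dropped silently. A minimal sketch of a stricter parser follows. The function name `parse_args`, the error messages, and the choice to reject unknown options are assumptions, not part of the commit.

```bash
#!/bin/bash
# Hypothetical helper (not in the commit): parses the same two options,
# but rejects a missing or non-numeric --timeout value.
CHECK_LINKS=false
TIMEOUT=5

parse_args() {
    while [[ $# -gt 0 ]]; do
        case $1 in
            --check)
                CHECK_LINKS=true
                shift
                ;;
            --timeout)
                # Require a whole number of seconds before consuming two arguments
                if [[ $# -lt 2 || ! $2 =~ ^[0-9]+$ ]]; then
                    echo "error: --timeout needs a numeric value in seconds" >&2
                    return 2
                fi
                TIMEOUT="$2"
                shift 2
                ;;
            *)
                # Assumption: fail loudly instead of ignoring unknown options
                echo "error: unknown option: $1" >&2
                return 2
                ;;
        esac
    done
}
```

In the script this would be invoked as `parse_args "$@" || exit $?`.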
@@ -19,15 +57,31 @@ find wiki-default -name "*.md" -type f | while read -r file; do
 
     # If any external links found, display them
     if [ -n "$markdown_links" ] || [ -n "$html_links" ] || [ -n "$img_links" ]; then
-        echo "File: $file"
+        echo -e "File: ${BLUE}$file${NC}"
         echo "----------------------------------------"
 
         if [ -n "$markdown_links" ]; then
             echo "Markdown links:"
             echo "$markdown_links" | while read -r link; do
                 # Extract URL from markdown link
                 url=$(echo "$link" | grep -oP '\((https?://[^\)]+)\)' | tr -d '()')
-                echo " - $url"
+
+                if [[ "$CHECK_LINKS" == true ]] && [[ ! -z "$url" ]]; then
+                    # Check HTTP status
+                    status=$(curl -s -o /dev/null -w "%{http_code}" --max-time "$TIMEOUT" "$url" 2>/dev/null || echo "000")
+
+                    if [[ "$status" == "200" ]]; then
+                        echo -e " - ${GREEN}${NC} $url (HTTP $status)"
+                    elif [[ "$status" == "000" ]]; then
+                        echo -e " - ${RED}${NC} $url (${RED}Connection error${NC})"
+                    elif [[ "$status" =~ ^[45] ]]; then
+                        echo -e " - ${RED}${NC} $url (HTTP ${RED}$status${NC})"
+                    else
+                        echo -e " - ${YELLOW}${NC} $url (HTTP $status)"
+                    fi
+                else
+                    echo " - $url"
+                fi
             done
         fi
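
Review note: the status check added in this hunk is repeated verbatim in the next two hunks (HTML links and image links). In addition, curl's `-w "%{http_code}"` already prints `000` when no response arrives, so appending `|| echo "000"` can produce `000000`, which matches neither the `"000"` test nor `^[45]` and falls through to the yellow branch. A minimal sketch of a shared helper follows; `classify_status` and `fetch_status` are hypothetical names, and the three category labels are assumptions that mirror the script's legend.

```bash
#!/bin/bash
# Hypothetical helper (not in the commit): map a status string to the
# three categories, in the same branch order as the script.
classify_status() {
    case $1 in
        200)         echo "valid" ;;
        000|4??|5??) echo "error" ;;    # connection failure or HTTP 4xx/5xx
        *)           echo "warning" ;;  # everything else, e.g. 3xx redirects
    esac
}

# Hypothetical helper: fetch the status code for a URL.
# curl writes 000 itself on connection failure, so no fallback echo is
# appended; only an empty result is normalised to 000.
fetch_status() {
    local url=$1 timeout=${2:-5} status
    status=$(curl -s -o /dev/null -w '%{http_code}' --max-time "$timeout" "$url" 2>/dev/null) || true
    echo "${status:-000}"
}
```

Each of the three loops could then call `classify_status "$(fetch_status "$url" "$TIMEOUT")"` and pick the colour from the result, which keeps the markdown, HTML, and image branches consistent.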

@@ -36,7 +90,23 @@ find wiki-default -name "*.md" -type f | while read -r file; do
             echo "$html_links" | while read -r link; do
                 # Extract URL from HTML href
                 url=$(echo "$link" | grep -oP 'href=["'"'"']\K(https?://[^"'"'"']+)')
-                echo " - $url"
+
+                if [[ "$CHECK_LINKS" == true ]] && [[ ! -z "$url" ]]; then
+                    # Check HTTP status
+                    status=$(curl -s -o /dev/null -w "%{http_code}" --max-time "$TIMEOUT" "$url" 2>/dev/null || echo "000")
+
+                    if [[ "$status" == "200" ]]; then
+                        echo -e " - ${GREEN}${NC} $url (HTTP $status)"
+                    elif [[ "$status" == "000" ]]; then
+                        echo -e " - ${RED}${NC} $url (${RED}Connection error${NC})"
+                    elif [[ "$status" =~ ^[45] ]]; then
+                        echo -e " - ${RED}${NC} $url (HTTP ${RED}$status${NC})"
+                    else
+                        echo -e " - ${YELLOW}${NC} $url (HTTP $status)"
+                    fi
+                else
+                    echo " - $url"
+                fi
             done
         fi
 
@@ -45,7 +115,23 @@ find wiki-default -name "*.md" -type f | while read -r file; do
             echo "$img_links" | while read -r link; do
                 # Extract URL from HTML src
                 url=$(echo "$link" | grep -oP 'src=["'"'"']\K(https?://[^"'"'"']+)')
-                echo " - $url"
+
+                if [[ "$CHECK_LINKS" == true ]] && [[ ! -z "$url" ]]; then
+                    # Check HTTP status
+                    status=$(curl -s -o /dev/null -w "%{http_code}" --max-time "$TIMEOUT" "$url" 2>/dev/null || echo "000")
+
+                    if [[ "$status" == "200" ]]; then
+                        echo -e " - ${GREEN}${NC} $url (HTTP $status)"
+                    elif [[ "$status" == "000" ]]; then
+                        echo -e " - ${RED}${NC} $url (${RED}Connection error${NC})"
+                    elif [[ "$status" =~ ^[45] ]]; then
+                        echo -e " - ${RED}${NC} $url (HTTP ${RED}$status${NC})"
+                    else
+                        echo -e " - ${YELLOW}${NC} $url (HTTP $status)"
+                    fi
+                else
+                    echo " - $url"
+                fi
             done
         fi
 
@@ -54,3 +140,10 @@ find wiki-default -name "*.md" -type f | while read -r file; do
 done
 
 echo "Search complete."
+if [[ "$CHECK_LINKS" == true ]]; then
+    echo ""
+    echo -e "${GREEN}${NC} = Valid (HTTP 200)"
+    echo -e "${YELLOW}${NC} = Warning (HTTP 3xx redirect)"
+    echo -e "${RED}${NC} = Error (HTTP 4xx/5xx or connection failed)"
+fi
+
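
Review note: the first hunk introduces `total_links`, `checked_links`, `broken_links`, and `error_links`, but no added line updates or prints them. If a summary is planned, note that `find ... | while read` runs the loop body in a subshell, so increments made inside it would be lost once the loop ends. A minimal sketch of one way around that, using process substitution, is below. The function name `count_external_links` and the simplified pattern (counting lines that contain a URL rather than individual links) are assumptions.

```bash
#!/bin/bash
# Hypothetical helper (not in the commit): count lines containing an
# external URL across the .md files under a directory.
count_external_links() {
    local dir=$1 total=0 file n
    # Process substitution (not a pipe) keeps the while loop in the
    # current shell, so updates to "total" survive after the loop.
    while IFS= read -r file; do
        # grep -c prints 0 and exits non-zero when nothing matches
        n=$(grep -cE 'https?://' "$file") || true
        total=$((total + n))
    done < <(find "$dir" -name '*.md' -type f)
    echo "$total"
}
```

For example, `count_external_links wiki-default` would print a single total that could be appended to the "Search complete." summary.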
