-
Notifications
You must be signed in to change notification settings - Fork 4
fix(webid-tls): improve HTML profile parsing #72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: gh-pages
Are you sure you want to change the base?
Conversation
- Change Accept header to request text/html to avoid conneg issues - Fix JSON-LD regex to handle additional attributes on script tag Fixes #70
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes WebID-TLS authentication failures when HTML profiles contain embedded JSON-LD with additional attributes on the script tag. The fix addresses content negotiation issues and improves HTML parsing robustness.
Key Changes:
- Modified Accept header to request
text/htmlinstead of multiple content types to work around conneg priority issues - Updated JSON-LD extraction regex to handle script tags with additional attributes (e.g.,
id,class)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const response = await fetchWithTimeout(webId, { | ||
| headers: { | ||
| 'Accept': 'application/ld+json, text/turtle, application/json' | ||
| 'Accept': 'text/html' |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restricting the Accept header to only 'text/html' could break WebID-TLS authentication for profiles served exclusively as Turtle or JSON-LD. While this may work around conneg issues with HTML profiles, the code at lines 155-169 still handles multiple content types in the response. Consider using 'text/html, application/ld+json;q=0.9, text/turtle;q=0.8' to indicate HTML preference while maintaining backward compatibility with non-HTML profiles. Alternatively, if HTML-only is intentional, remove the dead code paths for handling Turtle and JSON-LD content types.
| 'Accept': 'text/html' | |
| 'Accept': 'text/html, application/ld+json;q=0.9, text/turtle;q=0.8' |
| } else if (contentType.includes('text/html')) { | ||
| // Try to extract JSON-LD from HTML data island | ||
| const jsonLdMatch = text.match(/<script\s+type=["']application\/ld\+json["']\s*>([\s\S]*?)<\/script>/i); | ||
| const jsonLdMatch = text.match(/<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/i); |
Copilot
AI
Jan 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The updated regex pattern for extracting JSON-LD from HTML profiles lacks test coverage. Since this change fixes authentication failures, consider adding tests that verify the regex correctly matches script tags with additional attributes (e.g., <script type="application/ld+json" id="me">) as well as the original format. This would prevent future regressions and validate the fix.
Replies to Copilot ReviewRe: Accept header (line 142)Good point about backward compatibility. The quality value approach ( For now,
Re: Test coverage (line 160)Agreed. I'll add a test case for the regex with additional attributes. The existing
|
Adds test cases for the HTML JSON-LD extraction regex to ensure it handles: - Basic script tags - Script tags with additional attributes (e.g., id="me") - Script tags with attributes before type - Single quotes - Multiline content - Non-matching script types
Summary
Fixes WebID-TLS authentication failures with HTML profiles containing embedded JSON-LD.
Changes
Accept header - Request
text/htmlinstead of mixed types to work around conneg priority issues (Conneg ignores Accept header priority #71)JSON-LD regex - Use more flexible pattern to handle additional attributes on script tag (e.g.,
<script type="application/ld+json" id="me">)Testing
Tested with WebID-TLS client certificate authentication against HTML profile with embedded JSON-LD.
Fixes #70