Skip to content

Conversation

@melvincarvalho
Copy link
Contributor

Summary

Fixes WebID-TLS authentication failures with HTML profiles containing embedded JSON-LD.

Changes

  1. Accept header - Request text/html instead of mixed types to work around conneg priority issues (Conneg ignores Accept header priority #71)

  2. JSON-LD regex - Use more flexible pattern to handle additional attributes on script tag (e.g., <script type="application/ld+json" id="me">)

Testing

Tested with WebID-TLS client certificate authentication against HTML profile with embedded JSON-LD.

Fixes #70

- Change Accept header to request text/html to avoid conneg issues
- Fix JSON-LD regex to handle additional attributes on script tag

Fixes #70
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes WebID-TLS authentication failures when HTML profiles contain embedded JSON-LD with additional attributes on the script tag. The fix addresses content negotiation issues and improves HTML parsing robustness.

Key Changes:

  • Modified Accept header to request text/html instead of multiple content types to work around conneg priority issues
  • Updated JSON-LD extraction regex to handle script tags with additional attributes (e.g., id, class)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

const response = await fetchWithTimeout(webId, {
headers: {
'Accept': 'application/ld+json, text/turtle, application/json'
'Accept': 'text/html'
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restricting the Accept header to only 'text/html' could break WebID-TLS authentication for profiles served exclusively as Turtle or JSON-LD. While this may work around conneg issues with HTML profiles, the code at lines 155-169 still handles multiple content types in the response. Consider using 'text/html, application/ld+json;q=0.9, text/turtle;q=0.8' to indicate HTML preference while maintaining backward compatibility with non-HTML profiles. Alternatively, if HTML-only is intentional, remove the dead code paths for handling Turtle and JSON-LD content types.

Suggested change
'Accept': 'text/html'
'Accept': 'text/html, application/ld+json;q=0.9, text/turtle;q=0.8'

Copilot uses AI. Check for mistakes.
} else if (contentType.includes('text/html')) {
// Try to extract JSON-LD from HTML data island
const jsonLdMatch = text.match(/<script\s+type=["']application\/ld\+json["']\s*>([\s\S]*?)<\/script>/i);
const jsonLdMatch = text.match(/<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/i);
Copy link

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated regex pattern for extracting JSON-LD from HTML profiles lacks test coverage. Since this change fixes authentication failures, consider adding tests that verify the regex correctly matches script tags with additional attributes (e.g., <script type="application/ld+json" id="me">) as well as the original format. This would prevent future regressions and validate the fix.

Copilot uses AI. Check for mistakes.
@melvincarvalho
Copy link
Contributor Author

Replies to Copilot Review

Re: Accept header (line 142)

Good point about backward compatibility. The quality value approach (text/html, application/ld+json;q=0.9, text/turtle;q=0.8) would be ideal. However, JSS conneg currently ignores quality values (see #71). Once that's fixed, we should update to use quality values.

For now, text/html works because:

  • JSS serves HTML profiles with embedded JSON-LD by default
  • Non-HTML-only WebID profiles are uncommon since they typically need human-readable content
  • The workaround is necessary until Conneg ignores Accept header priority #71 is resolved

Re: Test coverage (line 160)

Agreed. I'll add a test case for the regex with additional attributes. The existing test/webid-tls.test.js can be extended to cover:

  • <script type="application/ld+json"> (original format)
  • <script type="application/ld+json" id="me"> (with additional attributes)

Adds test cases for the HTML JSON-LD extraction regex to ensure it handles:
- Basic script tags
- Script tags with additional attributes (e.g., id="me")
- Script tags with attributes before type
- Single quotes
- Multiline content
- Non-matching script types
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WebID-TLS: Fix Accept header and JSON-LD regex for HTML profiles

1 participant