Skip to content

Reject non-URI-safe hosts in HostSpecifier#8459

Open
bg0d-droid wants to merge 1 commit into
google:masterfrom
bg0d-droid:fix-hostspecifier-uri-host-validation
Open

Reject non-URI-safe hosts in HostSpecifier#8459
bg0d-droid wants to merge 1 commit into
google:masterfrom
bg0d-droid:fix-hostspecifier-uri-host-validation

Conversation

@bg0d-droid

Copy link
Copy Markdown

This tightens HostSpecifier so hosts accepted through the IP-literal and domain-name paths remain suitable for use as URI hosts.

Changes:

  • Skip InetAddresses.forString for non-ASCII input so non-ASCII digit IP literals are not accepted and canonicalized.
  • Require accepted domain names to be ASCII and underscore-free.
  • Add JRE and Android regression coverage for full-width IPv4 digits and underscore domains.

Tests:

  • git diff --check
  • Targeted javac/java verification for guava/src/com/google/common/net/HostSpecifier.java
  • Targeted javac/java verification for android/guava/src/com/google/common/net/HostSpecifier.java
  • ./mvnw -Dmaven.repo.local=/tmp/guava-m2 -pl guava-tests -am -Dtest=HostSpecifierTest test did not complete locally because toolchain auto-install attempted to download Temurin JDK 26 and failed on a partial archive.

@google-cla

google-cla Bot commented Jun 2, 2026

Copy link
Copy Markdown

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@bg0d-droid bg0d-droid force-pushed the fix-hostspecifier-uri-host-validation branch from aa13fce to 58f6a14 Compare June 2, 2026 16:21
@bg0d-droid bg0d-droid force-pushed the fix-hostspecifier-uri-host-validation branch from 58f6a14 to bf0b546 Compare June 2, 2026 16:24
@cpovirk

cpovirk commented Jun 8, 2026

Copy link
Copy Markdown
Member

This is a case in which we might have a hard time finding the expertise to figure out what to do here. We have tests that expect both to be able to pass domains with underscores (seen in legacy Amazon S3 buckets, I'm reading) and to be able to pass non-ASCII characters (typically converted to Punycode automatically by browsers). We've also seen (non-HostSpecifier) code that tests with non-ASCII digits.

@bg0d-droid

Copy link
Copy Markdown
Author

Thanks, that makes sense.

My intent was not to change the broader behavior of InternetDomainName or InetAddresses. I was focusing on HostSpecifier because its class-level contract says the value is suitable for use in a URI, while these inputs can produce a parser differential with java.net.URI#getHost().

I agree that rejecting all non-ASCII domain names may be too broad if callers expect browser-like IDN handling. A narrower option could be to canonicalize IDNs through IDN.toASCII so HostSpecifier still returns a URI-safe host form instead of rejecting them outright.

The underscore case seems harder because java.net.URI does not treat foo_bar.com as a host at all, so preserving underscore compatibility conflicts with the current HostSpecifier URI-suitability wording. If preserving legacy underscore support is preferred, then maybe the safer change is documentation/API guidance rather than changing HostSpecifier.isValid behavior.

Which direction would you prefer for this PR: a narrower IDN canonicalization patch, documentation clarification, or keeping the stricter URI-host validation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants