Skip to content

Conversation

@jesseditson
Copy link

Fixes #148

@@ -1,7 +1,19 @@

var URL = require('url');
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the intent would be clearer to rather test if the global URL is set and require it otherwise?

Or we could also rely directly on it since it is available since node@10.

What do you think?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense - you'd probably know better than me if there are folks depending on this using ancient node versions, but as long as this came in a major bump IMO it'd be simplest to just use the global URL since it is available in all maintenance+ releases of node

}

var parts = URL.parse(url, null, true);
var parts = parseURL(url);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻

Also discovers & resolves a few URL related issues:
- WHATWG parses any potentially valid integer string as an IP - add explicit IP detection to avoid this case and preserve existing parsing functionality
- WHATWG does not fuzzy parse ipv6 addresses, so add simple regex-based ipv6 extraction to handle this case and preserve existing functionality
- WHATWG does not parse scheme-less URLs so change prefixing behavior to include a scheme when falling back to URL-based parsing
Copy link
Author

@jesseditson jesseditson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WHATWG URL parsing turns out to be quite subtly different. I did a pass on the unit tests to get it all passing but there are potentially some concessions here that are undesirable (mainly using some regexes to try to align with the legacy parsing behavior).

Comment on lines -92 to 82

var needsTrimming = checkTrimmingNeeded(url);
if (needsTrimming) {
url = url.trim();
}

var needsLowerCase = checkLowerCaseNeeded(url);
if (needsLowerCase) {
url = url.toLowerCase();
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff is a little funky, but I just moved the lowercase check to before the ipv6 check since the later IP checks expect a lowercased string.

Comment on lines -109 to +101
url = '//' + url;
url = 'https://' + url;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WHATWG URL does not support scheme-less URLs so I had to add a scheme to get it to parse.

Comment on lines +111 to +113
if (ipLikeRE.test(parts.hostname) && !ipLikeRE.test(value)) {
return value;
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WHATWG URL will parse the string http://42 as the URL http://0.0.0.42, which is not how the legacy URL parser worked (it would fail to parse). This just aligns them. A slight difference in this is that when value is not a string, we will no longer coerce it to one when returning it. I believe this is closer to the documented API but is technically a breaking change. LMK if you'd like me to restore the test and the string-coercion.


it('should return the initial value if it is not a valid hostname', function(){
expect(tld.extractHostname(42)).to.equal('42');
expect(tld.extractHostname(42)).to.equal(42);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above - "return the initial value" feels like it should return an integer, not a string. But this is a change in behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support environments other than node.js that have a global URL class

2 participants