Skip to content

Conversation

@wkerckho
Copy link
Contributor

📁 Related issues

#1184

✍️ Description

Modified encodeUriPathComponents and decodeUriPathComponents in PathUtil.ts so that url encoded path separator characters (/, \) no longer erroneously impact the file path that is targeted when using a file backend. The implementation uses two new helper functions escapeCodecExceptionsForEncoding and escapeCodecExceptionsForDecoding that prepare a path part for encoding and decoding respectively, by replacing all instances of the matching encoded file separator characters with a character or character sequence that results in the follow-up encode/decode to have no effect. The code can be extended with additional path separator characters (or other exceptional characters) to ignore during encoding/decoding, by adding the respective characters to the codecExceptions constant initializer.

As requested, I've extended the integration tests with FileBackendEncodedSlashHandling.test.ts, which checks the operations that could not be executed successfully before this fix.

✅ PR check list

Before this pull request can be merged, a core maintainer will check whether

  • this PR is labeled with the correct semver label
    • semver.patch: Backwards compatible bug fixes.
    • semver.minor: Backwards compatible feature additions.
    • semver.major: Breaking changes. This includes changing interfaces or configuration behaviour.
  • the correct branch is targeted. Patch updates can target main, other changes should target the latest versions/* branch.
  • the RELEASE_NOTES.md document in case of relevant feature or config changes.
  • any relevant documentation was updated to reflect the changes in this PR.

@wkerckho wkerckho added the semver.patch Does not require a minor or major version bump label Apr 14, 2022
@wkerckho wkerckho requested a review from joachimvh April 14, 2022 14:34
@RubenVerborgh RubenVerborgh self-requested a review April 14, 2022 15:07
Copy link
Member

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work!
A couple of nitpicks (given how complicated this particular escaping is).

import { BadRequestHttpError } from './errors/BadRequestHttpError';

// Characters (mapped to their encoded representation) to ignore when URL decoding the URI path to a file path.
const codecExceptions = new Map([ '/', '\\' ].map((item): [string, string] => [ item, encodeURIComponent(item) ]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's name this pathComponentDelimiters (as per solid/specification#381 (comment)) and use a regular JavaScript object, which are the more straightforward option when keys are strings (or symbols). We can possibly use object.fromEntries.

* @param path - The path to apply the escaping on.
* @returns A copy of the input path that is safe to apply URL decoding on.
*/
function escapeCodecExceptionsForDecoding(path: string): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function escapeCodecExceptionsForDecoding(path: string): string {
function preventDelimiterDecoding(pathComponent: string): string {

Comment on lines 22 to 24
codecExceptions.forEach((value, key): void => {
path = path.replace(codecExceptionsRegexCache.get(key)!, `%25${value.slice(1)}`);
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A (double) loop is expensive here; I think we should just go for

Suggested change
codecExceptions.forEach((value, key): void => {
path = path.replace(codecExceptionsRegexCache.get(key)!, `%25${value.slice(1)}`);
});
return pathComponent.replace(/\/|\\/gu, encodeComponentDelimiter);

with

const encodedDelimiters = { '/': '%25%2f', '\\': '%25%5c' };
function encodeComponentDelimiter(delimiter: string) {
  return encodedDelimiters[delimiter];
}

* @param path - The path to apply the escaping on.
* @returns A copy of the input path that is safe to apply URL encoding on.
*/
function escapeCodecExceptionsForEncoding(path: string): string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
function escapeCodecExceptionsForEncoding(path: string): string {
function preventDelimiterEncoding(pathComponent: string): string {

Copy link
Member

@joachimvh joachimvh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not much to add besides Ruben's comment except I would extend the integration tests more to really make sure there are no potential issues here.

@@ -115,14 +149,16 @@ export function toCanonicalUriPath(path: string): string {
* Decodes all components of a URI path.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have some more comments here (and in encodeUriPathComponents) to specify exactly that we use these functions to convert between URIs and file paths.

@@ -0,0 +1,54 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same config as server-memory.json but with a file backend right? Would just call it server-file.json then.

await app.stop();
});

it('can put a document for which the URI path contains url encoded separator characters.', async(): Promise<void> => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a check here that the resource is not accessible through ${baseUrl}/c1/c2/t1/. Also check that the new resource is not a container by doing a GET and checking the headers.

expect(res.headers.get('location')).toBe(`${baseUrl}${slug}`);
});

it('prevents accessing a document via a different identifier that results in the same path after url decoding.',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would also add the reverse check just to be sure.


const rootFilePath = getTestFolder('file-backend-encoded-slash-handling');

describe('A server with a file backend storage', (): void => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For each test I would add an additional check that looks at the file name on disk since we know what it should be and where it should be located.

@wkerckho
Copy link
Contributor Author

I've updated my changes based on your feedback! (Sorry for the delay)

const pathComponentDelimiters = [ '/', '\\' ];
// Regex to find all instances of encoded path component delimiters.
const encodedDelimiterRegex = new RegExp(
`${pathComponentDelimiters.map((item): string => encodeURIComponent(item)).join('|')}`, 'giu',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd do

Suggested change
`${pathComponentDelimiters.map((item): string => encodeURIComponent(item)).join('|')}`, 'giu',
`${pathComponentDelimiters.map((delimiter): string => encodeURIComponent(item)).join('|')}`, 'giu',

throughout (especially with the arrays below at 17 and 22) but that's a crazy nit

@wkerckho
Copy link
Contributor Author

Renamed item argument to delimiter in the sections Ruben mentioned.
Rebased from latest main (now that v4 has been merged).

@RubenVerborgh
Copy link
Member

Super-clean now. Love it ❤️

@joachimvh joachimvh merged commit dbdb9b4 into main Apr 21, 2022
@joachimvh joachimvh deleted the fix/issue-1184 branch April 21, 2022 13:17
@bourgeoa
Copy link
Contributor

If I understand well, the above code only avoid %conversion of %2F

Trying the same concept on NSS. I stumble on the following issue : I run tests with the 2 different %encoded slash %2f and %2F and both where needed. They both %decode to /

The except encode/decode function should consider both cases.
I am not sure but it seems that you must also check for double %encode/decode because decoding %252F --> %2F and this is not %re-encoded.

@RubenVerborgh
Copy link
Member

I run tests with the 2 different %encoded slash %2f and %2F and both where needed.

Clever—lemme fix that indeed!

@bourgeoa
Copy link
Contributor

Here is my PR nodeSolidServer/node-solid-server#1690 can you have a look

@RubenVerborgh
Copy link
Member

@bourgeoa That's quite cool; I learnt a clever trick there!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

semver.patch Does not require a minor or major version bump

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants