It's reasonably well known that POSIX has fairly mad path traversal semantics, mostly due to the existence of symlinks. For example, to open("foo/../bar") the kernel can't just open("bar"), because foo might be a symlink.

But I only know that from various articles on the internet. There are other ambiguities too, e.g. what if foo is an ordinary file? Is open("foo/../bar") allowed then?

What is the official algorithm for implementing path traversal in POSIX?

1 Answer

Perhaps unsurprisingly, it's in the POSIX standard, section 4.16 Pathname Resolution. Here is the most interesting part. There is also this path resolution man page.

A pathname that contains at least one non-<slash> character and that ends with one or more trailing <slash> characters shall not be resolved successfully unless the last pathname component before the trailing <slash> characters resolves (with symbolic links followed - see below) to an existing directory or a directory entry that is to be created for a directory immediately after the pathname is resolved. Interfaces using pathname resolution may specify additional constraints when a pathname that does not name an existing directory contains at least one non-<slash> character and contains one or more trailing <slash> characters.
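To make the trailing-slash rule concrete, and to touch on the question's "what if foo is an ordinary file" case, here is a minimal sketch that probes a few paths with os.stat and reports which errno comes back. The helper names (errno_name_for, trailing_slash_demo) and the file layout are made up for illustration. On a typical Linux system I'd expect both "foo/" and "foo/../bar" to fail with ENOTDIR when foo is a regular file; the second case follows from the general requirement that every non-final component be a directory rather than from the paragraph quoted above.

```python
import errno
import os
import tempfile


def errno_name_for(path):
    """Return the symbolic errno raised by os.stat(path), or None on success."""
    try:
        os.stat(path)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None


def trailing_slash_demo():
    """Probe trailing-slash and non-directory-component paths in a scratch dir."""
    with tempfile.TemporaryDirectory() as root:
        file_path = os.path.join(root, "foo")   # ordinary file
        dir_path = os.path.join(root, "dir")    # real directory
        bar_path = os.path.join(root, "bar")    # sibling file that does exist
        for p in (file_path, bar_path):
            with open(p, "w"):
                pass
        os.mkdir(dir_path)
        return {
            "dir/": errno_name_for(dir_path + "/"),
            "foo/": errno_name_for(file_path + "/"),
            # bar exists, but foo is not a directory, so traversal through it fails.
            "foo/../bar": errno_name_for(file_path + "/../bar"),
        }


if __name__ == "__main__":
    for path, result in trailing_slash_demo().items():
        print(f"{path!r}: {result or 'ok'}")
```

So, at least in practice, open("foo/../bar") is not allowed when foo is an ordinary file, even though bar itself exists.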

If a symbolic link is encountered during pathname resolution, the behavior shall depend on whether the pathname component is at the end of the pathname and on the function being performed. If all of the following are true, then pathname resolution is complete:

1. This is the last pathname component of the pathname.
2. The pathname has no trailing <slash>.
3. The function is required to act on the symbolic link itself, or certain arguments direct that the function act on the symbolic link itself.

In all other cases, the system shall prefix the remaining pathname, if any, with the contents of the symbolic link, except that if the contents of the symbolic link is the empty string, then either pathname resolution shall fail with functions reporting an [ENOENT] error and utilities writing an equivalent diagnostic message, or the pathname of the directory containing the symbolic link shall be used in place of the contents of the symbolic link. If the contents of the symbolic link consist solely of <slash> characters, then all leading <slash> characters of the remaining pathname shall be omitted from the resulting combined pathname, leaving only the leading <slash> characters from the symbolic link contents. In the cases where prefixing occurs, if the combined length exceeds {PATH_MAX}, and the implementation considers this to be an error, pathname resolution shall fail with functions reporting an [ENAMETOOLONG] error and utilities writing an equivalent diagnostic message. Otherwise, the resolved pathname shall be the resolution of the pathname just created. If the resulting pathname does not begin with a <slash>, the predecessor of the first filename of the pathname is taken to be the directory containing the symbolic link.
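The symlink rules above can be observed from user space. The following is a rough sketch, not a reimplementation of the algorithm: it builds a scratch tree in which foo is a relative symlink to other/real, then checks where "foo/../bar" ends up and how lstat treats "foo" versus "foo/". The function name symlink_demo and the directory layout are assumptions chosen for the example.

```python
import os
import stat
import tempfile


def symlink_demo():
    """Show that '..' is applied after the symlink is expanded, not before."""
    with tempfile.TemporaryDirectory() as root:
        # Layout:
        #   root/bar            contains "bar"
        #   root/other/bar      contains "other/bar"
        #   root/other/real/    directory
        #   root/foo -> other/real   (relative target, resolved against root)
        os.makedirs(os.path.join(root, "other", "real"))
        for name in ("bar", "other/bar"):
            with open(os.path.join(root, name), "w") as f:
                f.write(name)
        link = os.path.join(root, "foo")
        os.symlink("other/real", link)

        # foo/../bar becomes other/real/../bar, i.e. other/bar, not root/bar.
        with open(link + "/../bar") as f:
            via_link = f.read()

        return {
            "foo/../bar": via_link,
            # Last component, no trailing slash, lstat acts on the link itself.
            "lstat(foo) is symlink": stat.S_ISLNK(os.lstat(link).st_mode),
            # A trailing slash forces the link to be followed.
            "lstat(foo/) is dir": stat.S_ISDIR(os.lstat(link + "/").st_mode),
        }


if __name__ == "__main__":
    for key, value in symlink_demo().items():
        print(f"{key}: {value}")
```

This is why the kernel cannot simplify "foo/../bar" to "bar" textually: the ".." is applied to wherever foo points, which is only known once the link has been read.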

If the system detects a loop in the pathname resolution process, pathname resolution shall fail with functions reporting an [ELOOP] error and utilities writing an equivalent diagnostic message. The same may happen if during the resolution process more symbolic links were followed than the implementation allows. This implementation-defined limit shall not be smaller than {SYMLOOP_MAX}.
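The loop case is easy to trigger deliberately. As a small sketch (the function name loop_demo and the link names a and b are invented for the example), two symlinks that point at each other should make any call that follows them fail with ELOOP on a POSIX system:

```python
import errno
import os
import tempfile


def loop_demo():
    """Create a -> b and b -> a, then return the errno from following the loop."""
    with tempfile.TemporaryDirectory() as root:
        a = os.path.join(root, "a")
        b = os.path.join(root, "b")
        os.symlink("b", a)
        os.symlink("a", b)
        try:
            os.stat(a)  # stat follows symlinks, so it runs into the cycle
        except OSError as exc:
            return exc.errno
        return None


if __name__ == "__main__":
    code = loop_demo()
    print(errno.errorcode.get(code, code))
```

Whether the system actually detects the cycle or simply gives up after following too many links is not visible from the outside; both outcomes surface as the same error.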
