Skip to content

Conversation

@tiennou
Copy link
Contributor

@tiennou tiennou commented Oct 21, 2018

This is a work-in-progress implementation of the design I proposed in #4815. For consistency, here's a summary :

The current git_repository_open_ext lacks the flexibility we need to protect users from changes to the API w.r.t environment variable we might start supporting. So, this adds git_repository_open_with_opts, which supersedes it, and adds a user-configurable whitelist of what we're allowed to customize using the environment. This also modifies how we load values from the environment, so we can apply any of those allowed envvars while the repository is opened (mostly to support GIT_CONFIG, as the previous implementation would have already loaded and used the config before we could override it). Opening bare repositories while using the environment is now available.

I'm labeling it as WIP because I didn't write tests for the new behavior yet, as I have a few points worth discussing:

  • as per the last commit in the PR, I propose removing the git_strarray ceiling_dirs in favor of a git_strarray. Since its value depends on the platform-specific path separator, I feel like we're doing a disservice to users by not handling that more transparently. Since git_repository_open_with_opts is brand new, now's the time to do it.

  • I've discovered that GIT_REPOSITORY_OPEN_FROM_ENV's documentation said this :

In the future, this flag will also cause git_repository_open_ext to
respect $GIT_WORK_TREE and $GIT_COMMON_DIR; currently,
git_repository_open_ext with this flag will error out if either
$GIT_WORK_TREE or $GIT_COMMON_DIR is set.

This is now supported, which means I've changed how git_repository_from_ext behaves for unsuspecting users, which was the whole point of the implementation (😁). I have no strong opinion on the matter, we just need to choose whether those envvars are used or not w.r.t to backward-compat.

  • I tried to minimize changes to the "find_repo" function, but I haven't checked for work that might be done for nothing, in case the environment specifies everything. IMHO it would not be worthwhile.

  • Documentation-checking. From the top of my head, the semantics of what should happen when ceiling_dirs is specified as both a parameter and its envvar form. I'm leaning toward using the more explicit value (ie. listening to the code vs. the environment) in that case.

@tiennou tiennou force-pushed the feature/repo-open-opts branch 6 times, most recently from 89ef767 to b0e7e75 Compare October 22, 2018 23:24
* `git_repository_open_with_opts`:
*
* - If `GIT_REPOSITORY_OPEN_FROM_ENV` is set, all other flags as well as the
* `ceiling_dirs` argument will be ignored.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One might argue that in the future, we could potentially grow additional flags that will be honored even if FROM_ENV is set. In that case, we should explicitly list all flags that do get ignored. But yeah, this future is not there yet so I'd be fine to just keep and document behaviour like you have it right here

* When GIT_REPOSITORY_OPEN_FROM_ENV is set, this can be used to
* whitelist the environment variables we're allowed to access.
*
* An empty list means that we can use any of them.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather say "we can use all of them if the above flag is set." to make it a tad clearer. Because I'm already confused whether any means "all" or "none" -- code below definitely says "none".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like I've confused myself too : a NULL whitelist means "any", an empty one means "none". I fail to see the use case for "empty" so I'm not sure if it makes sense. I'll fix & rephrase though.

src/repository.c Outdated

if (!start_path) {
error = git__getenv(&dir_buf, "GIT_DIR");
error = git__getenv_wl(&dir_buf, "GIT_DIR", &opts->allowed_env);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this behave opposite to what the documentation says? This pointer will never be a NULL pointer, and in case where the user has not added any whitelisted entries then it will always return GIT_ENOTFOUND.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allowed_env is initialized by either the backward-compatibility code in git_repository_open_ext, or explicitly by the user via the new options struct, which defaults to NULL. So unless I'm missing something this should behave as expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disregard that, I mistook pointers for non-pointers 🙄.

src/repository.c Outdated
flags |= GIT_REPOSITORY_OPEN_NO_SEARCH;
flags |= GIT_REPOSITORY_OPEN_NO_DOTGIT;
opts->flags |= GIT_REPOSITORY_OPEN_NO_SEARCH;
opts->flags |= GIT_REPOSITORY_OPEN_NO_DOTGIT;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, it's a bit weird to modify the flags passed in by callers

* the search for a containing repository should terminate.
*
* FIXME: this depends on the value of GIT_PATH_LIST_SEPARATOR, which is
* platform-specific. Make that a git_strarray ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's perfectly fine to fix this API while we're at it. The old function could simply split the string it got into the string array

@tiennou tiennou force-pushed the feature/repo-open-opts branch 2 times, most recently from 0dc1bec to 318b6c6 Compare January 3, 2019 22:41
@pks-t pks-t mentioned this pull request Feb 20, 2019
4 tasks
@tiennou tiennou force-pushed the feature/repo-open-opts branch from 318b6c6 to c755ad7 Compare May 2, 2019 00:14
@tiennou tiennou force-pushed the feature/repo-open-opts branch from c755ad7 to f8a4e2e Compare October 3, 2019 10:26
@tiennou tiennou force-pushed the feature/repo-open-opts branch from f8a4e2e to 18f308a Compare October 14, 2019 08:59
Previously, when `GIT_REPOSITORY_OPEN_FROM_ENV` was used, we'd end up
calling `git_repository_open_ext` recursively with another set of flags
and ceiling_dirs.

This prevents us from supporting some environment variables (eg. GIT_CONFIG)
because part of `git_repository_open_ext` would touch the config before
we'd get a chance to tweak it. AFAICT `GIT_WORK_TREE` and `GIT_COMMON_DIR`
might have encountered the same problem.

Values from the environment are now stored in a struct, so those can
applied as soon as possible while the repository is being setup.
The goal is to make the list of allowed envvars user-specified, so users
can specify what we're allowed to use from the environment.

This also freezes the current list uses when opening in `_open_ext`, for
backward-compatibility reasons.
It seems the find-workdir-from-gitdir was broken.
The first find_env will setup something, which the rest of the code
will trample over.
LIBRARIES is the (absolute?) path to the library.
LDFLAGS is the full linker stanza to correctly link with this lib.

By passing LIBRARIES as LIBGIT_LIBS, the linker ends up with the
absolute path for the SDK'ed version of CoreFoundation (which doesn't
exist), instead of the familiar `-framework CoreFoundation`.
@tiennou tiennou force-pushed the feature/repo-open-opts branch from 18f308a to 19bcdec Compare October 14, 2019 08:59
Copy link
Contributor Author

@tiennou tiennou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased, as I've wanted to try to polish it a little, and hit a snag. This is still WIP, but the current code really needs a refactor, and I don't think I can handle it without re-reading a whole lot of git manpages, so…

This should be green, but as I had to comment "innocuous looking" code to fix it, I had to look more closely at what was going on. I've left review comments about those issues, in case someone else has any insight. Note that there are a bunch of missing oom checks in there, which I didn't attempt to fix yet.

git_buf_set(&env->gitdir_path, path.ptr, path.size);
GIT_ERROR_CHECK_ALLOC_BUF(&env->gitdir_path);

git_buf_attach(&env->gitlink_path,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reads a (potentially non-existing) $GIT_DIR/gitdir file in gitlink.

git_path_dirname_r(&env->workdir_path, path.ptr);
git_path_to_dir(&env->workdir_path);
}
if (git_buf_oom(&env->workdir_path)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we set workdir_path as whatever dirname(path) is.

GIT_ERROR_CHECK_ALLOC(repo->commondir);
}

// if (git_buf_len(&env.workdir_path) > 0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this, since it seemed to be provided/detected. This causes a crash further down, hence the commenting.

GIT_ERROR_CHECK_ALLOC(repo->configfile);
}

if ((error = repo_is_worktree(&is_worktree, repo)) < 0)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function.

This could have easily been part of repo_env_find, since it only look at the filesystem.

if (config &&
((error = load_config_data(repo, config)) < 0 ||
(error = load_workdir(repo, config, &workdir)) < 0))
(error = load_workdir(repo, config, &env.workdir_path)) < 0))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We move on to actually setup the working directory, at https://github.com/libgit2/libgit2/pull/4856/files#diff-7cbc97c802e21b199f176b493cec8e9cR300, and we pass the workdir identified by the "find" step. This is the part where the crash I'm getting comes from, and I'm starting to get completely lost (and I can't inline-review that code anyway, so there goes) :

  • core.worktree is loaded from the config, as it can now be made use of.
  • if repo->is_worktree is true (it only performed filesystem-checks), we read repo->gitdir + "gitdir" (already loaded while finding in env->gitlink), dirname() that, and use the result as our workdir.
  • if core.worktree is set, we check its value and use that as our workdir.
  • if the parent_path is passed, we check that it's a directory and use that.
  • finally, we dirname(repo->gitdir) and use that.

It seems that core.worktree gets a say iif the repository structure is slightly "wrong" — a $GIT_DIR/gitdir file that doesn't point to a valid repo. The fact that it is passing as a parameter the very thing this code should be setting up doesn't help 😉.

Base automatically changed from master to main January 7, 2021 10:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants