-
Notifications
You must be signed in to change notification settings - Fork 2.6k
WIP: ENV-friendly libgit2 #4856
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
89ef767 to
b0e7e75
Compare
include/git2/repository.h
Outdated
| * `git_repository_open_with_opts`: | ||
| * | ||
| * - If `GIT_REPOSITORY_OPEN_FROM_ENV` is set, all other flags as well as the | ||
| * `ceiling_dirs` argument will be ignored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One might argue that in the future, we could potentially grow additional flags that will be honored even if FROM_ENV is set. In that case, we should explicitly list all flags that do get ignored. But yeah, this future is not there yet so I'd be fine to just keep and document behaviour like you have it right here
include/git2/repository.h
Outdated
| * When GIT_REPOSITORY_OPEN_FROM_ENV is set, this can be used to | ||
| * whitelist the environment variables we're allowed to access. | ||
| * | ||
| * An empty list means that we can use any of them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather say "we can use all of them if the above flag is set." to make it a tad clearer. Because I'm already confused whether any means "all" or "none" -- code below definitely says "none".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like I've confused myself too : a NULL whitelist means "any", an empty one means "none". I fail to see the use case for "empty" so I'm not sure if it makes sense. I'll fix & rephrase though.
src/repository.c
Outdated
|
|
||
| if (!start_path) { | ||
| error = git__getenv(&dir_buf, "GIT_DIR"); | ||
| error = git__getenv_wl(&dir_buf, "GIT_DIR", &opts->allowed_env); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this behave opposite to what the documentation says? This pointer will never be a NULL pointer, and in case where the user has not added any whitelisted entries then it will always return GIT_ENOTFOUND.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allowed_env is initialized by either the backward-compatibility code in git_repository_open_ext, or explicitly by the user via the new options struct, which defaults to NULL. So unless I'm missing something this should behave as expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Disregard that, I mistook pointers for non-pointers 🙄.
src/repository.c
Outdated
| flags |= GIT_REPOSITORY_OPEN_NO_SEARCH; | ||
| flags |= GIT_REPOSITORY_OPEN_NO_DOTGIT; | ||
| opts->flags |= GIT_REPOSITORY_OPEN_NO_SEARCH; | ||
| opts->flags |= GIT_REPOSITORY_OPEN_NO_DOTGIT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, it's a bit weird to modify the flags passed in by callers
include/git2/repository.h
Outdated
| * the search for a containing repository should terminate. | ||
| * | ||
| * FIXME: this depends on the value of GIT_PATH_LIST_SEPARATOR, which is | ||
| * platform-specific. Make that a git_strarray ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's perfectly fine to fix this API while we're at it. The old function could simply split the string it got into the string array
0dc1bec to
318b6c6
Compare
318b6c6 to
c755ad7
Compare
c755ad7 to
f8a4e2e
Compare
f8a4e2e to
18f308a
Compare
Previously, when `GIT_REPOSITORY_OPEN_FROM_ENV` was used, we'd end up calling `git_repository_open_ext` recursively with another set of flags and ceiling_dirs. This prevents us from supporting some environment variables (eg. GIT_CONFIG) because part of `git_repository_open_ext` would touch the config before we'd get a chance to tweak it. AFAICT `GIT_WORK_TREE` and `GIT_COMMON_DIR` might have encountered the same problem. Values from the environment are now stored in a struct, so those can applied as soon as possible while the repository is being setup.
The goal is to make the list of allowed envvars user-specified, so users can specify what we're allowed to use from the environment. This also freezes the current list uses when opening in `_open_ext`, for backward-compatibility reasons.
It seems the find-workdir-from-gitdir was broken. The first find_env will setup something, which the rest of the code will trample over.
LIBRARIES is the (absolute?) path to the library. LDFLAGS is the full linker stanza to correctly link with this lib. By passing LIBRARIES as LIBGIT_LIBS, the linker ends up with the absolute path for the SDK'ed version of CoreFoundation (which doesn't exist), instead of the familiar `-framework CoreFoundation`.
18f308a to
19bcdec
Compare
tiennou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebased, as I've wanted to try to polish it a little, and hit a snag. This is still WIP, but the current code really needs a refactor, and I don't think I can handle it without re-reading a whole lot of git manpages, so…
This should be green, but as I had to comment "innocuous looking" code to fix it, I had to look more closely at what was going on. I've left review comments about those issues, in case someone else has any insight. Note that there are a bunch of missing oom checks in there, which I didn't attempt to fix yet.
| git_buf_set(&env->gitdir_path, path.ptr, path.size); | ||
| GIT_ERROR_CHECK_ALLOC_BUF(&env->gitdir_path); | ||
|
|
||
| git_buf_attach(&env->gitlink_path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This reads a (potentially non-existing) $GIT_DIR/gitdir file in gitlink.
| git_path_dirname_r(&env->workdir_path, path.ptr); | ||
| git_path_to_dir(&env->workdir_path); | ||
| } | ||
| if (git_buf_oom(&env->workdir_path)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, we set workdir_path as whatever dirname(path) is.
| GIT_ERROR_CHECK_ALLOC(repo->commondir); | ||
| } | ||
|
|
||
| // if (git_buf_len(&env.workdir_path) > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added this, since it seemed to be provided/detected. This causes a crash further down, hence the commenting.
| GIT_ERROR_CHECK_ALLOC(repo->configfile); | ||
| } | ||
|
|
||
| if ((error = repo_is_worktree(&is_worktree, repo)) < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could have easily been part of repo_env_find, since it only look at the filesystem.
| if (config && | ||
| ((error = load_config_data(repo, config)) < 0 || | ||
| (error = load_workdir(repo, config, &workdir)) < 0)) | ||
| (error = load_workdir(repo, config, &env.workdir_path)) < 0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We move on to actually setup the working directory, at https://github.com/libgit2/libgit2/pull/4856/files#diff-7cbc97c802e21b199f176b493cec8e9cR300, and we pass the workdir identified by the "find" step. This is the part where the crash I'm getting comes from, and I'm starting to get completely lost (and I can't inline-review that code anyway, so there goes) :
core.worktreeis loaded from the config, as it can now be made use of.- if
repo->is_worktreeis true (it only performed filesystem-checks), we readrepo->gitdir+ "gitdir" (already loaded while finding inenv->gitlink),dirname()that, and use the result as our workdir. - if
core.worktreeis set, we check its value and use that as our workdir. - if the
parent_pathis passed, we check that it's a directory and use that. - finally, we
dirname(repo->gitdir)and use that.
It seems that core.worktree gets a say iif the repository structure is slightly "wrong" — a $GIT_DIR/gitdir file that doesn't point to a valid repo. The fact that it is passing as a parameter the very thing this code should be setting up doesn't help 😉.
This is a work-in-progress implementation of the design I proposed in #4815. For consistency, here's a summary :
The current
git_repository_open_extlacks the flexibility we need to protect users from changes to the API w.r.t environment variable we might start supporting. So, this addsgit_repository_open_with_opts, which supersedes it, and adds a user-configurable whitelist of what we're allowed to customize using the environment. This also modifies how we load values from the environment, so we can apply any of those allowed envvars while the repository is opened (mostly to supportGIT_CONFIG, as the previous implementation would have already loaded and used the config before we could override it). Opening bare repositories while using the environment is now available.I'm labeling it as WIP because I didn't write tests for the new behavior yet, as I have a few points worth discussing:
as per the last commit in the PR, I propose removing the
git_strarray ceiling_dirsin favor of agit_strarray. Since its value depends on the platform-specific path separator, I feel like we're doing a disservice to users by not handling that more transparently. Sincegit_repository_open_with_optsis brand new, now's the time to do it.I've discovered that
GIT_REPOSITORY_OPEN_FROM_ENV's documentation said this :This is now supported, which means I've changed how
git_repository_from_extbehaves for unsuspecting users, which was the whole point of the implementation (😁). I have no strong opinion on the matter, we just need to choose whether those envvars are used or not w.r.t to backward-compat.I tried to minimize changes to the "find_repo" function, but I haven't checked for work that might be done for nothing, in case the environment specifies everything. IMHO it would not be worthwhile.
Documentation-checking. From the top of my head, the semantics of what should happen when
ceiling_dirsis specified as both a parameter and its envvar form. I'm leaning toward using the more explicit value (ie. listening to the code vs. the environment) in that case.