Skip to content

Conversation

@ethomson
Copy link
Member

@ethomson ethomson commented Jun 5, 2020

The notion of "global" state has become somewhat confused over time; global.c handles the libraries initialization and shutdown functions, state that's global to the library and thread-local data. This is an attempt to clean this up, breaking each of these notions out into its own separate section instead of keeping them comingled and simplifying the initialization and shutdown of the library by using our existing patterns in places that we had not done that.

  • git_libgit2_init sets up global state for a number of subsystems (it sets up the allocator functions, does hash initialization, etc). Most of these subsystems have their init functions set up in an init_callbacks array, and their init functions register shutdown callbacks. But not all; some have initialization scattered around the git_libgit2_init machinery. Move them all into the same init/shutdown pattern:

    • thread initialization
    • win32 debug stack
    • thread-local data (error state, etc)
    • user-configurable settings (user agent string, etc)
    • mwindow mutex
  • The "global state" (GIT_GLOBAL) is really thread-local state and contains things like the current thread's error data. Rename it accordingly, so that it's easier to reason about.

  • Moves the current thread's return data on Windows (for pthread_exit emulation) out of the shared TLS data structure, and managing it with the rest of the threading initialization/teardown routines.

  • Adds a "runtime" layer to handle the init/shutdown functions, so that we git_runtime_shutdown_register to register a shutdown handler with the runtime (instead of git__shutdown_register - the git__ name feels like an antipattern)

@ethomson ethomson force-pushed the ethomson/init branch 2 times, most recently from e283d93 to e90b4dc Compare June 5, 2020 09:13
Copy link
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a huge fan of these refactorings, thanks a lot for working on them!

@ethomson ethomson force-pushed the ethomson/init branch 2 times, most recently from 8c8e473 to dcb3b86 Compare July 11, 2020 12:15
@ethomson ethomson force-pushed the ethomson/init branch 10 times, most recently from 8b1e1ba to f266347 Compare July 12, 2020 11:48
extern size_t git_mwindow__window_size;
extern size_t git_mwindow__mapped_limit;
extern size_t git_indexer__max_objects;
extern bool git_disable_pack_keep_file_checks;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sometimes wonder if it would improve readability if those were labelled better, like e.g. calling them git_settings_mwindow_window_size. That'd make it much more obvious it's a global setting.

Anyway, definitely nothing we have to do as part of your PR.

src/libgit2.c Outdated
return git_runtime_shutdown();
}

/* Library settings */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving stuff like version or feature information definitely makes a ton of sense to me. But I'm not sure I'm a huge fan of moving settings over, too. I kind of get why you're doing it, but if you ask me we should instead work on clearly separating settings and make them more standalone. Like clearly identifying options with git_setting_user_agent, git_setting_ssl_ciphers etc.

WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with you here, and I agree that this is continuing to grow in scope. Let's tackle this in a followup PR?

@pks-t
Copy link
Member

pks-t commented Jul 12, 2020

@ethomson It's becoming quite hard to review due to the PRs growing scope. Would it be a huge burden to ask you to split up this PR into multiple ones to get those parts in more quickly? Fine with a "no" if you think the effort for this is too high.

@ethomson ethomson force-pushed the ethomson/init branch 2 times, most recently from 6f53beb to 6d7ab77 Compare July 12, 2020 20:18
@ethomson
Copy link
Member Author

@ethomson It's becoming quite hard to review due to the PRs growing scope. Would it be a huge burden to ask you to split up this PR into multiple ones to get those parts in more quickly? Fine with a "no" if you think the effort for this is too high.

Sure. I split out the renaming of crtdbg into #5580.

@ethomson ethomson force-pushed the ethomson/init branch 2 times, most recently from 6e41dd4 to 2381982 Compare July 12, 2020 21:15
@pks-t
Copy link
Member

pks-t commented Jul 13, 2020 via email

@ethomson
Copy link
Member Author

Fair enough. Is the PR currently in a state where it makes sense to do
another review?

Yes I think so

Instead of treating win32 thread initialization specially in the win32
git_libgit2_init function, add a git_global_threads_init function.
Move the MSVC C runtime debugging bits into the allocator's global init
function.
Move the settings global data teardown into its own separate function,
instead of intermingled with the global state.
Move the mwindow mutex into the mwindow code itself, initializing it in
the mwindow global initialization function instead of in the global
initializer.
Ensure that we can allocate the error message buffer.  In keeping with
our typical policiess, we allow (small) memory leaks in the case where
we're out of memory.
Our "global initialization" has accumulated some debris over the years.
It was previously responsible for both running the various global
initializers (that set up various subsystems) _and_ setting up the
"global state", which is actually the thread-local state for things
like error reporting.

Separate the thread local state out into "threadstate".  Use the normal
subsystem initialization functions that we already have to set it up.
This makes both the global initialization system and the threadstate
system simpler to reason about.
We were never properly testing git_thread_exit.  Do so.
We want to store a pointer to emulate `pthread_exit` on Windows.  Do
this within the threading infrastructure so that it could potentially be
re-used outside of the context of libgit2 itself.
Provide a mechanism for system components to register for initialization
and shutdown of the libgit2 runtime.
Now that we've identified that our global settings really aren't global
at all, and refactored the library to match that, change global.c to
libgit2.c, which is especially nice since the prefix of the functions
matches the filename.
@ethomson
Copy link
Member Author

I renamed the thread-local data that we carry around as part of our "global" state to "threadstate". Upon seeing #5602, it's clear that we should have a generic implementation of an internal TLS API so that (eg) checkout can use it and that "tlsdata" is extremely ambiguous in this regard.

@ethomson
Copy link
Member Author

Since this is largely an internal refactoring, and I think that I addressed all of the feedback, I'm going to merge this to unblock #5602.

@ethomson ethomson merged commit 20450cb into master Oct 14, 2020
@ethomson ethomson deleted the ethomson/init branch December 11, 2020 14:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants