-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Migrate TensorBoard to argparse #1240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This change introduces an API for launching TensorBoard via Python:
import tensorboard as tb
tb.program.launch(logdir='/tmp/mnist')
A TBLoader API has been introduced. This is a TBPlugin factory. It unifies our
patterns for instantiating and conditionally-loading plugins. It also allows
plugins to define their own CLI flags, without needing the FLAGS global.
Certain Python APIs needed to be broken. This should only impact plugin authors
in third party repositories, who are creating custom static builds of
TensorBoard. The tensorboard-plugin-example repository will be updated shortly
to demonstrate how to migrate, which should be straightforward.
| specified. Also logs an error message. | ||
| """ | ||
| if host is None: | ||
| host = FLAGS.host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're removing this default-handling logic, should we remove the parameter default values and update the docstring "Args" descriptions too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tensorboard/program.py
Outdated
| sys.stderr.flush() | ||
| server.serve_forever() | ||
| def _make_flags(plugin_loaders, **kwargs): | ||
| args = () if kwargs else sys.argv[1:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean kwargs is like an override of sys.argv[] values? A docstring would help since it's a little unclear what this function does otherwise.
Also, right now if you pass in kwargs, then args will be empty - shouldn't we skip the argument parsing stage entirely in that case and just return an argparse.Namespace consisting of kwargs? Otherwise, when we get down to calling parser.parse_args() on the empty args tuple and then call fix_flags(), if fix_flags() makes some flags required, we'll error out at that point. E.g. the --logdir or --db check would always fail there, unless I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's documented in the public functions. I don't write docs for private APIs. I think it's a good incentive to create less publicly maintained API surface area. Google style Python docstrings are also super burdensome to write.
Also, fixed, re: fix_flags being called after kwargs override.
tensorboard/program.py
Outdated
| defaults to doing a fresh parse. If kwargs is empty, sys.argv will | ||
| be taken into consideration. Otherwise, kwargs must be nonempty | ||
| for only Python-specified data structures to be used. | ||
| unused_argv: The unparsed arguments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call this "unparsed_argv" or something? It's a little confusing as "unused_argv" since while it is indeed unused within the function itself, that makes me wonder why we're passing it in at all. I had to go read the calling code to understand that it has to be here to absorb "leftover" args if they exist even though we don't use them.
Also, maybe we should add a check that there aren't any leftover arguments, just to catch mistakes if people e.g. forget to write --logdir in front of the logdir path itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't unfortunately, because these might end up being the name of the Python script itself, depending on how the program is invoked. It'd take more work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, couldn't we just add in a dummy first argument for the case where we aren't going via app.run()?
tensorboard/program.py
Outdated
|
|
||
|
|
||
| def launch(plugin_loaders=None, assets_zip_provider=None, **kwargs): | ||
| """Python API for launching TensorBoard. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So is this envisioned as a way to launch a TensorBoard thread within an existing binary? Does this have a use case today? A little more docstring would be helpful, especially on how this differs from main().
Also, could this share more logic with main()? E.g. the plugin_loader calls and the WSGI app construction are all pretty much the same.
The --inspect handling logic is different, but I wonder if there's a way we can avoid that structurally, since it's not really ideal to have to construct all the plugins in order to parse args, only to realize --inspect is there and then exit. Maybe we could do two passes - for example, create an ArgumentParser with just the very high-level TensorBoard args that affect whether it even launches a server (like --inspect) and then call parse_known_args() early, and then pipe the remaining args through for full parsing if we continue on the main execution path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to see if it'll work in colab.
Not special-casing inspect here would likely entail creating an API for plugins to define sub-commands for the TensorBoard command. I viewed that as mission creep for the purposes of this change.
tensorboard/program.py
Outdated
| plugin_loaders: A list of TBLoader plugin loader instances. If not | ||
| specified, defaults to first-party plugins. | ||
| assets_zip_provider: Delegates to TBContext or uses default if None. | ||
| flags: The argparse.Namespace object of CLI flags. This value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems safer to pass in the explicit flags by just passing a dict to flags instead of argparse.Namespace, versus absorbing all the extra kwargs automatically. Otherwise in the future if we change the arguments to this function, it could collide with user-specified flags-as-keyword-arguments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might lead to confusion later on with the context object, where flags are accessed using object.foo notation.
I like constraining flag names to the allowable Python symbol characters. So I thought of a better solution: roll these methods into a class. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the TensorBoard class idea overall. That said, the way flags get passed around still seems a little more complicated than necessary. What if we just added an extract_flags() or configure() method to the API like this?
# in run_main()
server = program.TensorBoard(loaders, assets)
unparsed_args = server.configure() # or server.configure(sys.argv) if we want to be explicit
if app is None:
sys.exit(server.main())
else:
app.run(server.main, unparsed_args)
# in colab
server = program.TensorBoard(loaders, assets)
server.configure(logdir=my_logdir, port=7007)
server.launch()Then configure() would basically be an exposed version of _get_flags(), like this:
def configure(self, argv=None, **kwargs):
args = () if kwargs else (argv or sys.argv)[1:]
parser = argparse.ArgumentParser(prog='tensorboard', description='TensorBoard is ...')
for loader in self.plugin_loaders:
loader.define_flags(parser)
self._flags, unparsed = parser.parse_known_args(args)
for k, v in kwargs.items():
if hasattr(self._flags, k):
raise ValueError('Unknown TensorBoard flag: %s' % k)
setattr(self._flags, k, v)
for loader in self.plugin_loaders:
loader.fix_flags(self._flags)
return unparsedThat way there's a single method on TensorBoard that deals with flags, and the code in run_main() can also be consolidated to use that logic rather than duplicating the plugin registration loop.
| Returns: | ||
| A DebuggerPlugin instance or None if it couldn't be loaded. | ||
| """ | ||
| # Check that not both grpc port flags are specified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment isn't quite right; that check actually happens in fix_flags() above. Here we're checking whether either one is specified (since if neither is present, we don't create the plugin).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved.
tensorboard/plugins/base_plugin.py
Outdated
| pass | ||
|
|
||
| def fix_flags(self, flags): | ||
| """Allows flag values to be corrected after parsing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe "corrected or validated" since some of the plugins also do validation that throws exceptions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tensorboard/main.py
Outdated
| for loader in loaders: | ||
| loader.define_flags(parser) | ||
| if app is None: | ||
| flags, unparsed = parser.parse_args() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think parse_args() only returns the args object, not a tuple - you have to configure an nargs argument if you want to collected the remaining args, or use parse_known_args().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not actually seeing this as fixed, unless I missed something?
tensorboard/main.py
Outdated
| if app is None: | ||
| sys.exit(main(unparsed)) | ||
| else: | ||
| app.run(main, sys.argv[:1] + unparsed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, would it be possible to do the app.run() call as soon as possible and then do our own argument parsing inside main()? For the internal case, it seems like best practice to call app.run() - and therefore InitGoogle - as soon as possible, and especially before we start calling the loaders which could be random plugin code doing stuff which might be affected by InitGoogle(), etc.
Also, if we do the argument parsing inside main(), maybe there's a way to reduce some of the duplication with program.main() - at least it seems like we could use the program.main() defaults for assets and plugin loaders here, and maybe the plugin flag handling logic could be more shared as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
app.run() will abort if it encounters unrecognized flags. They need to be removed beforehand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. I found this proposal to absl flags, it sounds like it would let us pass in our own flag parsing logic using argparse: abseil/abseil-py#27 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Abseil libraries are still very new. They don't currently offer API or ABI stability. Titus' ambition with the project is to popularize the "develop at head" model. I support what he's doing, but I'd like to wait and see before making it a required dependency.
| def load(self, context): | ||
| """Creates CorePlugin instance.""" | ||
| return CorePlugin(context) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: there's 1 blank line too many here (per google style)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
jart
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL
tensorboard/main.py
Outdated
| for loader in loaders: | ||
| loader.define_flags(parser) | ||
| if app is None: | ||
| flags, unparsed = parser.parse_args() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
tensorboard/main.py
Outdated
| if app is None: | ||
| sys.exit(main(unparsed)) | ||
| else: | ||
| app.run(main, sys.argv[:1] + unparsed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
app.run() will abort if it encounters unrecognized flags. They need to be removed beforehand.
tensorboard/plugins/base_plugin.py
Outdated
| pass | ||
|
|
||
| def fix_flags(self, flags): | ||
| """Allows flag values to be corrected after parsing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| def load(self, context): | ||
| """Creates CorePlugin instance.""" | ||
| return CorePlugin(context) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| Returns: | ||
| A DebuggerPlugin instance or None if it couldn't be loaded. | ||
| """ | ||
| # Check that not both grpc port flags are specified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved.
tensorboard/program.py
Outdated
| plugin_loaders: A list of TBLoader plugin loader instances. If not | ||
| specified, defaults to first-party plugins. | ||
| assets_zip_provider: Delegates to TBContext or uses default if None. | ||
| flags: The argparse.Namespace object of CLI flags. This value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might lead to confusion later on with the context object, where flags are accessed using object.foo notation.
I like constraining flag names to the allowable Python symbol characters. So I thought of a better solution: roll these methods into a class. What do you think?
tensorboard/program.py
Outdated
| defaults to doing a fresh parse. If kwargs is empty, sys.argv will | ||
| be taken into consideration. Otherwise, kwargs must be nonempty | ||
| for only Python-specified data structures to be used. | ||
| unused_argv: The unparsed arguments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't unfortunately, because these might end up being the name of the Python script itself, depending on how the program is invoked. It'd take more work.
tensorboard/program.py
Outdated
|
|
||
|
|
||
| def launch(plugin_loaders=None, assets_zip_provider=None, **kwargs): | ||
| """Python API for launching TensorBoard. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to see if it'll work in colab.
Not special-casing inspect here would likely entail creating an API for plugins to define sub-commands for the TensorBoard command. I viewed that as mission creep for the purposes of this change.
| specified. Also logs an error message. | ||
| """ | ||
| if host is None: | ||
| host = FLAGS.host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
tensorboard/program.py
Outdated
| sys.stderr.flush() | ||
| server.serve_forever() | ||
| def _make_flags(plugin_loaders, **kwargs): | ||
| args = () if kwargs else sys.argv[1:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's documented in the public functions. I don't write docs for private APIs. I think it's a good incentive to create less publicly maintained API surface area. Google style Python docstrings are also super burdensome to write.
Also, fixed, re: fix_flags being called after kwargs override.
tensorboard/program.py
Outdated
| plugin_loaders: A list of TBLoader plugin loader instances. If not | ||
| specified, defaults to first-party plugins. | ||
| assets_zip_provider: Delegates to TBContext or uses default if None. | ||
| flags: The argparse.Namespace object of CLI flags. This value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the TensorBoard class idea overall. That said, the way flags get passed around still seems a little more complicated than necessary. What if we just added an extract_flags() or configure() method to the API like this?
# in run_main()
server = program.TensorBoard(loaders, assets)
unparsed_args = server.configure() # or server.configure(sys.argv) if we want to be explicit
if app is None:
sys.exit(server.main())
else:
app.run(server.main, unparsed_args)
# in colab
server = program.TensorBoard(loaders, assets)
server.configure(logdir=my_logdir, port=7007)
server.launch()Then configure() would basically be an exposed version of _get_flags(), like this:
def configure(self, argv=None, **kwargs):
args = () if kwargs else (argv or sys.argv)[1:]
parser = argparse.ArgumentParser(prog='tensorboard', description='TensorBoard is ...')
for loader in self.plugin_loaders:
loader.define_flags(parser)
self._flags, unparsed = parser.parse_known_args(args)
for k, v in kwargs.items():
if hasattr(self._flags, k):
raise ValueError('Unknown TensorBoard flag: %s' % k)
setattr(self._flags, k, v)
for loader in self.plugin_loaders:
loader.fix_flags(self._flags)
return unparsedThat way there's a single method on TensorBoard that deals with flags, and the code in run_main() can also be consolidated to use that logic rather than duplicating the plugin registration loop.
tensorboard/program.py
Outdated
| defaults to doing a fresh parse. If kwargs is empty, sys.argv will | ||
| be taken into consideration. Otherwise, kwargs must be nonempty | ||
| for only Python-specified data structures to be used. | ||
| unused_argv: The unparsed arguments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, couldn't we just add in a dummy first argument for the case where we aren't going via app.run()?
tensorboard/program.py
Outdated
| socket.error: If a server could not be constructed with the host | ||
| and port specified. Also logs an error message. | ||
| :type plugin_loaders: list[base_plugin.TBLoader] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two lines are obsolete
tensorboard/program.py
Outdated
| Raises: | ||
| ValueError: If flag values are invalid. | ||
| :type plugin_loaders: list[base_plugin.TBLoader] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two lines are obsolete
tensorboard/program.py
Outdated
| parser = argparse.ArgumentParser() | ||
| for loader in self.plugin_loaders: | ||
| loader.define_flags(parser) | ||
| flags, unparsed = parser.parse_args(args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this also has to be parse_known_args(), right? Otherwise it won't return the unparsed flags?
tensorboard/main.py
Outdated
| for loader in loaders: | ||
| loader.define_flags(parser) | ||
| if app is None: | ||
| flags, unparsed = parser.parse_args() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not actually seeing this as fixed, unless I missed something?
|
PTAL. Good suggestion on the |
|
LGTM |
Summary: TensorBoard used to have an optional dependency on `absl-py`, so #1240 and #1409 were careful to only import `absl` if it was installed. But as of #1654 TensorBoard always requires `absl-py`, so we can promote the conditional import to a normal import. Test Plan: Running TensorBoard with no arguments still fails as expected; running with `--logdir` or `--version` still do the right things, too. wchargin-branch: absl-always
Summary: TensorBoard used to have an optional dependency on `absl-py`, so #1240 and #1409 were careful to only import `absl` if it was installed. But as of #1654 TensorBoard always requires `absl-py`, so we can promote the conditional import to a normal import. Test Plan: Running TensorBoard with no arguments still fails as expected; running with `--logdir` or `--version` still do the right things, too. wchargin-branch: absl-always
This change introduces an API for launching TensorBoard via Python:
A TBLoader API has been introduced. This is a TBPlugin factory. It unifies our
patterns for instantiating and conditionally-loading plugins. It also allows
plugins to define their own CLI flags, without needing the FLAGS global.
Certain Python APIs needed to be broken. This should only impact plugin authors
in third party repositories, who are creating custom static builds of
TensorBoard. The tensorboard-plugin-example repository will be updated shortly
to demonstrate how to migrate, which should be straightforward.