Migrate TensorBoard to argparse #1240

jart · 2018-06-12T04:21:09Z

This change introduces an API for launching TensorBoard via Python:

import tensorboard as tb
tb.program.launch(logdir='/tmp/mnist')

A TBLoader API has been introduced. This is a TBPlugin factory. It unifies our
patterns for instantiating and conditionally-loading plugins. It also allows
plugins to define their own CLI flags, without needing the FLAGS global.

Certain Python APIs needed to be broken. This should only impact plugin authors
in third party repositories, who are creating custom static builds of
TensorBoard. The tensorboard-plugin-example repository will be updated shortly
to demonstrate how to migrate, which should be straightforward.

This change introduces an API for launching TensorBoard via Python: import tensorboard as tb tb.program.launch(logdir='/tmp/mnist') A TBLoader API has been introduced. This is a TBPlugin factory. It unifies our patterns for instantiating and conditionally-loading plugins. It also allows plugins to define their own CLI flags, without needing the FLAGS global. Certain Python APIs needed to be broken. This should only impact plugin authors in third party repositories, who are creating custom static builds of TensorBoard. The tensorboard-plugin-example repository will be updated shortly to demonstrate how to migrate, which should be straightforward.

nfelt · 2018-06-18T23:34:39Z

tensorboard/program.py

      specified. Also logs an error message.
  """
-  if host is None:
-    host = FLAGS.host


If we're removing this default-handling logic, should we remove the parameter default values and update the docstring "Args" descriptions too?

nfelt · 2018-06-18T23:35:57Z

tensorboard/program.py

-  sys.stderr.flush()
-  server.serve_forever()
+def _make_flags(plugin_loaders, **kwargs):
+  args = () if kwargs else sys.argv[1:]


Does this mean kwargs is like an override of sys.argv[] values? A docstring would help since it's a little unclear what this function does otherwise.

Also, right now if you pass in kwargs, then args will be empty - shouldn't we skip the argument parsing stage entirely in that case and just return an argparse.Namespace consisting of kwargs? Otherwise, when we get down to calling parser.parse_args() on the empty args tuple and then call fix_flags(), if fix_flags() makes some flags required, we'll error out at that point. E.g. the --logdir or --db check would always fail there, unless I'm missing something?

It's documented in the public functions. I don't write docs for private APIs. I think it's a good incentive to create less publicly maintained API surface area. Google style Python docstrings are also super burdensome to write.

Also, fixed, re: fix_flags being called after kwargs override.

nfelt · 2018-06-18T23:38:33Z

tensorboard/program.py

+      defaults to doing a fresh parse. If kwargs is empty, sys.argv will
+      be taken into consideration. Otherwise, kwargs must be nonempty
+      for only Python-specified data structures to be used.
+    unused_argv: The unparsed arguments.


Maybe call this "unparsed_argv" or something? It's a little confusing as "unused_argv" since while it is indeed unused within the function itself, that makes me wonder why we're passing it in at all. I had to go read the calling code to understand that it has to be here to absorb "leftover" args if they exist even though we don't use them.

Also, maybe we should add a check that there aren't any leftover arguments, just to catch mistakes if people e.g. forget to write --logdir in front of the logdir path itself?

I can't unfortunately, because these might end up being the name of the Python script itself, depending on how the program is invoked. It'd take more work.

Hmm, couldn't we just add in a dummy first argument for the case where we aren't going via app.run()?

nfelt · 2018-06-18T23:46:25Z

tensorboard/program.py

+
+
+def launch(plugin_loaders=None, assets_zip_provider=None, **kwargs):
+  """Python API for launching TensorBoard.


So is this envisioned as a way to launch a TensorBoard thread within an existing binary? Does this have a use case today? A little more docstring would be helpful, especially on how this differs from main().

Also, could this share more logic with main()? E.g. the plugin_loader calls and the WSGI app construction are all pretty much the same.

The --inspect handling logic is different, but I wonder if there's a way we can avoid that structurally, since it's not really ideal to have to construct all the plugins in order to parse args, only to realize --inspect is there and then exit. Maybe we could do two passes - for example, create an ArgumentParser with just the very high-level TensorBoard args that affect whether it even launches a server (like --inspect) and then call parse_known_args() early, and then pipe the remaining args through for full parsing if we continue on the main execution path.

I want to see if it'll work in colab.

Not special-casing inspect here would likely entail creating an API for plugins to define sub-commands for the TensorBoard command. I viewed that as mission creep for the purposes of this change.

nfelt · 2018-06-18T23:52:43Z

tensorboard/program.py

+    plugin_loaders: A list of TBLoader plugin loader instances. If not
+      specified, defaults to first-party plugins.
    assets_zip_provider: Delegates to TBContext or uses default if None.
+    flags: The argparse.Namespace object of CLI flags. This value


It seems safer to pass in the explicit flags by just passing a dict to flags instead of argparse.Namespace, versus absorbing all the extra kwargs automatically. Otherwise in the future if we change the arguments to this function, it could collide with user-specified flags-as-keyword-arguments.

That might lead to confusion later on with the context object, where flags are accessed using object.foo notation.

I like constraining flag names to the allowable Python symbol characters. So I thought of a better solution: roll these methods into a class. What do you think?

I like the TensorBoard class idea overall. That said, the way flags get passed around still seems a little more complicated than necessary. What if we just added an extract_flags() or configure() method to the API like this?

# in run_main() server = program.TensorBoard(loaders, assets) unparsed_args = server.configure() # or server.configure(sys.argv) if we want to be explicit if app is None: sys.exit(server.main()) else: app.run(server.main, unparsed_args) # in colab server = program.TensorBoard(loaders, assets) server.configure(logdir=my_logdir, port=7007) server.launch()

Then configure() would basically be an exposed version of _get_flags(), like this:

def configure(self, argv=None, **kwargs): args = () if kwargs else (argv or sys.argv)[1:] parser = argparse.ArgumentParser(prog='tensorboard', description='TensorBoard is ...') for loader in self.plugin_loaders: loader.define_flags(parser) self._flags, unparsed = parser.parse_known_args(args) for k, v in kwargs.items(): if hasattr(self._flags, k): raise ValueError('Unknown TensorBoard flag: %s' % k) setattr(self._flags, k, v) for loader in self.plugin_loaders: loader.fix_flags(self._flags) return unparsed

That way there's a single method on TensorBoard that deals with flags, and the code in run_main() can also be consolidated to use that logic rather than duplicating the plugin registration loop.

nfelt · 2018-06-19T00:05:02Z

tensorboard/plugins/debugger/debugger_plugin_loader.py

+    Returns:
+      A DebuggerPlugin instance or None if it couldn't be loaded.
+    """
+    # Check that not both grpc port flags are specified.


This comment isn't quite right; that check actually happens in fix_flags() above. Here we're checking whether either one is specified (since if neither is present, we don't create the plugin).

nfelt · 2018-06-19T00:17:47Z

tensorboard/plugins/base_plugin.py

+    pass
+
+  def fix_flags(self, flags):
+    """Allows flag values to be corrected after parsing.


Maybe "corrected or validated" since some of the plugins also do validation that throws exceptions?

nfelt · 2018-06-19T00:38:05Z

tensorboard/main.py

+  for loader in loaders:
+    loader.define_flags(parser)
+  if app is None:
+    flags, unparsed = parser.parse_args()


I think parse_args() only returns the args object, not a tuple - you have to configure an nargs argument if you want to collected the remaining args, or use parse_known_args().

I'm not actually seeing this as fixed, unless I missed something?

nfelt · 2018-06-19T00:44:42Z

tensorboard/main.py

+  if app is None:
+    sys.exit(main(unparsed))
+  else:
+    app.run(main, sys.argv[:1] + unparsed)


Hmm, would it be possible to do the app.run() call as soon as possible and then do our own argument parsing inside main()? For the internal case, it seems like best practice to call app.run() - and therefore InitGoogle - as soon as possible, and especially before we start calling the loaders which could be random plugin code doing stuff which might be affected by InitGoogle(), etc.

Also, if we do the argument parsing inside main(), maybe there's a way to reduce some of the duplication with program.main() - at least it seems like we could use the program.main() defaults for assets and plugin loaders here, and maybe the plugin flag handling logic could be more shared as well?

app.run() will abort if it encounters unrecognized flags. They need to be removed beforehand.

Ah, I see. I found this proposal to absl flags, it sounds like it would let us pass in our own flag parsing logic using argparse: abseil/abseil-py#27 (comment)

The Abseil libraries are still very new. They don't currently offer API or ABI stability. Titus' ambition with the project is to popularize the "develop at head" model. I support what he's doing, but I'd like to wait and see before making it a required dependency.

nfelt · 2018-06-19T17:30:01Z

tensorboard/plugins/core/core_plugin.py

+  def load(self, context):
+    """Creates CorePlugin instance."""
+    return CorePlugin(context)
+


nit: there's 1 blank line too many here (per google style)

jart

PTAL

jart · 2018-06-19T17:45:10Z

tensorboard/main.py

+  for loader in loaders:
+    loader.define_flags(parser)
+  if app is None:
+    flags, unparsed = parser.parse_args()


jart · 2018-06-19T17:48:58Z

tensorboard/main.py

+  if app is None:
+    sys.exit(main(unparsed))
+  else:
+    app.run(main, sys.argv[:1] + unparsed)


app.run() will abort if it encounters unrecognized flags. They need to be removed beforehand.

jart · 2018-06-19T17:50:37Z

tensorboard/plugins/base_plugin.py

+    pass
+
+  def fix_flags(self, flags):
+    """Allows flag values to be corrected after parsing.


jart · 2018-06-19T17:51:10Z

tensorboard/plugins/core/core_plugin.py

+  def load(self, context):
+    """Creates CorePlugin instance."""
+    return CorePlugin(context)
+


jart · 2018-06-19T17:52:13Z

tensorboard/plugins/debugger/debugger_plugin_loader.py

+    Returns:
+      A DebuggerPlugin instance or None if it couldn't be loaded.
+    """
+    # Check that not both grpc port flags are specified.


jart · 2018-06-19T18:17:00Z

tensorboard/program.py

+    plugin_loaders: A list of TBLoader plugin loader instances. If not
+      specified, defaults to first-party plugins.
    assets_zip_provider: Delegates to TBContext or uses default if None.
+    flags: The argparse.Namespace object of CLI flags. This value


That might lead to confusion later on with the context object, where flags are accessed using object.foo notation.

I like constraining flag names to the allowable Python symbol characters. So I thought of a better solution: roll these methods into a class. What do you think?

jart · 2018-06-19T18:18:29Z

tensorboard/program.py

+      defaults to doing a fresh parse. If kwargs is empty, sys.argv will
+      be taken into consideration. Otherwise, kwargs must be nonempty
+      for only Python-specified data structures to be used.
+    unused_argv: The unparsed arguments.


I can't unfortunately, because these might end up being the name of the Python script itself, depending on how the program is invoked. It'd take more work.

jart · 2018-06-19T18:20:08Z

tensorboard/program.py

+
+
+def launch(plugin_loaders=None, assets_zip_provider=None, **kwargs):
+  """Python API for launching TensorBoard.


I want to see if it'll work in colab.

Not special-casing inspect here would likely entail creating an API for plugins to define sub-commands for the TensorBoard command. I viewed that as mission creep for the purposes of this change.

jart · 2018-06-19T18:43:15Z

tensorboard/program.py

      specified. Also logs an error message.
  """
-  if host is None:
-    host = FLAGS.host


jart · 2018-06-19T18:44:21Z

tensorboard/program.py

-  sys.stderr.flush()
-  server.serve_forever()
+def _make_flags(plugin_loaders, **kwargs):
+  args = () if kwargs else sys.argv[1:]


It's documented in the public functions. I don't write docs for private APIs. I think it's a good incentive to create less publicly maintained API surface area. Google style Python docstrings are also super burdensome to write.

Also, fixed, re: fix_flags being called after kwargs override.

nfelt · 2018-06-19T19:39:21Z

tensorboard/program.py

+    plugin_loaders: A list of TBLoader plugin loader instances. If not
+      specified, defaults to first-party plugins.
    assets_zip_provider: Delegates to TBContext or uses default if None.
+    flags: The argparse.Namespace object of CLI flags. This value


I like the TensorBoard class idea overall. That said, the way flags get passed around still seems a little more complicated than necessary. What if we just added an extract_flags() or configure() method to the API like this?

# in run_main() server = program.TensorBoard(loaders, assets) unparsed_args = server.configure() # or server.configure(sys.argv) if we want to be explicit if app is None: sys.exit(server.main()) else: app.run(server.main, unparsed_args) # in colab server = program.TensorBoard(loaders, assets) server.configure(logdir=my_logdir, port=7007) server.launch()

Then configure() would basically be an exposed version of _get_flags(), like this:

def configure(self, argv=None, **kwargs): args = () if kwargs else (argv or sys.argv)[1:] parser = argparse.ArgumentParser(prog='tensorboard', description='TensorBoard is ...') for loader in self.plugin_loaders: loader.define_flags(parser) self._flags, unparsed = parser.parse_known_args(args) for k, v in kwargs.items(): if hasattr(self._flags, k): raise ValueError('Unknown TensorBoard flag: %s' % k) setattr(self._flags, k, v) for loader in self.plugin_loaders: loader.fix_flags(self._flags) return unparsed

That way there's a single method on TensorBoard that deals with flags, and the code in run_main() can also be consolidated to use that logic rather than duplicating the plugin registration loop.

nfelt · 2018-06-19T19:40:09Z

tensorboard/program.py

+      defaults to doing a fresh parse. If kwargs is empty, sys.argv will
+      be taken into consideration. Otherwise, kwargs must be nonempty
+      for only Python-specified data structures to be used.
+    unused_argv: The unparsed arguments.


Hmm, couldn't we just add in a dummy first argument for the case where we aren't going via app.run()?

nfelt · 2018-06-19T21:48:23Z

tensorboard/program.py

+      socket.error: If a server could not be constructed with the host
+        and port specified. Also logs an error message.
+
+    :type plugin_loaders: list[base_plugin.TBLoader]


These two lines are obsolete

nfelt · 2018-06-19T21:48:52Z

tensorboard/program.py

+    Raises:
+      ValueError: If flag values are invalid.
+
+    :type plugin_loaders: list[base_plugin.TBLoader]


These two lines are obsolete

nfelt · 2018-06-19T21:49:54Z

tensorboard/program.py

+    parser = argparse.ArgumentParser()
+    for loader in self.plugin_loaders:
+      loader.define_flags(parser)
+    flags, unparsed = parser.parse_args(args)


I think this also has to be parse_known_args(), right? Otherwise it won't return the unparsed flags?

nfelt · 2018-06-19T21:50:20Z

tensorboard/main.py

+  for loader in loaders:
+    loader.define_flags(parser)
+  if app is None:
+    flags, unparsed = parser.parse_args()


I'm not actually seeing this as fixed, unless I missed something?

jart · 2018-06-20T00:59:57Z

PTAL. Good suggestion on the configure() API.

nfelt · 2018-06-20T21:41:56Z

LGTM

Summary: TensorBoard used to have an optional dependency on `absl-py`, so #1240 and #1409 were careful to only import `absl` if it was installed. But as of #1654 TensorBoard always requires `absl-py`, so we can promote the conditional import to a normal import. Test Plan: Running TensorBoard with no arguments still fails as expected; running with `--logdir` or `--version` still do the right things, too. wchargin-branch: absl-always

jart requested a review from nfelt June 12, 2018 04:21

jart force-pushed the argparse branch from bc1c880 to d1085db Compare June 12, 2018 04:36

nfelt approved these changes Jun 19, 2018

View reviewed changes

Fixes

b932bfa

jart commented Jun 19, 2018

View reviewed changes

nfelt reviewed Jun 19, 2018

View reviewed changes

More fixes

e456c16

jart merged commit d586c74 into tensorflow:master Jun 20, 2018

jart deleted the argparse branch June 20, 2018 21:53

tomtung mentioned this pull request Aug 9, 2018

Extension broken in Tensorboard 1.10 lspvic/jupyter_tensorboard#24

Closed

nfelt mentioned this pull request Sep 6, 2018

Extend program.TensorBoard API to allow other servers and absl flags #1409

Merged

stephanwlee mentioned this pull request Oct 23, 2018

Fix event_file flag when using inspect #1541

Merged

wchargin mentioned this pull request Oct 10, 2019

program: remove vestigial conditional absl import #2758

Merged



		def launch(plugin_loaders=None, assets_zip_provider=None, **kwargs):
		"""Python API for launching TensorBoard.

Migrate TensorBoard to argparse #1240

Migrate TensorBoard to argparse #1240

Uh oh!

Conversation

jart commented Jun 12, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jart left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment