This repository was archived by the owner on Jul 18, 2025. It is now read-only.

Proposal: machine share #179

@nathanleclaire


machine share

Abstract

machine is shaping up to be an excellent tool for quickly and painlessly creating and managing Docker hosts across a variety of virtualization platforms and hosting providers. For a variety of reasons, syncing files between the computer where machine commands are executed and the machines themselves is desirable.

Motivation

Containers are a great tool and unit of deployment, and machine makes creating a place to run your containers easier than ever. There are, however, many reasons why one would want to share files that are not pre-baked into the Docker image between the machine client computer and a machine host. Some examples are:

  1. The machine host is a VM on the user's laptop and they are developing a web application inside of a container. They want to develop with the source code bind-mounted inside of a container, but still edit on their computer using Sublime etc.
  2. The user wants to spin up 10 hosts in the cloud and do some scientific computing on them using Docker. They have a container with, say, the Python libraries they need, but they also need to push their .csv files up to the hosts from their laptop. This user does not know about the implementation details of machine or how to find the SSH keys for the hosts etc.
  3. There is some artifact of a build run on a machine (e.g. one or many compiled binaries) and the user wants to retrieve that artifact from the remote.

Like the rest of machine, this should integrate seamlessly into the machine workflow for the 80% of use cases that are most common, while still providing enough flexibility for users who fall into the remaining 20%.

Interface

After thinking about the mechanics and UX of this, I think we should favor explicitness over implicitness and err on the side of not creating accidental or implicit shares.

There are a few aspects to the story that should be considered.

Command Line

The syntax would be something like this:

$ pwd
/Users/nathanleclaire/website

$ machine ls
NAME           ACTIVE   DRIVER         STATE     URL
shareexample   *        digitalocean   Running   tcp://104.236.115.220:2376

$ machine ssh -c pwd
/root

$ machine share --driver rsync . /root/code
Sharing /Users/nathanleclaire/website from this computer to /root/code on host "shareexample"....

$ machine share ls
MACHINE      DRIVER SRC                           DEST
shareexample rsync  /Users/nathanleclaire/website /root/code

$ ls

$ echo foo >foo.txt

$ ls 
foo.txt

$ machine ssh -c "cat code/foo.txt"
cat: code/foo.txt: No such file or directory
FATA[0001] exit status 1

$ machine share push
[INFO] Pushing to remote...

$ machine ssh -c "cat code/foo.txt"
foo

$ machine share --driver scp / /root/client_home_dir 
ERR[0001] Sharing the home directory or folders outside of it is not allowed.  To override use --i-know-what-i-am-doing-i-swear

IMO we should forbid users from creating shares to or from outside of the home directory of the client or the remote host. There's a strong argument that the home directory itself should be banned from sharing as well, to prevent accidental sharing of files which should be moved around carefully such as ~/.ssh and, of course, ~/.docker. Also, clients could share directories to multiple locations, but any shares which point to the same destination on the remote would be disallowed.
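The path rules above could be checked roughly as follows. This is only a sketch under stated assumptions: the `Share` type, `insideHome`, `validateShare`, and the `override` parameter (standing in for the `--i-know-what-i-am-doing-i-swear` flag) are hypothetical names, not existing machine code. It takes the stricter option of rejecting the home directory itself, assumes `existing` holds the shares already defined for the same host, and does not resolve symlinks.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// Share is a hypothetical record of one client-to-host share.
type Share struct {
	Src  string // directory on the client
	Dest string // directory on the remote host
}

var (
	errOutsideHome = errors.New("sharing the home directory or folders outside of it is not allowed")
	errDestInUse   = errors.New("another share already points to this destination")
)

// insideHome reports whether path is strictly below home.
// The home directory itself is rejected. Symlinks are not resolved.
func insideHome(home, path string) bool {
	rel, err := filepath.Rel(filepath.Clean(home), filepath.Clean(path))
	if err != nil {
		return false
	}
	up := ".."
	return rel != "." && rel != up && !strings.HasPrefix(rel, up+string(filepath.Separator))
}

// validateShare applies the proposed rules to a new share s.
// override skips the home-directory check but never the
// duplicate-destination check.
func validateShare(clientHome, remoteHome string, existing []Share, s Share, override bool) error {
	if !override && (!insideHome(clientHome, s.Src) || !insideHome(remoteHome, s.Dest)) {
		return errOutsideHome
	}
	for _, e := range existing {
		if filepath.Clean(e.Dest) == filepath.Clean(s.Dest) {
			return errDestInUse
		}
	}
	return nil
}

func main() {
	existing := []Share{{Src: "/Users/nathanleclaire/website", Dest: "/root/code"}}
	candidates := []Share{
		{Src: "/", Dest: "/root/client_home_dir"},
		{Src: "/Users/nathanleclaire/blog", Dest: "/root/code"},
		{Src: "/Users/nathanleclaire/blog", Dest: "/root/blog"},
	}
	for _, c := range candidates {
		err := validateShare("/Users/nathanleclaire", "/root", existing, c, false)
		fmt.Printf("%s -> %s: %v\n", c.Src, c.Dest, err)
	}
}
```

Comparing cleaned paths with `filepath.Rel` avoids the false positive a plain prefix check would give for a sibling such as `/Users/nathan` next to `/Users/n`.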

Totally open to feedback and changes on the UI, this is just what I've come up with so far.

Drivers

There is a huge variety of ways to get files from point A to point B and back, so I'd propose a standard interface that drivers have to implement to be recognized as an option for sharing (just like we have done with virtualization / cloud platforms). The default would be something like scp (since it is simple and pretty ubiquitous), and users would be able to specify a different driver manually to suit their needs. This would also allow the machine team to start with a simpler core and expand later: for example, rsync, scp, and vboxsf could be the only options in v1.0, with other drivers added afterwards.

Some possible drivers: scp, vboxsf, fusionsf, rsync, sshfs, nfs, samba

This would be useful because different use cases call for different ways of moving files around. NFS might work well for development in a VM, but you might want rsync if you are pushing large files to the server frequently, and so on.
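One way to picture driver selection is a small registry keyed by driver name, falling back to scp when no `--driver` is given. Everything here is an assumption for illustration: `ShareDriver` is reduced to a single `Name` method, and `register`, `lookup`, and `namedDriver` are hypothetical names. Only the three drivers suggested for a first release are registered.

```go
package main

import (
	"fmt"
	"sort"
)

// defaultDriver mirrors the suggestion that scp be the default.
const defaultDriver = "scp"

// ShareDriver is a deliberately tiny, hypothetical stand-in for the
// interface a real share driver would implement.
type ShareDriver interface {
	Name() string
}

// namedDriver is a placeholder implementation that only carries a name.
type namedDriver string

func (d namedDriver) Name() string { return string(d) }

// registry maps driver names to implementations.
var registry = map[string]ShareDriver{}

func register(d ShareDriver) { registry[d.Name()] = d }

// lookup returns the named driver, or the default when name is empty.
func lookup(name string) (ShareDriver, error) {
	if name == "" {
		name = defaultDriver
	}
	d, ok := registry[name]
	if !ok {
		known := make([]string, 0, len(registry))
		for n := range registry {
			known = append(known, n)
		}
		sort.Strings(known)
		return nil, fmt.Errorf("unknown share driver %q (available: %v)", name, known)
	}
	return d, nil
}

func init() {
	for _, n := range []string{"rsync", "scp", "vboxsf"} {
		register(namedDriver(n))
	}
}

func main() {
	for _, name := range []string{"", "rsync", "nfs"} {
		d, err := lookup(name)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("using driver:", d.Name())
	}
}
```

Adding sshfs, nfs, or samba later would then be a matter of registering one more implementation, without touching the lookup logic.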

Part of the interface for a share driver would be some sort of IsContractFulfilled() method which returns a boolean that indicates if the "contract" necessary for the share to work is fulfilled by both the client and remote host. This would allow us to, for instance, check if rsync is installed on the client machine and the remote host, and refuse to create a share if that is not the case. Likewise, it would prevent users from trying to do something silly like using the vboxsf driver on a non-VirtualBox host.
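A rough sketch of what such a contract check might look like follows. Only the `IsContractFulfilled` name comes from the proposal; the `Host` type, its `HasCommand` probe (imagined as something run over SSH), the injected `clientHas` function, and `createShare` are all hypothetical. The client-side check uses the standard library's `exec.LookPath`.

```go
package main

import (
	"fmt"
	"os/exec"
)

// Host is a hypothetical view of a machine host.
type Host struct {
	Name       string
	VMDriver   string                 // e.g. "virtualbox", "digitalocean"
	HasCommand func(name string) bool // hypothetical remote probe, e.g. over SSH
}

// ShareDriver is a hypothetical interface; only the
// IsContractFulfilled name comes from the proposal.
type ShareDriver interface {
	Name() string
	IsContractFulfilled(h Host) bool
}

// onClientPath reports whether a binary is on the client's PATH.
func onClientPath(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

// rsyncDriver needs rsync on both ends of the share.
type rsyncDriver struct {
	clientHas func(name string) bool
}

func (rsyncDriver) Name() string { return "rsync" }

func (d rsyncDriver) IsContractFulfilled(h Host) bool {
	return d.clientHas("rsync") && h.HasCommand != nil && h.HasCommand("rsync")
}

// vboxsfDriver only makes sense on a VirtualBox host.
type vboxsfDriver struct{}

func (vboxsfDriver) Name() string { return "vboxsf" }

func (vboxsfDriver) IsContractFulfilled(h Host) bool {
	return h.VMDriver == "virtualbox"
}

// createShare refuses to proceed when the contract is not met.
func createShare(d ShareDriver, h Host) error {
	if !d.IsContractFulfilled(h) {
		return fmt.Errorf("share driver %q cannot be used with host %q", d.Name(), h.Name)
	}
	return nil // real share setup would go here
}

func main() {
	cloud := Host{
		Name:       "shareexample",
		VMDriver:   "digitalocean",
		HasCommand: func(string) bool { return true }, // pretend the probe succeeds
	}
	fmt.Println(createShare(vboxsfDriver{}, cloud))
	fmt.Println(createShare(rsyncDriver{clientHas: onClientPath}, cloud))
}
```

A plain boolean keeps the interface small, though returning an error instead would let a driver explain which half of the contract failed.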

Possible Issues

  • If machine moves to a client-server model, which seems favored at the moment, it introduces additional complications to the implementation of machine share. Namely, machine share would be all client-side logic, and would not be able to make the same assumptions it may make today e.g. that the SSH keys for the hosts are all lying around ready to be used on the client computer.
  • machine share push and machine share pull make a ton of sense for some drivers, such as rsync, but not so much sense for vboxsf or nfs which update automatically. What happens when a user does machine share push on such a driver? "ERR: This driver is bidirectional"?
  • This is likely to have a pretty big scope, so we might want to consider making it a separate tool or introducing some sort of model for extensions that could keep the core really lean.
  • How are symlinks handled? Permissions? What happens if you move, rename, or delete the shared directory on the client, or on the remote?

Other Notes

I am not attached to the proposal, I am just looking for feedback and to see if anyone else thinks this would be useful. I have put together some code which defines a tentative ShareDriver interface but it is not really to my liking (and no real sync functionality is implemented) so I haven't shared it.
