Commit 3d5c418

git-clone: aggressively optimize local clone behaviour.

This changes the behaviour of cloning from a repository on the
local machine, by defaulting to "-l" (use hardlinks to share
files under .git/objects) and making "-l" a no-op.  A new
option, --no-hardlinks, is also added to cause file-level copy
of files under .git/objects while still avoiding the normal
"pack to pipe, then receive and index pack" network transfer
overhead.  The old behaviour of local cloning without -l or -s
is available by specifying the source repository with the newly
introduced file:///path/to/repo.git/ syntax (i.e. "same as
network" cloning).

 * Cloning with --no-hardlinks (i.e. having all of .git/objects/
   copied via cpio) would not catch corruption of the source
   repository, and also risks a corrupted recipient repository
   if an alpha-particle hits a memory cell while indexing and
   resolving deltas.  As long as the recipient is created
   uncorrupted, you have a good back-up.

 * same-as-network is expensive, but it would catch the
   breakage of the source repository.  It still risks a
   corrupted recipient repository due to hardware failure.
   As long as the recipient is created uncorrupted, you have a
   good back-up.

 * With the new default on the same filesystem, as long as the
   source repository is healthy, it is very likely that the
   recipient would be, too.  It is also very cheap.  You do not
   get any back-up benefit, though.

None of the methods is resilient against corruption of the
source repository, so let's discount that from the comparison.
Then the difference with and without --no-hardlinks matters
primarily if you value the back-up benefit or not.  If you want
to use the cloned repository as a back-up, then it is cheaper
to do a clone with --no-hardlinks and two git-fsck runs (source
before clone, recipient after clone) than a same-as-network
clone, especially as you are likely to do a git-fsck on the
recipient if you are so paranoid anyway.

Which leads me to believe that being able to use file:/// is
probably a good idea, if only for testability, but probably of
little practical value.  We default to hardlinked clone for
everyday use, and paranoids can use --no-hardlinks as a way to
make a back-up.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
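
The back-up recipe described in the message (fsck the source, clone with --no-hardlinks, fsck the recipient) can be sketched as follows. This is only an illustration under assumptions: the scratch directory, the `src` and `backup` names, and the `demo` identity are hypothetical, and it uses the modern `git -C` spelling rather than the dashed commands of this commit's era.

```shell
#!/bin/sh
# Sketch: back up a local repository with --no-hardlinks, bracketed by
# two git-fsck runs. All names and paths here are made up for illustration.
set -e
work=$(mktemp -d)
cd "$work"

# Build a tiny source repository to clone from.
git init -q src
(
	cd src &&
	echo hello >file &&
	git add file &&
	git -c user.name=demo -c user.email=demo@example.com \
		commit -q -m initial
)

# 1. Check the source first; a file-level copy would not notice corruption.
git -C src fsck

# 2. Copy the object files instead of hardlinking them.
git clone -q --no-hardlinks src backup

# 3. Check the recipient after the clone.
git -C backup fsck

src_head=$(git -C src rev-parse HEAD)
backup_head=$(git -C backup rev-parse HEAD)
echo "source:  $src_head"
echo "back-up: $backup_head"
```

Both lines should print the same commit name; the point of the two fsck runs is that neither the copy itself nor the matching HEAD proves the objects are intact.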
1 parent 72a4f4b commit 3d5c418

6 files changed, 79 insertions(+), 40 deletions(-)

Documentation/git-clone.txt

Lines changed: 15 additions & 3 deletions
@@ -9,7 +9,8 @@ git-clone - Clone a repository into a new directory
 SYNOPSIS
 --------
 [verse]
-'git-clone' [--template=<template_directory>] [-l [-s]] [-q] [-n] [--bare]
+'git-clone' [--template=<template_directory>]
+	  [-l] [-s] [--no-hardlinks] [-q] [-n] [--bare]
 	  [-o <name>] [-u <upload-pack>] [--reference <repository>]
 	  [--depth <depth>] <repository> [<directory>]
 
@@ -40,8 +41,19 @@ OPTIONS
 	this flag bypasses normal "git aware" transport
 	mechanism and clones the repository by making a copy of
 	HEAD and everything under objects and refs directories.
-	The files under .git/objects/ directory are hardlinked
-	to save space when possible.
+	The files under `.git/objects/` directory are hardlinked
+	to save space when possible. This is now the default when
+	the source repository is specified with `/path/to/repo`
+	syntax, so it essentially is a no-op option. To force
+	copying instead of hardlinking (which may be desirable
+	if you are trying to make a back-up of your repository),
+	but still avoid the usual "git aware" transport
+	mechanism, `--no-hardlinks` can be used.
+
+--no-hardlinks::
+	Optimize the cloning process from a repository on a
+	local filesystem by copying files under `.git/objects`
+	directory.
 
 --shared::
 -s::

Documentation/urls.txt

Lines changed: 10 additions & 6 deletions
@@ -15,20 +15,24 @@ to name the remote repository:
 - ssh://{startsb}user@{endsb}host.xz/~/path/to/repo.git
 ===============================================================
 
-SSH is the default transport protocol. You can optionally specify
-which user to log-in as, and an alternate, scp-like syntax is also
-supported. Both syntaxes support username expansion,
-as does the native git protocol. The following three are
-identical to the last three above, respectively:
+SSH is the default transport protocol over the network. You can
+optionally specify which user to log-in as, and an alternate,
+scp-like syntax is also supported. Both syntaxes support
+username expansion, as does the native git protocol. The following
+three are identical to the last three above, respectively:
 
 ===============================================================
 - {startsb}user@{endsb}host.xz:/path/to/repo.git/
 - {startsb}user@{endsb}host.xz:~user/path/to/repo.git/
 - {startsb}user@{endsb}host.xz:path/to/repo.git
 ===============================================================
 
-To sync with a local directory, use:
+To sync with a local directory, you can use:
 
 ===============================================================
 - /path/to/repo.git/
+- file:///path/to/repo.git/
 ===============================================================
+
+They are mostly equivalent, except when cloning. See
+gitlink:git-clone[1] for details.
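
To illustrate why the two spellings differ only when cloning, here is a small sketch of how a repository argument might be classified. `clone_strategy` is a hypothetical helper, not code from git-clone.sh, and it assumes that "local" simply means the argument names an existing directory.

```shell
#!/bin/sh
# Hypothetical helper, not the actual git-clone.sh logic: report which
# clone strategy a repository argument would select under this commit.
# Assumption: an argument counts as local when it names an existing directory.
clone_strategy () {
	case "$1" in
	file://*)
		# A URL, even one that points at this machine.
		echo "same-as-network transport" ;;
	*)
		if test -d "$1"
		then
			echo "local copy of objects (hardlinked by default)"
		else
			echo "regular transport"
		fi ;;
	esac
}

repo=$(mktemp -d)
local_result=$(clone_strategy "$repo")
url_result=$(clone_strategy "file://$repo")
ssh_result=$(clone_strategy "ssh://host.xz/path/to/repo.git")

echo "$repo -> $local_result"
echo "file://$repo -> $url_result"
echo "ssh://host.xz/path/to/repo.git -> $ssh_result"
```

The same directory yields two different answers depending on whether it is written as a plain path or as a file:// URL, which is the distinction the documentation change points readers to.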

git-clone.sh

Lines changed: 35 additions & 29 deletions
@@ -87,7 +87,7 @@ Perhaps git-update-server-info needs to be run there?"
 
 quiet=
 local=no
-use_local=no
+use_local_hardlink=yes
 local_shared=no
 unset template
 no_checkout=
@@ -108,9 +108,13 @@ while
 		no_checkout=yes ;;
 	*,--na|*,--nak|*,--nake|*,--naked|\
 	*,-b|*,--b|*,--ba|*,--bar|*,--bare) bare=yes ;;
-	*,-l|*,--l|*,--lo|*,--loc|*,--loca|*,--local) use_local=yes ;;
+	*,-l|*,--l|*,--lo|*,--loc|*,--loca|*,--local)
+		use_local_hardlink=yes ;;
+	*,--no-h|*,--no-ha|*,--no-har|*,--no-hard|*,--no-hardl|\
+	*,--no-hardli|*,--no-hardlin|*,--no-hardlink|*,--no-hardlinks)
+		use_local_hardlink=no ;;
 	*,-s|*,--s|*,--sh|*,--sha|*,--shar|*,--share|*,--shared)
-		local_shared=yes; use_local=yes ;;
+		local_shared=yes; ;;
 	1,--template) usage ;;
 	*,--template)
 		shift; template="--template=$1" ;;
@@ -249,34 +253,36 @@ fi
 rm -f "$GIT_DIR/CLONE_HEAD"
 
 # We do local magic only when the user tells us to.
-case "$local,$use_local" in
-yes,yes)
+case "$local" in
+yes)
 	( cd "$repo/objects" ) ||
-		die "-l flag seen but repository '$repo' is not local."
+		die "cannot chdir to local '$repo/objects'."
 
-	case "$local_shared" in
-	no)
-		# See if we can hardlink and drop "l" if not.
-		sample_file=$(cd "$repo" && \
-			find objects -type f -print | sed -e 1q)
-
-		# objects directory should not be empty since we are cloning!
-		test -f "$repo/$sample_file" || exit
-
-		l=
-		if ln "$repo/$sample_file" "$GIT_DIR/objects/sample" 2>/dev/null
-		then
-			l=l
-		fi &&
-		rm -f "$GIT_DIR/objects/sample" &&
-		cd "$repo" &&
-		find objects -depth -print | cpio -pumd$l "$GIT_DIR/" || exit 1
-		;;
-	yes)
-		mkdir -p "$GIT_DIR/objects/info"
-		echo "$repo/objects" >> "$GIT_DIR/objects/info/alternates"
-		;;
-	esac
+	if test "$local_shared" = yes
+	then
+		mkdir -p "$GIT_DIR/objects/info"
+		echo "$repo/objects" >>"$GIT_DIR/objects/info/alternates"
+	else
+		l= &&
+		if test "$use_local_hardlink" = yes
+		then
+			# See if we can hardlink and drop "l" if not.
+			sample_file=$(cd "$repo" && \
+				find objects -type f -print | sed -e 1q)
+			# objects directory should not be empty because
+			# we are cloning!
+			test -f "$repo/$sample_file" || exit
+			if ln "$repo/$sample_file" "$GIT_DIR/objects/sample" 2>/dev/null
+			then
+				rm -f "$GIT_DIR/objects/sample"
+				l=l
+			else
+				echo >&2 "Warning: -l asked but cannot hardlink to $repo"
+			fi
+		fi &&
+		cd "$repo" &&
+		find objects -depth -print | cpio -pumd$l "$GIT_DIR/" || exit 1
+	fi
 	git-ls-remote "$repo" >"$GIT_DIR/CLONE_HEAD" || exit 1
 	;;
 *)
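
The hardlink probe in the last hunk can be tried in isolation. The sketch below is not part of the commit: `hardlink_flag` is a hypothetical name, the scratch directories stand in for `$repo` and `$GIT_DIR`, and the script only prints the resulting cpio option string instead of running cpio.

```shell
#!/bin/sh
# Sketch of the probe: try to hardlink one sample file into the
# destination, and report "l" only if that worked.
# hardlink_flag is a made-up name used for illustration.
hardlink_flag () {
	src_file=$1 dst_dir=$2
	if ln "$src_file" "$dst_dir/sample" 2>/dev/null
	then
		# The probe file is only a test; remove it again.
		rm -f "$dst_dir/sample"
		echo l
	fi
}

work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
echo data >"$work/src/object"

# Same filesystem: the probe is expected to succeed.
flag=$(hardlink_flag "$work/src/object" "$work/dst")
# The script appends the flag to cpio's pass-through options.
echo "would run: cpio -pumd$flag"

# A failing ln (here, a missing source file) leaves the flag empty,
# so the same command line falls back to plain copying.
missing=$(hardlink_flag "$work/src/no-such-file" "$work/dst")
echo "would run: cpio -pumd$missing"
```

Keeping the flag in a variable lets one `find | cpio` pipeline serve both the hardlink and the copy case, which is how `--no-hardlinks` can reuse the same code path by never running the probe.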

t/t5500-fetch-pack.sh

Lines changed: 1 addition & 1 deletion
@@ -129,7 +129,7 @@ pull_to_client 2nd "B" $((64*3))
 
 pull_to_client 3rd "A" $((1*3)) # old fails
 
-test_expect_success "clone shallow" "git-clone --depth 2 . shallow"
+test_expect_success "clone shallow" "git-clone --depth 2 file://`pwd`/. shallow"
 
 (cd shallow; git count-objects -v) > count.shallow
 

t/t5700-clone-reference.sh

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ diff expected current'
 cd "$base_dir"
 
 test_expect_success 'cloning with reference (no -l -s)' \
-'git clone --reference B A D'
+'git clone --reference B file://`pwd`/A D'
 
 cd "$base_dir"
 

t/t5701-clone-local.sh

Lines changed: 17 additions & 0 deletions
@@ -43,4 +43,21 @@ test_expect_success 'local clone from x.git that does not exist' '
 	fi
 '
 
+test_expect_success 'With -no-hardlinks, local will make a copy' '
+	cd "$D" &&
+	git clone --bare --no-hardlinks x w &&
+	cd w &&
+	linked=$(find objects -type f ! -links 1 | wc -l) &&
+	test "$linked" = 0
+'
+
+test_expect_success 'Even without -l, local will make a hardlink' '
+	cd "$D" &&
+	rm -fr w &&
+	git clone -l --bare x w &&
+	cd w &&
+	copied=$(find objects -type f -links 1 | wc -l) &&
+	test "$copied" = 0
+'
+
 test_done
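
The two new tests rely on `find -links` to tell copied files from hardlinked ones. A minimal sketch of that check on ordinary files, with hypothetical file names and no git involved:

```shell
#!/bin/sh
# Sketch: a file with link count 1 has no other name (a private copy);
# a link count above 1 means another directory entry shares it.
work=$(mktemp -d)
mkdir "$work/objects"
echo one >"$work/objects/copied"
echo two >"$work/objects/linked"
# Give the second file another name outside the objects directory.
ln "$work/objects/linked" "$work/elsewhere"

copied=$(find "$work/objects" -type f -links 1 | wc -l)
linked=$(find "$work/objects" -type f ! -links 1 | wc -l)

# $(( )) strips the padding some wc implementations add to the count.
echo "files with link count 1: $((copied))"
echo "files with link count above 1: $((linked))"
```

A --no-hardlinks clone should therefore have no file matching `! -links 1`, and a hardlinked clone should have no file matching `-links 1`, which is what the two tests assert.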
