Commit b6f3481

Teach fast-import to recursively copy files/directories

Some source material (e.g. Subversion dump files) perform directory renames by telling us the directory was copied, then deleted in the same revision. This makes it difficult for a frontend to convert such data formats to a fast-import stream, as all the frontend has on hand is "Copy a/ to b/; Delete a/" with no details about what files are in a/, unless the frontend also kept track of all files.

The new 'C' subcommand within a commit allows the frontend to make a recursive copy of one path to another path within the branch, without needing to keep track of the individual file paths. The metadata copy is performed in memory efficiently, but is implemented as a copy-immediately operation, rather than copy-on-write.

With this new 'C' subcommand frontends could obviously implement an 'R' (rename) on their own as a combination of 'C' and 'D' (delete), but since we have already offered up 'R' in the past and it is a trivial thing to keep implemented I'm not going to deprecate it.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

1 parent 48b4c3d commit b6f3481

3 files changed: +195 -9 lines changed

Documentation/git-fast-import.txt

Lines changed: 35 additions & 5 deletions
@@ -302,7 +302,7 @@ change to the project.
 	data
 	('from' SP <committish> LF)?
 	('merge' SP <committish> LF)?
-	(filemodify | filedelete | filerename | filedeleteall)*
+	(filemodify | filedelete | filecopy | filerename | filedeleteall)*
 	LF
 ....
 
@@ -325,13 +325,13 @@ commit message use a 0 length data. Commit messages are free-form
 and are not interpreted by Git. Currently they must be encoded in
 UTF-8, as fast-import does not permit other encodings to be specified.
 
-Zero or more `filemodify`, `filedelete`, `filerename` and
-`filedeleteall` commands
+Zero or more `filemodify`, `filedelete`, `filecopy`, `filerename`
+and `filedeleteall` commands
 may be included to update the contents of the branch prior to
 creating the commit. These commands may be supplied in any order.
 However it is recommended that a `filedeleteall` command preceed
-all `filemodify` and `filerename` commands in the same commit, as
-`filedeleteall`
+all `filemodify`, `filecopy` and `filerename` commands in the same
+commit, as `filedeleteall`
 wipes the branch clean (see below).
 
 `author`
@@ -497,6 +497,27 @@ here `<path>` is the complete path of the file or subdirectory to
 be removed from the branch.
 See `filemodify` above for a detailed description of `<path>`.
 
+`filecopy`
+^^^^^^^^^^^^
+Recursively copies an existing file or subdirectory to a different
+location within the branch. The existing file or directory must
+exist. If the destination exists it will be completely replaced
+by the content copied from the source.
+
+....
+	'C' SP <path> SP <path> LF
+....
+
+here the first `<path>` is the source location and the second
+`<path>` is the destination. See `filemodify` above for a detailed
+description of what `<path>` may look like. To use a source path
+that contains SP the path must be quoted.
+
+A `filecopy` command takes effect immediately. Once the source
+location has been copied to the destination any future commands
+applied to the source location will not impact the destination of
+the copy.
+
 `filerename`
 ^^^^^^^^^^^^
 Renames an existing file or subdirectory to a different location
@@ -517,6 +538,15 @@ location has been renamed to the destination any future commands
 applied to the source location will create new files there and not
 impact the destination of the rename.
 
+Note that a `filerename` is the same as a `filecopy` followed by a
+`filedelete` of the source location. There is a slight performance
+advantage to using `filerename`, but the advantage is so small
+that it is never worth trying to convert a delete/add pair in
+source material into a rename for fast-import. This `filerename`
+command is provided just to simplify frontends that already have
+rename information and don't want bother with decomposing it into a
+`filecopy` followed by a `filedelete`.
+
 `filedeleteall`
 ^^^^^^^^^^^^^^^
 Included in a `commit` command to remove all files (and also all
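
For a frontend author, the `filecopy` syntax above mostly comes down to writing one line per copy and quoting the source path when it contains SP. The following is a minimal sketch of that, not code from this commit: `format_filecopy`, `put_path`, `put` and `struct out` are hypothetical names. It assumes C-style quoting with backslash escapes for `"`, `\` and LF, and it quotes the destination by the same rule as the source, which is an assumption made for simplicity rather than something the documentation above requires.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded output buffer; not part of fast-import. */
struct out {
	char *buf;
	size_t cap;
	size_t len;
	int overflow;
};

/* Append one byte, always leaving room for the trailing NUL. */
static void put(struct out *o, char c)
{
	if (o->len + 1 < o->cap)
		o->buf[o->len++] = c;
	else
		o->overflow = 1;
}

/* Write a path, C-style quoted if it has SP or LF or starts with '"'. */
static void put_path(struct out *o, const char *path)
{
	int quote = path[0] == '"' || strchr(path, ' ') || strchr(path, '\n');
	const char *p;

	if (quote)
		put(o, '"');
	for (p = path; *p; p++) {
		if (quote && (*p == '"' || *p == '\\')) {
			put(o, '\\');
			put(o, *p);
		} else if (quote && *p == '\n') {
			put(o, '\\');
			put(o, 'n');
		} else {
			put(o, *p);
		}
	}
	if (quote)
		put(o, '"');
}

/*
 * Format "'C' SP <path> SP <path> LF" into buf.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
static int format_filecopy(char *buf, size_t cap,
			   const char *src, const char *dst)
{
	struct out o = { buf, cap, 0, 0 };

	assert(cap > 0);
	put(&o, 'C');
	put(&o, ' ');
	put_path(&o, src);
	put(&o, ' ');
	put_path(&o, dst);
	put(&o, '\n');
	o.buf[o.len] = '\0';
	return o.overflow ? -1 : (int)o.len;
}
```

Calling `format_filecopy(buf, sizeof(buf), "file2", "file3")` fills `buf` with the same `C file2 file3` line used in the tests of this commit, and a source such as `a b/f` comes out wrapped in double quotes.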

fast-import.c

Lines changed: 77 additions & 4 deletions
@@ -26,10 +26,16 @@ Format of STDIN stream:
     lf;
   commit_msg ::= data;
 
-  file_change ::= file_clr | file_del | file_rnm | file_obm | file_inm;
+  file_change ::= file_clr
+    | file_del
+    | file_rnm
+    | file_cpy
+    | file_obm
+    | file_inm;
   file_clr ::= 'deleteall' lf;
   file_del ::= 'D' sp path_str lf;
   file_rnm ::= 'R' sp path_str sp path_str lf;
+  file_cpy ::= 'C' sp path_str sp path_str lf;
   file_obm ::= 'M' sp mode sp (hexsha1 | idnum) sp path_str lf;
   file_inm ::= 'M' sp mode sp 'inline' sp path_str lf
     data;
@@ -623,6 +629,31 @@ static void release_tree_entry(struct tree_entry *e)
 	avail_tree_entry = e;
 }
 
+static struct tree_content *dup_tree_content(struct tree_content *s)
+{
+	struct tree_content *d;
+	struct tree_entry *a, *b;
+	unsigned int i;
+
+	if (!s)
+		return NULL;
+	d = new_tree_content(s->entry_count);
+	for (i = 0; i < s->entry_count; i++) {
+		a = s->entries[i];
+		b = new_tree_entry();
+		memcpy(b, a, sizeof(*a));
+		if (a->tree && is_null_sha1(b->versions[1].sha1))
+			b->tree = dup_tree_content(a->tree);
+		else
+			b->tree = NULL;
+		d->entries[i] = b;
+	}
+	d->entry_count = s->entry_count;
+	d->delta_depth = s->delta_depth;
+
+	return d;
+}
+
 static void start_packfile(void)
 {
 	static char tmpfile[PATH_MAX];
@@ -1273,6 +1304,43 @@ static int tree_content_remove(
 	return 1;
 }
 
+static int tree_content_get(
+	struct tree_entry *root,
+	const char *p,
+	struct tree_entry *leaf)
+{
+	struct tree_content *t = root->tree;
+	const char *slash1;
+	unsigned int i, n;
+	struct tree_entry *e;
+
+	slash1 = strchr(p, '/');
+	if (slash1)
+		n = slash1 - p;
+	else
+		n = strlen(p);
+
+	for (i = 0; i < t->entry_count; i++) {
+		e = t->entries[i];
+		if (e->name->str_len == n && !strncmp(p, e->name->str_dat, n)) {
+			if (!slash1) {
+				memcpy(leaf, e, sizeof(*leaf));
+				if (e->tree && is_null_sha1(e->versions[1].sha1))
+					leaf->tree = dup_tree_content(e->tree);
+				else
+					leaf->tree = NULL;
+				return 1;
+			}
+			if (!S_ISDIR(e->versions[1].mode))
+				return 0;
+			if (!e->tree)
+				load_tree(e);
+			return tree_content_get(e, slash1 + 1, leaf);
+		}
+	}
+	return 0;
+}
+
 static int update_branch(struct branch *b)
 {
 	static const char *msg = "fast-import";
@@ -1658,7 +1726,7 @@ static void file_change_d(struct branch *b)
 	free(p_uq);
 }
 
-static void file_change_r(struct branch *b)
+static void file_change_cr(struct branch *b, int rename)
 {
 	const char *s, *d;
 	char *s_uq, *d_uq;
@@ -1694,7 +1762,10 @@ static void file_change_r(struct branch *b)
 	}
 
 	memset(&leaf, 0, sizeof(leaf));
-	tree_content_remove(&b->branch_tree, s, &leaf);
+	if (rename)
+		tree_content_remove(&b->branch_tree, s, &leaf);
+	else
+		tree_content_get(&b->branch_tree, s, &leaf);
 	if (!leaf.versions[1].mode)
 		die("Path %s not in branch", s);
 	tree_content_set(&b->branch_tree, d,
@@ -1874,7 +1945,9 @@ static void cmd_new_commit(void)
 		else if (!prefixcmp(command_buf.buf, "D "))
 			file_change_d(b);
 		else if (!prefixcmp(command_buf.buf, "R "))
-			file_change_r(b);
+			file_change_cr(b, 1);
+		else if (!prefixcmp(command_buf.buf, "C "))
+			file_change_cr(b, 0);
 		else if (!strcmp("deleteall", command_buf.buf))
 			file_change_deleteall(b);
 		else
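
The copy-immediately behaviour described in the commit message can be illustrated with a much smaller model than fast-import's own structures. The sketch below is not the code from this commit: `struct node`, `new_node`, `add_kid`, `lookup`, `deep_copy` and `copy_path` are hypothetical names, an `int` stands in for a blob SHA-1, the destination is assumed to be a new top-level name, and nothing is ever freed. It only shows the two ideas the change relies on: walking a path one component at a time, and duplicating the whole subtree at copy time so that later edits to the source do not reach the copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a tree entry; not fast-import's real layout. */
struct node {
	char name[32];
	int is_dir;
	int blob_id;		/* stands in for a blob SHA-1 */
	struct node **kids;	/* children when is_dir is set */
	size_t nr;
};

static struct node *new_node(const char *name, int is_dir, int blob_id)
{
	struct node *n = calloc(1, sizeof(*n));

	if (!n)
		abort();
	strncpy(n->name, name, sizeof(n->name) - 1);
	n->is_dir = is_dir;
	n->blob_id = blob_id;
	return n;
}

static void add_kid(struct node *dir, struct node *kid)
{
	dir->kids = realloc(dir->kids, (dir->nr + 1) * sizeof(*dir->kids));
	if (!dir->kids)
		abort();
	dir->kids[dir->nr++] = kid;
}

/* Resolve a '/'-separated path, matching one component per level. */
static struct node *lookup(struct node *dir, const char *path)
{
	const char *slash = strchr(path, '/');
	size_t n = slash ? (size_t)(slash - path) : strlen(path);
	size_t i;

	for (i = 0; i < dir->nr; i++) {
		struct node *e = dir->kids[i];

		if (strlen(e->name) != n || strncmp(path, e->name, n))
			continue;
		if (!slash)
			return e;
		return e->is_dir ? lookup(e, slash + 1) : NULL;
	}
	return NULL;
}

/* Copy-immediately: duplicate every entry below src right now. */
static struct node *deep_copy(const struct node *src, const char *new_name)
{
	struct node *dst = new_node(new_name, src->is_dir, src->blob_id);
	size_t i;

	for (i = 0; i < src->nr; i++)
		add_kid(dst, deep_copy(src->kids[i], src->kids[i]->name));
	return dst;
}

/* Model of "C <src> <dst>" where dst is a new top-level name. */
static int copy_path(struct node *root, const char *src, const char *dst)
{
	struct node *e = lookup(root, src);

	if (!e)
		return 0;	/* source path not in the tree */
	add_kid(root, deep_copy(e, dst));
	return 1;
}
```

With a root holding `file2/newf` and `file2/file5`, `copy_path(root, "file2", "file3")` followed by a change to `file2/file5` leaves `file3/file5` as it was at copy time. A copy-on-write design would instead share the subtree until one side changed, which the commit message explicitly chose not to implement.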

t/t9300-fast-import.sh

Lines changed: 83 additions & 0 deletions
@@ -648,4 +648,87 @@ test_expect_success \
 	 git diff-tree -M -r M3^ M3 >actual &&
 	 compare_diff_raw expect actual'
 
+###
+### series N
+###
+
+test_tick
+cat >input <<INPUT_END
+commit refs/heads/N1
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+data <<COMMIT
+file copy
+COMMIT
+
+from refs/heads/branch^0
+C file2/newf file2/n.e.w.f
+
+INPUT_END
+
+cat >expect <<EOF
+:100755 100755 f1fb5da718392694d0076d677d6d0e364c79b0bc f1fb5da718392694d0076d677d6d0e364c79b0bc C100	file2/newf	file2/n.e.w.f
+EOF
+test_expect_success \
+    'N: copy file in same subdirectory' \
+    'git-fast-import <input &&
+	 git diff-tree -C --find-copies-harder -r N1^ N1 >actual &&
+	 compare_diff_raw expect actual'
+
+cat >input <<INPUT_END
+commit refs/heads/N2
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+data <<COMMIT
+clean directory copy
+COMMIT
+
+from refs/heads/branch^0
+C file2 file3
+
+commit refs/heads/N2
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+data <<COMMIT
+modify directory copy
+COMMIT
+
+M 644 inline file3/file5
+data <<EOF
+$file5_data
+EOF
+
+INPUT_END
+
+cat >expect <<EOF
+:100644 100644 fcf778cda181eaa1cbc9e9ce3a2e15ee9f9fe791 fcf778cda181eaa1cbc9e9ce3a2e15ee9f9fe791 C100	newdir/interesting	file3/file5
+:100755 100755 f1fb5da718392694d0076d677d6d0e364c79b0bc f1fb5da718392694d0076d677d6d0e364c79b0bc C100	file2/newf	file3/newf
+:100644 100644 7123f7f44e39be127c5eb701e5968176ee9d78b1 7123f7f44e39be127c5eb701e5968176ee9d78b1 C100	file2/oldf	file3/oldf
+EOF
+test_expect_success \
+    'N: copy then modify subdirectory' \
+    'git-fast-import <input &&
+	 git diff-tree -C --find-copies-harder -r N2^^ N2 >actual &&
+	 compare_diff_raw expect actual'
+
+cat >input <<INPUT_END
+commit refs/heads/N3
+committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
+data <<COMMIT
+dirty directory copy
+COMMIT
+
+from refs/heads/branch^0
+M 644 inline file2/file5
+data <<EOF
+$file5_data
+EOF
+
+C file2 file3
+D file2/file5
+
+INPUT_END
+
+test_expect_success \
+    'N: copy dirty subdirectory' \
+    'git-fast-import <input &&
+	 test `git-rev-parse N2^{tree}` = `git-rev-parse N3^{tree}`'
+
 test_done
