Commit dfd05e3

dscho authored and gitster committed
filter-branch: Big syntax change; support rewriting multiple refs
We used to take the first non-option argument as the name for the new
branch. This syntax is not extensible to support rewriting more than just
HEAD.

Instead, we now have the following syntax:

	git filter-branch [<filter options>...] [<rev-list options>]

All positive refs given in <rev-list options> are rewritten. Yes,
in-place. If a ref was changed, the original head is stored in
refs/original/$ref now, for your inspecting pleasure, in addition to the
reflogs (since it is easier to inspect "git show-ref | grep original"
than to inspect all the reflogs).

This commit also adds the --force option to remove .git-rewrite/ and all
refs from refs/original/ before filtering.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
1 parent 3b38ec1 commit dfd05e3
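
The `--force` behaviour described in the commit message can be sketched in isolation. The following is a minimal illustration, not the script itself: `check_ref` is a hypothetical helper that only reports what would happen to a given ref name. It reuses the `case "$force,$name"` pattern that this patch adds to git-filter-branch.sh, but it does not call git and does not touch any refs.

```shell
#!/bin/sh
# Sketch only: mirrors the "Make sure refs/original is empty" check.
# check_ref is a hypothetical name introduced for this illustration.
orig_namespace=refs/original/

check_ref () {
	# $1 = force flag ("t" or empty), $2 = full ref name
	case "$1,$2" in
	,$orig_namespace*)
		# not forced and a backup ref already exists: the script dies here
		echo "refuse: $2"
		;;
	t,$orig_namespace*)
		# forced: the script deletes the stale backup ref
		echo "delete: $2"
		;;
	*)
		# refs outside the backup namespace are left alone by this check
		echo "keep: $2"
		;;
	esac
}

check_ref ''  refs/heads/master
check_ref ''  refs/original/refs/heads/master
check_ref t   refs/original/refs/heads/master
```

Joining the flag and the ref name with a comma lets a single `case` statement cover both the forced and the unforced situation without nested conditionals.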

3 files changed: +182 -60 lines changed


Documentation/git-filter-branch.txt

Lines changed: 28 additions & 23 deletions
@@ -12,7 +12,7 @@ SYNOPSIS
 [--index-filter <command>] [--parent-filter <command>]
 [--msg-filter <command>] [--commit-filter <command>]
 [--tag-name-filter <command>] [--subdirectory-filter <directory>]
-[-d <directory>] <new-branch-name> [<rev-list options>...]
+[-d <directory>] [-f | --force] [<rev-list options>...]
 
 DESCRIPTION
 -----------
@@ -26,10 +26,9 @@ information) will be preserved.
 The command takes the new branch name as a mandatory argument and
 the filters as optional arguments. If you specify no filters, the
 commits will be recommitted without any changes, which would normally
-have no effect and result in the new branch pointing to the same
-branch as your current branch. Nevertheless, this may be useful in
-the future for compensating for some git bugs or such, therefore
-such a usage is permitted.
+have no effect. Nevertheless, this may be useful in the future for
+compensating for some git bugs or such, therefore such a usage is
+permitted.
 
 *WARNING*! The rewritten history will have different object names for all
 the objects and will not converge with the original branch. You will not
@@ -38,8 +37,9 @@ original branch. Please do not use this command if you do not know the
 full implications, and avoid using it anyway, if a simple single commit
 would suffice to fix your problem.
 
-Always verify that the rewritten version is correct before disposing
-the original branch.
+Always verify that the rewritten version is correct: The original refs,
+if different from the rewritten ones, will be stored in the namespace
+'refs/original/'.
 
 Note that since this operation is extensively I/O expensive, it might
 be a good idea to redirect the temporary directory off-disk, e.g. on
@@ -142,6 +142,11 @@ definition impossible to preserve signatures at any rate.)
 does this in the '.git-rewrite/' directory but you can override
 that choice by this parameter.
 
+-f\|--force::
+`git filter-branch` refuses to start with an existing temporary
+directory or when there are already refs starting with
+'refs/original/', unless forced.
+
 <rev-list-options>::
 When options are given after the new branch name, they will
 be passed to gitlink:git-rev-list[1]. Only commits in the resulting
@@ -156,14 +161,14 @@ Suppose you want to remove a file (containing confidential information
 or copyright violation) from all commits:
 
 -------------------------------------------------------
-git filter-branch --tree-filter 'rm filename' newbranch
+git filter-branch --tree-filter 'rm filename' HEAD
 -------------------------------------------------------
 
 A significantly faster version:
 
--------------------------------------------------------------------------------
-git filter-branch --index-filter 'git update-index --remove filename' newbranch
--------------------------------------------------------------------------------
+--------------------------------------------------------------------------
+git filter-branch --index-filter 'git update-index --remove filename' HEAD
+--------------------------------------------------------------------------
 
 Now, you will get the rewritten history saved in the branch 'newbranch'
 (your current branch is left untouched).
@@ -172,25 +177,25 @@ To set a commit (which typically is at the tip of another
 history) to be the parent of the current initial commit, in
 order to paste the other history behind the current history:
 
-------------------------------------------------------------------------
-git filter-branch --parent-filter 'sed "s/^\$/-p <graft-id>/"' newbranch
-------------------------------------------------------------------------
+-------------------------------------------------------------------
+git filter-branch --parent-filter 'sed "s/^\$/-p <graft-id>/"' HEAD
+-------------------------------------------------------------------
 
 (if the parent string is empty - therefore we are dealing with the
 initial commit - add graftcommit as a parent). Note that this assumes
 history with a single root (that is, no merge without common ancestors
 happened). If this is not the case, use:
 
--------------------------------------------------------------------------------
+--------------------------------------------------------------------------
 git filter-branch --parent-filter \
-'cat; test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>"' newbranch
--------------------------------------------------------------------------------
+'cat; test $GIT_COMMIT = <commit-id> && echo "-p <graft-id>"' HEAD
+--------------------------------------------------------------------------
 
 or even simpler:
 
 -----------------------------------------------
 echo "$commit-id $graft-id" >> .git/info/grafts
-git filter-branch newbranch $graft-id..
+git filter-branch $graft-id..HEAD
 -----------------------------------------------
 
 To remove commits authored by "Darl McBribe" from the history:
@@ -208,7 +213,7 @@ git filter-branch --commit-filter '
 done;
 else
 git commit-tree "$@";
-fi' newbranch
+fi' HEAD
 ------------------------------------------------------------------------------
 
 The shift magic first throws away the tree id and then the -p
@@ -238,14 +243,14 @@ A--B-----C
 To rewrite only commits D,E,F,G,H, but leave A, B and C alone, use:
 
 --------------------------------
-git filter-branch ... new-H C..H
+git filter-branch ... C..H
 --------------------------------
 
 To rewrite commits E,F,G,H, use one of these:
 
 ----------------------------------------
-git filter-branch ... new-H C..H --not D
-git filter-branch ... new-H D..H --not C
+git filter-branch ... C..H --not D
+git filter-branch ... D..H --not C
 ----------------------------------------
 
 To move the whole tree into a subdirectory, or remove it from there:
@@ -255,7 +260,7 @@ git filter-branch --index-filter \
 'git ls-files -s | sed "s-\t-&newsubdir/-" |
 GIT_INDEX_FILE=$GIT_INDEX_FILE.new \
 git update-index --index-info &&
-mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' directorymoved
+mv $GIT_INDEX_FILE.new $GIT_INDEX_FILE' HEAD
 ---------------------------------------------------------------
 
 
git-filter-branch.sh

Lines changed: 128 additions & 22 deletions
@@ -78,13 +78,20 @@ filter_msg=cat
 filter_commit='git commit-tree "$@"'
 filter_tag_name=
 filter_subdir=
+orig_namespace=refs/original/
+force=
 while case "$#" in 0) usage;; esac
 do
 case "$1" in
 --)
 shift
 break
 ;;
+--force|-f)
+shift
+force=t
+continue
+;;
 -*)
 ;;
 *)
@@ -126,24 +133,43 @@ do
 --subdirectory-filter)
 filter_subdir="$OPTARG"
 ;;
+--original)
+orig_namespace="$OPTARG"
+;;
 *)
 usage
 ;;
 esac
 done
 
-dstbranch="$1"
-shift
-test -n "$dstbranch" || die "missing branch name"
-git show-ref "refs/heads/$dstbranch" 2> /dev/null &&
-die "branch $dstbranch already exists"
-
-test ! -e "$tempdir" || die "$tempdir already exists, please remove it"
+case "$force" in
+t)
+rm -rf "$tempdir"
+;;
+'')
+test -d "$tempdir" &&
+die "$tempdir already exists, please remove it"
+esac
 mkdir -p "$tempdir/t" &&
+tempdir="$(cd "$tempdir"; pwd)" &&
 cd "$tempdir/t" &&
 workdir="$(pwd)" ||
 die ""
 
+# Make sure refs/original is empty
+git for-each-ref > "$tempdir"/backup-refs
+while read sha1 type name
+do
+case "$force,$name" in
+,$orig_namespace*)
+die "Namespace $orig_namespace not empty"
+;;
+t,$orig_namespace*)
+git update-ref -d "$name" $sha1
+;;
+esac
+done < "$tempdir"/backup-refs
+
 case "$GIT_DIR" in
 /*)
 ;;
@@ -153,6 +179,29 @@ case "$GIT_DIR" in
 esac
 export GIT_DIR GIT_WORK_TREE=.
 
+# These refs should be updated if their heads were rewritten
+
+git rev-parse --revs-only --symbolic "$@" |
+while read ref
+do
+# normalize ref
+case "$ref" in
+HEAD)
+ref="$(git symbolic-ref "$ref")"
+;;
+refs/*)
+;;
+*)
+ref="$(git for-each-ref --format='%(refname)' |
+grep /"$ref")"
+esac
+
+git check-ref-format "$ref" && echo "$ref"
+done > "$tempdir"/heads
+
+test -s "$tempdir"/heads ||
+die "Which ref do you want to rewrite?"
+
 export GIT_INDEX_FILE="$(pwd)/../index"
 git read-tree || die "Could not seed the index"
 
@@ -174,6 +223,8 @@ commits=$(wc -l <../revs | tr -d " ")
 
 test $commits -eq 0 && die "Found nothing to rewrite"
 
+# Rewrite the commits
+
 i=0
 while read commit parents; do
 i=$(($i+1))
@@ -234,22 +285,75 @@ while read commit parents; do
 $(git write-tree) $parentstr < ../message > ../map/$commit
 done <../revs
 
-src_head=$(tail -n 1 ../revs | sed -e 's/ .*//')
-target_head=$(head -n 1 ../map/$src_head)
-case "$target_head" in
-'')
-echo Nothing rewritten
+# In case of a subdirectory filter, it is possible that a specified head
+# is not in the set of rewritten commits, because it was pruned by the
+# revision walker. Fix it by mapping these heads to the next rewritten
+# ancestor(s), i.e. the boundaries in the set of rewritten commits.
+
+# NEEDSWORK: we should sort the unmapped refs topologically first
+while read ref
+do
+sha1=$(git rev-parse "$ref"^0)
+test -f "$workdir"/../map/$sha1 && continue
+# Assign the boundarie(s) in the set of rewritten commits
+# as the replacement commit(s).
+# (This would look a bit nicer if --not --stdin worked.)
+for p in $((cd "$workdir"/../map; ls | sed "s/^/^/") |
+git rev-list $ref --boundary --stdin |
+sed -n "s/^-//p")
+do
+map $p >> "$workdir"/../map/$sha1
+done
+done < "$tempdir"/heads
+
+# Finally update the refs
+
+_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
+_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"
+count=0
+echo
+while read ref
+do
+# avoid rewriting a ref twice
+test -f "$orig_namespace$ref" && continue
+
+sha1=$(git rev-parse "$ref"^0)
+rewritten=$(map $sha1)
+
+test $sha1 = "$rewritten" &&
+warn "WARNING: Ref '$ref' is unchanged" &&
+continue
+
+case "$rewritten" in
+'')
+echo "Ref '$ref' was deleted"
+git update-ref -m "filter-branch: delete" -d "$ref" $sha1 ||
+die "Could not delete $ref"
 ;;
-*)
-git update-ref refs/heads/"$dstbranch" $target_head ||
-die "Could not update $dstbranch with $target_head"
-if [ $(wc -l <../map/$src_head) -gt 1 ]; then
-echo "WARNING: Your commit filter caused the head commit to expand to several rewritten commits. Only the first such commit was recorded as the current $dstbranch head but you will need to resolve the situation now (probably by manually merging the other commits). These are all the commits:" >&2
-sed 's/^/ /' ../map/$src_head >&2
-ret=1
-fi
+$_x40)
+echo "Ref '$ref' was rewritten"
+git update-ref -m "filter-branch: rewrite" \
+"$ref" $rewritten $sha1 ||
+die "Could not rewrite $ref"
 ;;
-esac
+*)
+# NEEDSWORK: possibly add -Werror, making this an error
+warn "WARNING: '$ref' was rewritten into multiple commits:"
+warn "$rewritten"
+warn "WARNING: Ref '$ref' points to the first one now."
+rewritten=$(echo "$rewritten" | head -n 1)
+git update-ref -m "filter-branch: rewrite to first" \
+"$ref" $rewritten $sha1 ||
+die "Could not rewrite $ref"
+;;
+esac
+git update-ref -m "filter-branch: backup" "$orig_namespace$ref" $sha1
+count=$(($count+1))
+done < "$tempdir"/heads
+
+# TODO: This should possibly go, with the semantics that all positive given
+# refs are updated, and their original heads stored in refs/original/
+# Filter tags
 
 if [ "$filter_tag_name" ]; then
 git for-each-ref --format='%(objectname) %(objecttype) %(refname)' refs/tags |
@@ -286,6 +390,8 @@ fi
 
 cd ../..
 rm -rf "$tempdir"
-printf "\nRewritten history saved to the $dstbranch branch\n"
+echo
+test $count -gt 0 && echo "These refs were rewritten:"
+git show-ref | grep ^"$orig_namespace"
 
 exit $ret
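
The "Finally update the refs" loop in the patch decides what to do with each ref from the shape of the mapped result: empty, exactly one object name, or anything else. The sketch below isolates that three-way `case`, assuming the same `_x40` glob construction as the patch. `classify` is a hypothetical helper added for illustration; it prints a label instead of calling `git update-ref`.

```shell
#!/bin/sh
# Sketch only: reproduces the three-way dispatch on the mapped value.
# Build a glob matching exactly 40 lowercase hex digits (5 classes x 8).
_x40='[0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]'
_x40="$_x40$_x40$_x40$_x40$_x40$_x40$_x40$_x40"

# classify is a hypothetical name; $1 is what "map $sha1" would return.
classify () {
	case "$1" in
	'')
		echo deleted	# the commit was filtered away entirely
		;;
	$_x40)
		echo rewritten	# exactly one replacement commit
		;;
	*)
		echo multiple	# anything else, e.g. several ids on separate lines
		;;
	esac
}

id=0123456789abcdef0123456789abcdef01234567

classify ''
classify "$id"
classify "$(printf '%s\n%s' "$id" "$id")"
```

Because `$_x40` is expanded unquoted inside the `case` pattern, its brackets act as pattern characters, so the second branch matches only a single 40-digit id and a multi-line value falls through to the last branch.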
