Skip to content

Commit 684dd4c

Browse files
matheustavaresdscho
andcommitted
checkout: fix bug that makes checkout follow symlinks in leading path
Before checking out a file, we have to confirm that all of its leading components are real existing directories. And to reduce the number of lstat() calls in this process, we cache the last leading path known to contain only directories. However, when a path collision occurs (e.g. when checking out case-sensitive files in case-insensitive file systems), a cached path might have its file type changed on disk, leaving the cache on an invalid state. Normally, this doesn't bring any bad consequences as we usually check out files in index order, and therefore, by the time the cached path becomes outdated, we no longer need it anyway (because all files in that directory would have already been written). But, there are some users of the checkout machinery that do not always follow the index order. In particular: checkout-index writes the paths in the same order that they appear on the CLI (or stdin); and the delayed checkout feature -- used when a long-running filter process replies with "status=delayed" -- postpones the checkout of some entries, thus modifying the checkout order. When we have to check out an out-of-order entry and the lstat() cache is invalid (due to a previous path collision), checkout_entry() may end up using the invalid data and thrusting that the leading components are real directories when, in reality, they are not. In the best case scenario, where the directory was replaced by a regular file, the user will get an error: "fatal: unable to create file 'foo/bar': Not a directory". But if the directory was replaced by a symlink, checkout could actually end up following the symlink and writing the file at a wrong place, even outside the repository. Since delayed checkout is affected by this bug, it could be used by an attacker to write arbitrary files during the clone of a maliciously crafted repository. Some candidate solutions considered were to disable the lstat() cache during unordered checkouts or sort the entries before passing them to the checkout machinery. But both ideas include some performance penalty and they don't future-proof the code against new unordered use cases. Instead, we now manually reset the lstat cache whenever we successfully remove a directory. Note: We are not even checking whether the directory was the same as the lstat cache points to because we might face a scenario where the paths refer to the same location but differ due to case folding, precomposed UTF-8 issues, or the presence of `..` components in the path. Two regression tests, with case-collisions and utf8-collisions, are also added for both checkout-index and delayed checkout. Note: to make the previously mentioned clone attack unfeasible, it would be sufficient to reset the lstat cache only after the remove_subtree() call inside checkout_entry(). This is the place where we would remove a directory whose path collides with the path of another entry that we are currently trying to check out (possibly a symlink). However, in the interest of a thorough fix that does not leave Git open to similar-but-not-identical attack vectors, we decided to intercept all `rmdir()` calls in one fell swoop. This addresses CVE-2021-21300. Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
1 parent df5be6d commit 684dd4c

File tree

7 files changed

+141
-3
lines changed

7 files changed

+141
-3
lines changed

cache.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1569,6 +1569,7 @@ extern int has_symlink_leading_path(const char *name, int len);
15691569
extern int threaded_has_symlink_leading_path(struct cache_def *, const char *, int);
15701570
extern int check_leading_path(const char *name, int len);
15711571
extern int has_dirs_only_path(const char *name, int len, int prefix_len);
1572+
extern void invalidate_lstat_cache(void);
15721573
extern void schedule_dir_for_removal(const char *name, int len);
15731574
extern void remove_scheduled_dirs(void);
15741575

compat/mingw.c

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -283,6 +283,8 @@ int mingw_rmdir(const char *pathname)
283283
ask_yes_no_if_possible("Deletion of directory '%s' failed. "
284284
"Should I try again?", pathname))
285285
ret = _wrmdir(wpathname);
286+
if (!ret)
287+
invalidate_lstat_cache();
286288
return ret;
287289
}
288290

git-compat-util.h

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -338,6 +338,11 @@ typedef uintmax_t timestamp_t;
338338
#define _PATH_DEFPATH "/usr/local/bin:/usr/bin:/bin"
339339
#endif
340340

341+
int lstat_cache_aware_rmdir(const char *path);
342+
#if !defined(__MINGW32__) && !defined(_MSC_VER)
343+
#define rmdir lstat_cache_aware_rmdir
344+
#endif
345+
341346
#ifndef has_dos_drive_prefix
342347
static inline int git_has_dos_drive_prefix(const char *path)
343348
{

symlinks.c

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -267,6 +267,13 @@ int has_dirs_only_path(const char *name, int len, int prefix_len)
267267
*/
268268
static int threaded_has_dirs_only_path(struct cache_def *cache, const char *name, int len, int prefix_len)
269269
{
270+
/*
271+
* Note: this function is used by the checkout machinery, which also
272+
* takes care to properly reset the cache when it performs an operation
273+
* that would leave the cache outdated. If this function starts caching
274+
* anything else besides FL_DIR, remember to also invalidate the cache
275+
* when creating or deleting paths that might be in the cache.
276+
*/
270277
return lstat_cache(cache, name, len,
271278
FL_DIR|FL_FULLPATH, prefix_len) &
272279
FL_DIR;
@@ -321,3 +328,20 @@ void remove_scheduled_dirs(void)
321328
{
322329
do_remove_scheduled_dirs(0);
323330
}
331+
332+
void invalidate_lstat_cache(void)
333+
{
334+
reset_lstat_cache(&default_cache);
335+
}
336+
337+
#undef rmdir
338+
int lstat_cache_aware_rmdir(const char *path)
339+
{
340+
/* Any change in this function must be made also in `mingw_rmdir()` */
341+
int ret = rmdir(path);
342+
343+
if (!ret)
344+
invalidate_lstat_cache();
345+
346+
return ret;
347+
}

t/t0021-conversion.sh

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -817,4 +817,49 @@ test_expect_success PERL 'invalid file in delayed checkout' '
817817
grep "error: external filter .* signaled that .unfiltered. is now available although it has not been delayed earlier" git-stderr.log
818818
'
819819

820+
for mode in 'case' 'utf-8'
821+
do
822+
case "$mode" in
823+
case) dir='A' symlink='a' mode_prereq='CASE_INSENSITIVE_FS' ;;
824+
utf-8)
825+
dir=$(printf "\141\314\210") symlink=$(printf "\303\244")
826+
mode_prereq='UTF8_NFD_TO_NFC' ;;
827+
esac
828+
829+
test_expect_success PERL,SYMLINKS,$mode_prereq \
830+
"delayed checkout with $mode-collision don't write to the wrong place" '
831+
test_config_global filter.delay.process \
832+
"\"$TEST_ROOT/rot13-filter.pl\" --always-delay delayed.log clean smudge delay" &&
833+
test_config_global filter.delay.required true &&
834+
835+
git init $mode-collision &&
836+
(
837+
cd $mode-collision &&
838+
mkdir target-dir &&
839+
840+
empty_oid=$(printf "" | git hash-object -w --stdin) &&
841+
symlink_oid=$(printf "%s" "$PWD/target-dir" | git hash-object -w --stdin) &&
842+
attr_oid=$(echo "$dir/z filter=delay" | git hash-object -w --stdin) &&
843+
844+
cat >objs <<-EOF &&
845+
100644 blob $empty_oid $dir/x
846+
100644 blob $empty_oid $dir/y
847+
100644 blob $empty_oid $dir/z
848+
120000 blob $symlink_oid $symlink
849+
100644 blob $attr_oid .gitattributes
850+
EOF
851+
852+
git update-index --index-info <objs &&
853+
git commit -m "test commit"
854+
) &&
855+
856+
git clone $mode-collision $mode-collision-cloned &&
857+
# Make sure z was really delayed
858+
grep "IN: smudge $dir/z .* \\[DELAYED\\]" $mode-collision-cloned/delayed.log &&
859+
860+
# Should not create $dir/z at $symlink/z
861+
test_path_is_missing $mode-collision/target-dir/z
862+
'
863+
done
864+
820865
test_done

t/t0021/rot13-filter.pl

Lines changed: 18 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2,9 +2,15 @@
22
# Example implementation for the Git filter protocol version 2
33
# See Documentation/gitattributes.txt, section "Filter Protocol"
44
#
5-
# The first argument defines a debug log file that the script write to.
6-
# All remaining arguments define a list of supported protocol
7-
# capabilities ("clean", "smudge", etc).
5+
# Usage: rot13-filter.pl [--always-delay] <log path> <capabilities>
6+
#
7+
# Log path defines a debug log file that the script writes to. The
8+
# subsequent arguments define a list of supported protocol capabilities
9+
# ("clean", "smudge", etc).
10+
#
11+
# When --always-delay is given all pathnames with the "can-delay" flag
12+
# that don't appear on the list bellow are delayed with a count of 1
13+
# (see more below).
814
#
915
# This implementation supports special test cases:
1016
# (1) If data with the pathname "clean-write-fail.r" is processed with
@@ -53,6 +59,13 @@ sub gitperllib {
5359
use Git::Packet;
5460

5561
my $MAX_PACKET_CONTENT_SIZE = 65516;
62+
63+
my $always_delay = 0;
64+
if ( $ARGV[0] eq '--always-delay' ) {
65+
$always_delay = 1;
66+
shift @ARGV;
67+
}
68+
5669
my $log_file = shift @ARGV;
5770
my @capabilities = @ARGV;
5871

@@ -134,6 +147,8 @@ sub rot13 {
134147
if ( $buffer eq "can-delay=1" ) {
135148
if ( exists $DELAY{$pathname} and $DELAY{$pathname}{"requested"} == 0 ) {
136149
$DELAY{$pathname}{"requested"} = 1;
150+
} elsif ( !exists $DELAY{$pathname} and $always_delay ) {
151+
$DELAY{$pathname} = { "requested" => 1, "count" => 1 };
137152
}
138153
} else {
139154
die "Unknown message '$buffer'";

t/t2006-checkout-index-basic.sh

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -21,4 +21,50 @@ test_expect_success 'checkout-index -h in broken repository' '
2121
test_i18ngrep "[Uu]sage" broken/usage
2222
'
2323

24+
for mode in 'case' 'utf-8'
25+
do
26+
case "$mode" in
27+
case) dir='A' symlink='a' mode_prereq='CASE_INSENSITIVE_FS' ;;
28+
utf-8)
29+
dir=$(printf "\141\314\210") symlink=$(printf "\303\244")
30+
mode_prereq='UTF8_NFD_TO_NFC' ;;
31+
esac
32+
33+
test_expect_success SYMLINKS,$mode_prereq \
34+
"checkout-index with $mode-collision don't write to the wrong place" '
35+
git init $mode-collision &&
36+
(
37+
cd $mode-collision &&
38+
mkdir target-dir &&
39+
40+
empty_obj_hex=$(git hash-object -w --stdin </dev/null) &&
41+
symlink_hex=$(printf "%s" "$PWD/target-dir" | git hash-object -w --stdin) &&
42+
43+
cat >objs <<-EOF &&
44+
100644 blob ${empty_obj_hex} ${dir}/x
45+
100644 blob ${empty_obj_hex} ${dir}/y
46+
100644 blob ${empty_obj_hex} ${dir}/z
47+
120000 blob ${symlink_hex} ${symlink}
48+
EOF
49+
50+
git update-index --index-info <objs &&
51+
52+
# Note: the order is important here to exercise the
53+
# case where the file at ${dir} has its type changed by
54+
# the time Git tries to check out ${dir}/z.
55+
#
56+
# Also, we use core.precomposeUnicode=false because we
57+
# want Git to treat the UTF-8 paths transparently on
58+
# Mac OS, matching what is in the index.
59+
#
60+
git -c core.precomposeUnicode=false checkout-index -f \
61+
${dir}/x ${dir}/y ${symlink} ${dir}/z &&
62+
63+
# Should not create ${dir}/z at ${symlink}/z
64+
test_path_is_missing target-dir/z
65+
66+
)
67+
'
68+
done
69+
2470
test_done

0 commit comments

Comments
 (0)