Commit 960873f

Merge branch 'en/present-despite-skipped' into next
In sparse-checkouts, files mis-marked as missing from the working tree could lead to later problems. Such files were hard to discover, and harder to correct. Automatically detecting and correcting the marking of such files has been added to avoid these problems.

* en/present-despite-skipped:
  Accelerate clear_skip_worktree_from_present_files() by caching
  Update documentation related to sparsity and the skip-worktree bit
  repo_read_index: clear SKIP_WORKTREE bit from files present in worktree
  unpack-trees: fix accidental loss of user changes
  t1011: add testcase demonstrating accidental loss of user modifications
2 parents 97ac92e + d79d299 commit 960873f

13 files changed: 246 additions, 128 deletions
Documentation/git-read-tree.txt

Lines changed: 9 additions & 3 deletions

@@ -375,17 +375,23 @@ have finished your work-in-progress), attempt the merge again.
 SPARSE CHECKOUT
 ---------------

+Note: The `update-index` and `read-tree` primitives for supporting the
+skip-worktree bit predated the introduction of
+linkgit:git-sparse-checkout[1]. Users are encouraged to use
+`sparse-checkout` in preference to these low-level primitives.
+
 "Sparse checkout" allows populating the working directory sparsely.
-It uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell
-Git whether a file in the working directory is worth looking at.
+It uses the skip-worktree bit (see linkgit:git-update-index[1]) to
+tell Git whether a file in the working directory is worth looking at.

 'git read-tree' and other merge-based commands ('git merge', 'git
 checkout'...) can help maintaining the skip-worktree bitmap and working
 directory update. `$GIT_DIR/info/sparse-checkout` is used to
 define the skip-worktree reference bitmap. When 'git read-tree' needs
 to update the working directory, it resets the skip-worktree bit in the index
 based on this file, which uses the same syntax as .gitignore files.
-If an entry matches a pattern in this file, skip-worktree will not be
+If an entry matches a pattern in this file, or the entry corresponds to
+a file present in the working tree, then skip-worktree will not be
 set on that entry. Otherwise, skip-worktree will be set.

 Then it compares the new skip-worktree value with the previous one. If

Documentation/git-sparse-checkout.txt

Lines changed: 46 additions & 30 deletions

@@ -3,9 +3,7 @@ git-sparse-checkout(1)

 NAME
 ----
-git-sparse-checkout - Initialize and modify the sparse-checkout
-configuration, which reduces the checkout to a set of paths
-given by a list of patterns.
+git-sparse-checkout - Reduce your working tree to a subset of tracked files


 SYNOPSIS
@@ -17,8 +15,20 @@ SYNOPSIS
 DESCRIPTION
 -----------

-Initialize and modify the sparse-checkout configuration, which reduces
-the checkout to a set of paths given by a list of patterns.
+This command is used to create sparse checkouts, which means that it
+changes the working tree from having all tracked files present, to only
+have a subset of them. It can also switch which subset of files are
+present, or undo and go back to having all tracked files present in the
+working copy.
+
+The subset of files is chosen by providing a list of directories in
+cone mode (which is recommended), or by providing a list of patterns
+in non-cone mode.
+
+When in a sparse-checkout, other Git commands behave a bit differently.
+For example, switching branches will not update paths outside the
+sparse-checkout directories/patterns, and `git commit -a` will not record
+paths outside the sparse-checkout directories/patterns as deleted.

 THIS COMMAND IS EXPERIMENTAL. ITS BEHAVIOR, AND THE BEHAVIOR OF OTHER
 COMMANDS IN THE PRESENCE OF SPARSE-CHECKOUTS, WILL LIKELY CHANGE IN
@@ -28,7 +38,7 @@ THE FUTURE.
 COMMANDS
 --------
 'list'::
-	Describe the patterns in the sparse-checkout file.
+	Describe the directories or patterns in the sparse-checkout file.

 'set'::
 	Enable the necessary sparse-checkout config settings
@@ -46,20 +56,26 @@ the 'set' subcommand are stored in the worktree-specific sparse-checkout
 file. See linkgit:git-worktree[1] and the documentation of
 `extensions.worktreeConfig` in linkgit:git-config[1] for more details.
 +
-When the `--stdin` option is provided, the patterns are read from
-standard in as a newline-delimited list instead of from the arguments.
+When the `--stdin` option is provided, the directories or patterns are
+read from standard in as a newline-delimited list instead of from the
+arguments.
 +
 When `--cone` is passed or `core.sparseCheckoutCone` is enabled, the
-input list is considered a list of directories instead of
-sparse-checkout patterns. This allows for better performance with a
-limited set of patterns (see 'CONE PATTERN SET' below). Note that the
-set command will write patterns to the sparse-checkout file to include
-all files contained in those directories (recursively) as well as
-files that are siblings of ancestor directories. The input format
-matches the output of `git ls-tree --name-only`. This includes
-interpreting pathnames that begin with a double quote (") as C-style
-quoted strings. This may become the default in the future; --no-cone
-can be passed to request non-cone mode.
+input list is considered a list of directories. This allows for
+better performance with a limited set of patterns (see 'CONE PATTERN
+SET' below). The input format matches the output of `git ls-tree
+--name-only`. This includes interpreting pathnames that begin with a
+double quote (") as C-style quoted strings. Note that the set command
+will write patterns to the sparse-checkout file to include all files
+contained in those directories (recursively) as well as files that are
+siblings of ancestor directories. This may become the default in the
+future; --no-cone can be passed to request non-cone mode.
++
+When `--no-cone` is passed or `core.sparseCheckoutCone` is not enabled,
+the input list is considered a list of patterns. This mode is harder
+to use and less performant, and is thus not recommended. See the
+"Sparse Checkout" section of linkgit:git-read-tree[1] and the "Pattern
+Set" sections below for more details.
 +
 Use the `--[no-]sparse-index` option to use a sparse index (the
 default is to not use it). A sparse index reduces the size of the
@@ -77,11 +93,10 @@ understand the sparse directory entries index extension and may fail to
 interact with your repository until it is disabled.

 'add'::
-	Update the sparse-checkout file to include additional patterns.
-	By default, these patterns are read from the command-line arguments,
-	but they can be read from stdin using the `--stdin` option. When
-	`core.sparseCheckoutCone` is enabled, the given patterns are interpreted
-	as directory names as in the 'set' subcommand.
+	Update the sparse-checkout file to include additional directories
+	(in cone mode) or patterns (in non-cone mode). By default, these
+	directories or patterns are read from the command-line arguments,
+	but they can be read from stdin using the `--stdin` option.

 'reapply'::
 	Reapply the sparsity pattern rules to paths in the working tree.
@@ -125,13 +140,14 @@ decreased in utility.
 SPARSE CHECKOUT
 ---------------

-"Sparse checkout" allows populating the working directory sparsely.
-It uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell
-Git whether a file in the working directory is worth looking at. If
-the skip-worktree bit is set, then the file is ignored in the working
-directory. Git will avoid populating the contents of those files, which
-makes a sparse checkout helpful when working in a repository with many
-files, but only a few are important to the current user.
+"Sparse checkout" allows populating the working directory sparsely. It
+uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell Git
+whether a file in the working directory is worth looking at. If the
+skip-worktree bit is set, and the file is not present in the working tree,
+then its absence is ignored. Git will avoid populating the contents of
+those files, which makes a sparse checkout helpful when working in a
+repository with many files, but only a few are important to the current
+user.

 The `$GIT_DIR/info/sparse-checkout` file is used to define the
 skip-worktree reference bitmap. When Git updates the working

Documentation/git-update-index.txt

Lines changed: 43 additions & 14 deletions

@@ -351,6 +351,10 @@ unchanged". Note that "assume unchanged" bit is *not* set if
 the index (use `git update-index --really-refresh` if you want
 to mark them as "assume unchanged").

+Sometimes users confuse the assume-unchanged bit with the
+skip-worktree bit. See the final paragraph in the "Skip-worktree bit"
+section below for an explanation of the differences.
+

 EXAMPLES
 --------
@@ -392,22 +396,47 @@ M foo.c
 SKIP-WORKTREE BIT
 -----------------

-Skip-worktree bit can be defined in one (long) sentence: When reading
-an entry, if it is marked as skip-worktree, then Git pretends its
-working directory version is up to date and read the index version
-instead.
-
-To elaborate, "reading" means checking for file existence, reading
-file attributes or file content. The working directory version may be
-present or absent. If present, its content may match against the index
-version or not. Writing is not affected by this bit, content safety
-is still first priority. Note that Git _can_ update working directory
-file, that is marked skip-worktree, if it is safe to do so (i.e.
-working directory version matches index version)
+Skip-worktree bit can be defined in one (long) sentence: Tell git to
+avoid writing the file to the working directory when reasonably
+possible, and treat the file as unchanged when it is not
+present in the working directory.
+
+Note that not all git commands will pay attention to this bit, and
+some only partially support it.
+
+The update-index flags and the read-tree capabilities relating to the
+skip-worktree bit predated the introduction of the
+linkgit:git-sparse-checkout[1] command, which provides a much easier
+way to configure and handle the skip-worktree bits. If you want to
+reduce your working tree to only deal with a subset of the files in
+the repository, we strongly encourage the use of
+linkgit:git-sparse-checkout[1] in preference to the low-level
+update-index and read-tree primitives.
+
+The primary purpose of the skip-worktree bit is to enable sparse
+checkouts, i.e. to have working directories with only a subset of
+paths present. When the skip-worktree bit is set, Git commands (such
+as `switch`, `pull`, `merge`) will avoid writing these files.
+However, these commands will sometimes write these files anyway in
+important cases such as conflicts during a merge or rebase. Git
+commands will also avoid treating the lack of such files as an
+intentional deletion; for example `git add -u` will not not stage a
+deletion for these files and `git commit -a` will not make a commit
+deleting them either.

 Although this bit looks similar to assume-unchanged bit, its goal is
-different from assume-unchanged bit's. Skip-worktree also takes
-precedence over assume-unchanged bit when both are set.
+different. The assume-unchanged bit is for leaving the file in the
+working tree but having Git omit checking it for changes and presuming
+that the file has not been changed (though if it can determine without
+stat'ing the file that it has changed, it is free to record the
+changes). skip-worktree tells Git to ignore the absence of the file,
+avoid updating it when possible with commands that normally update
+much of the working directory (e.g. `checkout`, `switch`, `pull`,
+etc.), and not have its absence be recorded in commits. Note that in
+sparse checkouts (setup by `git sparse-checkout` or by configuring
+core.sparseCheckout to true), if a file is marked as skip-worktree in
+the index but is found in the working tree, Git will clear the
+skip-worktree bit for that file.

 SPLIT INDEX
 -----------

repository.c

Lines changed: 7 additions & 0 deletions

@@ -301,6 +301,13 @@ int repo_read_index(struct repository *repo)
 	if (repo->settings.command_requires_full_index)
 		ensure_full_index(repo->index);

+	/*
+	 * If sparse checkouts are in use, check whether paths with the
+	 * SKIP_WORKTREE attribute are missing from the worktree; if not,
+	 * clear that attribute for that path.
+	 */
+	clear_skip_worktree_from_present_files(repo->index);
+
 	return res;
 }

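Reviewer sketch (not part of the commit): the comment added to repo_read_index() describes clearing a flag on every entry whose path turns out to be present on disk. A minimal standalone illustration of that idea follows. The names `demo_entry`, `DEMO_SKIP_WORKTREE`, `demo_path_present` and `demo_clear_flag_from_present` are hypothetical and are not Git's types or API; the real code works on `struct cache_entry` and `CE_SKIP_WORKTREE`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for lstat() under strict -std= modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical flag bit and entry type, for illustration only. */
#define DEMO_SKIP_WORKTREE (1u << 0)

struct demo_entry {
	const char *path;
	unsigned int flags;
};

/* Return 1 if something exists at path (symlinks are not followed). */
static int demo_path_present(const char *path)
{
	struct stat st;

	return lstat(path, &st) == 0;
}

/*
 * Clear DEMO_SKIP_WORKTREE on every flagged entry whose path is present.
 * Entries without the flag are never stat'ed. Returns the number of
 * entries whose flag was cleared.
 */
static size_t demo_clear_flag_from_present(struct demo_entry *entries,
					   size_t nr)
{
	size_t i, cleared = 0;

	assert(entries || nr == 0);

	for (i = 0; i < nr; i++) {
		if ((entries[i].flags & DEMO_SKIP_WORKTREE) &&
		    demo_path_present(entries[i].path)) {
			entries[i].flags &= ~DEMO_SKIP_WORKTREE;
			cleared++;
		}
	}
	return cleared;
}
```

Checking the flag before calling lstat() matters for cost: in a non-sparse index no entry carries the flag, so the loop performs no system calls at all.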
sparse-index.c

Lines changed: 73 additions & 0 deletions

@@ -337,6 +337,79 @@ void ensure_correct_sparsity(struct index_state *istate)
 		ensure_full_index(istate);
 }

+static int path_found(const char *path, const char **dirname, size_t *dir_len,
+		      int *dir_found)
+{
+	struct stat st;
+	char *newdir;
+	char *tmp;
+
+	/*
+	 * If dirname corresponds to a directory that doesn't exist, and this
+	 * path starts with dirname, then path can't exist.
+	 */
+	if (!*dir_found && !memcmp(path, *dirname, *dir_len))
+		return 0;
+
+	/*
+	 * If path itself exists, return 1.
+	 */
+	if (!lstat(path, &st))
+		return 1;
+
+	/*
+	 * Otherwise, path does not exist so we'll return 0...but we'll first
+	 * determine some info about its parent directory so we can avoid
+	 * lstat calls for future cache entries.
+	 */
+	newdir = strrchr(path, '/');
+	if (!newdir)
+		return 0; /* Didn't find a parent dir; just return 0 now. */
+
+	/*
+	 * If path starts with directory (which we already lstat'ed and found),
+	 * then no need to lstat parent directory again.
+	 */
+	if (*dir_found && *dirname && memcmp(path, *dirname, *dir_len))
+		return 0;
+
+	/* Free previous dirname, and cache path's dirname */
+	*dirname = path;
+	*dir_len = newdir - path + 1;
+
+	tmp = xstrndup(path, *dir_len);
+	*dir_found = !lstat(tmp, &st);
+	free(tmp);
+
+	return 0;
+}
+
+void clear_skip_worktree_from_present_files(struct index_state *istate)
+{
+	const char *last_dirname = NULL;
+	size_t dir_len = 0;
+	int dir_found = 1;
+
+	int i;
+
+	if (!core_apply_sparse_checkout)
+		return;
+
+restart:
+	for (i = 0; i < istate->cache_nr; i++) {
+		struct cache_entry *ce = istate->cache[i];
+
+		if (ce_skip_worktree(ce) &&
+		    path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
+			if (S_ISSPARSEDIR(ce->ce_mode)) {
+				ensure_full_index(istate);
+				goto restart;
+			}
+			ce->ce_flags &= ~CE_SKIP_WORKTREE;
+		}
+	}
+}
+
 /*
  * This static global helps avoid infinite recursion between
  * expand_to_path() and index_file_exists().

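Reviewer sketch (not part of the commit): path_found() avoids repeated lstat() calls by remembering whether the parent directory of the last missing path exists. The standalone variant below shows the same caching idea under simplified assumptions: it copies the directory prefix into a fixed buffer instead of pointing into the entry name, and it counts lstat() calls so the saving is observable. `struct demo_dir_cache`, `demo_counted_lstat` and `demo_cached_path_found` are hypothetical names, and the control flow is a reinterpretation, not a copy of Git's function.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for lstat() under strict -std= modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical cache of the most recently checked parent directory. */
struct demo_dir_cache {
	char dir[4096];		/* directory prefix, including trailing '/' */
	size_t len;		/* length of dir; 0 means nothing cached */
	int exists;		/* result of lstat() on dir */
	unsigned lstat_calls;	/* instrumentation for this demo only */
};

static int demo_counted_lstat(struct demo_dir_cache *c, const char *path)
{
	struct stat st;

	c->lstat_calls++;
	return lstat(path, &st) == 0;
}

/* Return 1 if path exists, 0 otherwise, consulting the cache first. */
static int demo_cached_path_found(struct demo_dir_cache *c, const char *path)
{
	const char *slash;
	size_t len;

	assert(c && path);

	/* A path below a directory known to be missing cannot exist. */
	if (c->len && !c->exists && !strncmp(path, c->dir, c->len))
		return 0;

	if (demo_counted_lstat(c, path))
		return 1;

	/* Path is missing; record facts about its parent for later calls. */
	slash = strrchr(path, '/');
	if (!slash)
		return 0;	/* no parent directory to cache */
	len = (size_t)(slash - path) + 1;
	if (len >= sizeof(c->dir))
		return 0;	/* prefix too long for the demo buffer */

	/* Same parent as the cached one, already known to exist. */
	if (c->exists && c->len == len && !strncmp(path, c->dir, len))
		return 0;

	memcpy(c->dir, path, len);
	c->dir[len] = '\0';
	c->len = len;
	c->exists = demo_counted_lstat(c, c->dir);
	return 0;
}
```

The cache pays off because index entries are sorted by path, so entries below the same missing directory arrive consecutively and each one after the first is answered by the prefix comparison alone.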
sparse-index.h

Lines changed: 1 addition & 0 deletions

@@ -5,6 +5,7 @@ struct index_state;
 #define SPARSE_INDEX_MEMORY_ONLY (1 << 0)
 int convert_to_sparse(struct index_state *istate, int flags);
 void ensure_correct_sparsity(struct index_state *istate);
+void clear_skip_worktree_from_present_files(struct index_state *istate);

 /*
  * Some places in the codebase expect to search for a specific path.

t/t1011-read-tree-sparse-checkout.sh

Lines changed: 22 additions & 1 deletion

@@ -187,11 +187,32 @@ test_expect_success 'read-tree updates worktree, absent case' '
 	test ! -f init.t
 '

+test_expect_success 'read-tree will not throw away dirty changes, non-sparse' '
+	echo "/*" >.git/info/sparse-checkout &&
+	read_tree_u_must_succeed -m -u HEAD &&
+
+	echo dirty >init.t &&
+	read_tree_u_must_fail -m -u HEAD^ &&
+	test_path_is_file init.t &&
+	grep -q dirty init.t
+'
+
+test_expect_success 'read-tree will not throw away dirty changes, sparse' '
+	echo "/*" >.git/info/sparse-checkout &&
+	read_tree_u_must_succeed -m -u HEAD &&
+
+	echo dirty >init.t &&
+	echo sub/added >.git/info/sparse-checkout &&
+	read_tree_u_must_fail -m -u HEAD^ &&
+	test_path_is_file init.t &&
+	grep -q dirty init.t
+'
+
 test_expect_success 'read-tree updates worktree, dirty case' '
 	echo sub/added >.git/info/sparse-checkout &&
 	git checkout -f top &&
 	echo dirty >init.t &&
-	read_tree_u_must_succeed -m -u HEAD^ &&
+	read_tree_u_must_fail -m -u HEAD^ &&
 	grep -q dirty init.t &&
 	rm init.t
 '
