Commit 99bd27e

Author: Junio C Hamano
Update TOpic script to show how old they are.
Signed-off-by: Junio C Hamano <junkio@cox.net>
1 parent 8cbf8ea commit 99bd27e

3 files changed, +199 -6 lines changed

ClonePlus.txt

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
1+
From: Junio C Hamano <junkio@cox.net>
2+
Subject: Re: Make "git clone" less of a deathly quiet experience
3+
Date: Sun, 12 Feb 2006 19:36:41 -0800
4+
Message-ID: <7v4q3453qu.fsf@assigned-by-dhcp.cox.net>
5+
References: <Pine.LNX.4.64.0602102018250.3691@g5.osdl.org>
6+
<7vwtg2o37c.fsf@assigned-by-dhcp.cox.net>
7+
<Pine.LNX.4.64.0602110943170.3691@g5.osdl.org>
8+
<1139685031.4183.31.camel@evo.keithp.com> <43EEAEF3.7040202@op5.se>
9+
<1139717510.4183.34.camel@evo.keithp.com>
10+
<46a038f90602121806jfcaac41tb98b8b4cd4c07c23@mail.gmail.com>
11+
Content-Type: text/plain; charset=us-ascii
12+
Cc: Keith Packard <keithp@keithp.com>, Andreas Ericsson <ae@op5.se>,
13+
Linus Torvalds <torvalds@osdl.org>,
14+
Git Mailing List <git@vger.kernel.org>,
15+
Petr Baudis <pasky@suse.cz>
16+
Return-path: <git-owner@vger.kernel.org>
17+
In-Reply-To: <46a038f90602121806jfcaac41tb98b8b4cd4c07c23@mail.gmail.com>
18+
(Martin Langhoff's message of "Mon, 13 Feb 2006 15:06:42 +1300")
19+
20+
Martin Langhoff <martin.langhoff@gmail.com> writes:
21+
22+
> +1... there should be an easy-to-compute threshold trigger to say --
23+
> hey, let's quit being smart and send this client the packs we got and
24+
> get it over with. Or perhaps a client flag so large projects can
25+
> recommend that users do their initial clone with --gimme-all-packs?
26+
27+
What upload-pack does boils down to:
28+
29+
* find out the latest of what client has and what client asked.
30+
31+
* run "rev-list --objects ^client ours" to make a list of
32+
objects client needs. The actual command line has multiple
33+
"clients" to exclude what is unneeded to be sent, and
34+
multiple "ours" to include refs asked. When you are doing
35+
a full clone, ^client is empty and ours is essentially
36+
--all.
37+
38+
* feed that output to "pack-objects --stdout" and send out
39+
the result.
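
The three steps above can be tried out locally. What follows is only a rough sketch, not what upload-pack actually runs: it builds a throwaway repository (the tag name `client`, the file name and the committer identity are all made up for illustration) and compares what a full clone and an incremental fetch would have to send.

```shell
#!/bin/sh
set -e

# Throwaway repository; everything in it is illustrative.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q

commit() {
    git -c user.name=Example -c user.email=example@example.com commit -q -m "$1"
}

echo one >file && git add file && commit first
git tag client                      # pretend the client already has this commit
echo two >file && git add file && commit second

# Full clone: nothing to exclude, everything reachable is listed.
full=$(git rev-list --objects --all | wc -l | tr -d ' ')

# Incremental fetch: exclude what the client has, include what it asked for.
incremental=$(git rev-list --objects ^client HEAD | wc -l | tr -d ' ')

# Feed the object list to pack-objects, as in the last step above.
git rev-list --objects ^client HEAD |
    git pack-objects --stdout -q >incremental.pack
magic=$(head -c 4 incremental.pack)  # pack streams start with the "PACK" signature

echo "full clone: $full objects, incremental fetch: $incremental objects"
```

With two commits that each touch the same single file, the full listing holds six objects (two commits, two trees, two blobs) and the incremental one holds three.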
40+
41+
If you run this command:
42+
43+
$ git-rev-list --objects --all |
44+
git-pack-objects --stdout >/dev/null
45+
46+
It would say some things. The phases of operations are:
47+
48+
Generating pack...
49+
Counting objects XXXX...
50+
Done counting XXXX objects.
51+
Packing XXXXX objects.....
52+
53+
Phase (1). Between the time it says "Generating pack..." up to
54+
"Done counting XXXX objects.", the time is spent by rev-list to
55+
list up all the objects to be sent out.
56+
57+
Phase (2). After that, it tries to decide which object to
58+
delta against which other object, while twenty or so dots are
59+
printed after "Packing XXXXX objects." (see #git irc log a
60+
couple of days ago; Linus describes how pack building works).
61+
62+
Phase (3). After the dots stop, the program becomes silent.
63+
That is where it actually does delta compression and writeout.
64+
65+
You would notice that quite a lot of time is spent in all
66+
phases.
67+
68+
There is an internal hook to create a full repository pack inside
69+
upload-pack (which is what runs on the other end when you run
70+
fetch-pack or clone-pack), but it works slightly differently
71+
from what you are suggesting, in that it still tries to do the
72+
"correct" thing. It still runs "rev-list --objects --all", so
73+
"dangling objects" are never sent out.
74+
75+
We could cheat in all phases to speed things up, at the expense
76+
of ending up sending excess objects. So let's pretend we
77+
decided to treat everything in .git/objects/pack/pack-* (and
78+
the ones found in alternates as well) as having interesting objects
79+
for the cloner.
80+
81+
(1) This part unfortunately cannot be totally eliminated. By
82+
assuming all packs are interesting, we could use the object
83+
names from the pack index, which is a lot cheaper than
84+
rev-list object traversal. We still need to run rev-list
85+
--objects --all --unpacked to pick up loose objects we would
86+
not be able to tell by looking at the pack index to cover
87+
the rest.
88+
89+
This however needs to be done in conjunction with the second
90+
phase change. pack-objects depends on the hint rev-list
91+
--objects output gives it to group the blobs and trees with
92+
the same pathnames together, and that greatly affects the
93+
packing efficiency. Unfortunately pack index does not have
94+
that information -- it does not know type, nor pathnames.
95+
Type is relatively cheap to obtain but pathnames for blob
96+
objects are inherently unavailable.
97+
98+
(2) This part can be mostly eliminated for already packed
99+
objects, because we have already decided to cheat by sending
100+
everything, so we can just reuse how objects are deltified
101+
in existing packs. It still needs to be done for loose
102+
objects we collected to fill the gap in (1).
103+
104+
(3) This also can be sped up by reusing what are already in
105+
packs. Pack index records starting (but not end) offset of
106+
each object in the pack, so we can sort by offset to find
107+
out which part of the existing pack corresponds to what
108+
object, to reorder the objects in the final pack. This
109+
needs to be done somewhat carefully to preserve the locality
110+
of objects (again, see #git log). The deltifying and
111+
compressing for loose objects cannot be avoided.
112+
113+
While we are writing things out in (3), we need to keep
114+
track of running SHA1 sum of what we write out so that we
115+
can fill out the correct checksum at the end, but I am
116+
guessing that is relatively cheap compared to the
117+
deltification and compression cost we are currently paying
118+
in this phase.
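
The offset trick in (3) can be sketched with canned data. The listing below is hypothetical: the shortened object names, the offsets and the pack size are invented, and the "offset name" shape merely mirrors what `git show-index` prints for a pack index. The only facts relied on are that the index stores start offsets and that a pack ends with a 20-byte SHA-1 trailer.

```shell
#!/bin/sh
# Hypothetical "offset object-name" pairs. A real index is sorted by name,
# so the offsets come out of order, as they do here.
index_listing='1530 c3c3c3c
12 a1a1a1a
480 b2b2b2b'

# Assumed total pack size in bytes; the final 20 bytes are the SHA-1 trailer.
pack_size=2000

# Sort by offset. Each object ends where the next one begins, and the last
# object ends where the trailer starts. Output: name, start offset, length.
extents=$(printf '%s\n' "$index_listing" |
    sort -n |
    awk -v end="$((pack_size - 20))" '
        NR > 1 { print name, start, $1 - start }
        { start = $1; name = $2 }
        END { if (NR > 0) print name, start, end - start }
    ')

printf '%s\n' "$extents"
```

Each output line gives the byte range that could be copied verbatim out of the existing pack for that object.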
119+
120+
NB. In the #git log, Linus made it sound like I am clueless
121+
about how pack is generated, but if you check commit 9d5ab96,
122+
the "recency of delta is inherited from base", one of the tricks
123+
that have a big performance impact, was done by me ;-).
124+
125+

ResettingPaths.txt

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
1+
From: Junio C Hamano <junkio@cox.net>
2+
Subject: Resetting paths
3+
Date: Thu, 09 Feb 2006 20:40:15 -0800
4+
Message-ID: <7vlkwjzv0w.fsf@assigned-by-dhcp.cox.net>
5+
Content-Type: text/plain; charset=us-ascii
6+
Return-path: <git-owner@vger.kernel.org>
7+
8+
While working on "assume unchanged" git series, I found one
9+
thing missing from the current set of tools.
10+
11+
While I worked on parts of the system that deals with the cached
12+
lstat() information, I needed a way to debug that, so I hacked
13+
ls-files -t option to show entries marked as "always matches the
14+
index" with lowercase tag letters. This was primarily a debugging
15+
aid hack.
16+
17+
Then I committed the whole thing with "git commit -a" by
18+
mistake. In order to rewind the HEAD to pre-commit state, I can
19+
say "git reset --soft HEAD^", but after doing that, now I want
20+
to unupdate the index so that ls-files.c matches the pre-commit
21+
HEAD.
22+
23+
"git reset --mixed" is a heavy-handed tool for that. It reads
24+
the entire index from the HEAD commit without touching the
25+
working tree, so I would need to add the modified paths back
26+
with "git update-index".
27+
28+
The low-level voodoo to do so for this particular case is this
29+
single liner:
30+
31+
git ls-tree HEAD ls-files.c | git update-index --index-info
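
A minimal way to watch that one-liner work, assuming a scratch repository: the file content and the committer identity below are invented, and the script only reproduces the "staged by mistake" situation described above before undoing it.

```shell
#!/bin/sh
set -e

# Scratch repository; contents are illustrative only.
repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT
cd "$repo"
git init -q

echo original >ls-files.c
git add ls-files.c
git -c user.name=Example -c user.email=example@example.com commit -q -m initial

# Make a change and stage it by mistake.
echo "debugging hack" >>ls-files.c
git update-index ls-files.c

# Rewrite just this index entry from HEAD; the working tree file is untouched.
git ls-tree HEAD ls-files.c | git update-index --index-info

staged=$(git diff --cached --name-only)   # paths where the index differs from HEAD
unstaged=$(git diff --name-only)          # paths where the working tree differs from the index

echo "staged: '$staged'  unstaged: '$unstaged'"
```

Afterwards nothing is staged, while the edit itself survives in the working tree as an unstaged change.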
32+
33+
Have people found themselves in similar need like this? This
34+
could take different forms.
35+
36+
* you did "git update-index" on a wrong path. This is my
37+
example and the above voodoo is a recipe for recovery.
38+
39+
* you did "git add" on a wrong path and you want to remove it.
40+
This is easier than the above:
41+
42+
git update-index --force-remove path
43+
44+
* you did the above recovery from "git add" on a wrong path,
45+
and you want to add it again. The same voodoo would work in
46+
this case as well.
47+
48+
git ls-tree HEAD path | git update-index --index-info
49+
50+
We could add "git reset path..." to reduce typing for the above,
51+
but I am wondering if it is worth it.
52+
53+
BTW, this shows how "index centric" git is. With other SCMs that
54+
have only the last commit and the working tree files, you do not
55+
have to worry about any of these things, so it might appear that the index
56+
is just a nuisance. But if you do not have any "registry of
57+
paths to be committed", you cannot do a partial commit like what
58+
I did above ("commit changes to all files other than
59+
ls-files.c") without listing all the paths to be committed, or
60+
fall back on CVS style "one path at a time", breaking an atomic
61+
commit, so there is a drawback to not having an index as well.
62+
63+
64+

TO

Lines changed: 10 additions & 6 deletions
@@ -37,7 +37,7 @@ sed -n \
37 37
-e '/^[^\/][^\/]\//p' |
38 38
while read topic
39 39
do
40-
rebase= done= not_done= trouble=
40+
rebase= done= not_done= trouble= date=
41 41

42 42
# (1)
43 43
only_next_1=`git-rev-list ^master "^$topic" ${next} | sort`
@@ -55,23 +55,27 @@ do
55 55

56 56
# (2)
57 57
not_in_master=`
58-
git-rev-list --pretty=oneline ^master "$topic" |
59-
sed -e 's/^[0-9a-f]* //'
58+
git-rev-list ^master "$topic"
60 59
`
61 60
test -z "$not_in_master" &&
62 61
done="${LF}Fully merged -- delete."
63 62

64 63
# (3)
65 64
not_in_next=`
66-
git-rev-list --pretty=oneline ^${next} "$topic" |
67-
sed -e 's/^[0-9a-f]* / - /'
65+
git-rev-list --pretty=oneline ^${next} "$topic"
68 66
`
69 67
if test -n "$not_in_next"
70 68
then
71 69
if test -n "$done"
72 70
then
73 71
trouble="${LF}### MODIFIED AFTER COOKED ###"
74 72
fi
73+
last=`expr "$not_in_next" : '\([0-9a-f]*\) '`
74+
date=`
75+
git-rev-list -1 --pretty "$last" |
76+
sed -ne 's/^Date: *\(.*\)/ (\1)/p'
77+
`
78+
not_in_next=`echo "$not_in_next" | sed -e 's/^[0-9a-f]* / - /'`
75 79
not_done="${LF}Still not merged in ${next}$rebase.$LF$not_in_next"
76 80
elif test -n "$done"
77 81
then
@@ -80,7 +84,7 @@ do
80 84
not_done="${LF}Up to date."
81 85
fi
82 86

83-
echo "*** $topic ***$trouble$done$not_done"
87+
echo "*** $topic ***$date$trouble$done$not_done"
84 88

85 89
if test -z "$trouble$not_done" &&
86 90
test -n "$done" &&
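
The added lines 73+ to 78+ can be exercised without a repository. In this sketch the two variables holding command output are canned stand-ins (made-up object names, author and subjects) for what `git-rev-list --pretty=oneline` and `git-rev-list -1 --pretty` would print; the `expr` and `sed` expressions are the ones from the diff.

```shell
#!/bin/sh
# Canned stand-in for `git-rev-list --pretty=oneline ^next topic`
# (newest commit first; object names and subjects are hypothetical).
not_in_next='1111111111111111111111111111111111111111 newest commit on the topic
2222222222222222222222222222222222222222 older commit on the topic'

# expr anchors its pattern at the start of the string, so this captures
# the object name of the first (newest) line only.
last=$(expr "$not_in_next" : '\([0-9a-f]*\) ')

# Canned stand-in for `git-rev-list -1 --pretty "$last"`.
pretty="commit $last
Author: A U Thor <author@example.com>
Date:   Sun Feb 12 19:36:41 2006 -0800

    newest commit on the topic"

# Keep only the Date: header and wrap its value in parentheses.
date=$(printf '%s\n' "$pretty" | sed -ne 's/^Date: *\(.*\)/ (\1)/p')

# Only now turn the object names into list bullets for display.
listing=$(printf '%s\n' "$not_in_next" | sed -e 's/^[0-9a-f]* / - /')

echo "*** topic ***$date"
printf '%s\n' "$listing"
```

The ordering matters: the object name has to be pulled out of `$not_in_next` before the final `sed` strips it, which is why the diff moves that `sed` out of the original pipeline.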
