From: 16657101987@163.com
To: unlisted-recipients:; (no To-header on input)
Cc: git@vger.kernel.org, gitster@pobox.com, worldhello.net@gmail.com,
Sun Chao <sunchao9@huawei.com>
Subject: [PATCH v1 1/1] pack-refs: pack expired loose refs to packed_refs
Date: Mon, 22 Jul 2019 02:17:39 +0800 [thread overview]
Message-ID: <20190721181739.81110-2-16657101987@163.com> (raw)
In-Reply-To: <20190721181739.81110-1-16657101987@163.com>
From: Sun Chao <sunchao9@huawei.com>
When a packed ref is deleted, the whole packed-refs file is
rewrite and omit the ref that no longer exists. However if
another gc command is running and call `pack-refs --all`
simultaneously, there is a change that a ref just updated
will lost the newly commits.
Through these steps, losting commits of newly updated refs
could be demonstrated:
# step 1: compile git without `USE_NSEC` option
Some kernerl releases does enable it by default while some
does not. And if we compile git without `USE_NSEC`, it
will be easier demonstrated by the following steps.
# step 2: setup a bare repository and clone it as child
git init --bare parent &&
(cd parent && git config core.logallrefupdates true) &&
git clone parent child
# step 3: in one terminal, repack the bare refs repeatedly
cd parent &&
while true; do
git pack-refs --all
done
# step 4: in another terminal, simultaneously update the
# master with update-ref, and create and delete an
# unrelated ref also with update-ref
cd child &&
while true; do
git commit --allow-empty -m foo &&
us=`git rev-parse master` &&
pushd ../parent &&
git fetch ../child/.git master &&
git update-ref refs/heads/newbranch $us &&
git update-ref refs/heads/master $us &&
git update-ref -d refs/heads/newbranch &&
them=`git rev-parse master` &&
if test "$them" != "$us"; then
echo >&2 "lost commit: $us"
exit 1
fi
popd
done
Though we have the packed-refs lock file and loose refs lock
files to avoid updating conflicts, a ref will lost its newly
commits if the situation which is described as racy-git by
`Documentation/technical/racy-git.txt` happens, it comes like
this:
1. Call `pack-refs --all` to pack all the loose refs to
packed-refs, and let say the modify time of the
packed-refs is DATE_M.
2. Call `update-ref` to update a new commit to master while
it is already packed. the old value (let us call it
OID_A) remains in the packed-refs file and write the new
value (let us call it OID_B) to $GIT_DIR/refs/heads/master.
3. Call `update-ref -d` within the same DATE_M from the 1th
step to delete a different ref newbranch which is packed
in the packed-refs file. It check newbranch's oid from
packed-refs file without locking it.
Meanwhile it keeps a snapshot of the packed-refs file in
memory and record the file's timestamp with the snapshot.
The oid of master in the packed-refs's snapshot is OID_A.
4. Redo the 1th step, after `pack-refs --all` finished, the
oid of master in packe-refs file is OID_B, and the loose
refs $GIT_DIR/refs/heads/master is removed. Let's say
the `pack-refs --all` is very quickly done and the new
packed-refs file's modify time is stille DATE_M.
5. 3th step now going on, after checking the newbranch, it
begin to rewrite the packed-refs file, after get the lock
file of packed-ref file, it checks the timestamp of it's
snapshot in memory with the packed-refs file's time,
they are both the same DATE_M, so the snapshot is not
refreshed.
Because the loose ref of master is removed by 4th step,
`update-ref -d` will updates the new packed-ref to disk
which contains master with the oid OID-A. So now the
newly commit OID-B of master is lost.
There are two valid methods to avoid losting commit of ref:
- force `update-ref -d` to update the snapshot before
rewrite packed-refs.
- do not pack a recently updated ref, where *recently*
could be set by *pack.looserefsexpire* option.
I prefer **do not pack a recently updated ref**, here is the
reasons:
1. It could avoid losting the newly commit of a ref which I
described upon.
2. Sometime, the git server will do `pack-refs --all` and
`update-ref` the same time, and the two commands have
chances to trying lock the same ref such as master, if
this happends one command will fail with such error:
**cannot lock ref 'refs/heads/master'**
This could happen if a ref is updated frequently, and
avoid pack the ref which is update recently could avoid
this error most of the time.
Signed-off-by: Sun Chao <sunchao9@huawei.com>
---
builtin/pack-refs.c | 13 ++++++++++++-
refs.c | 4 ++--
refs.h | 2 +-
refs/files-backend.c | 18 +++++++++++++++++-
refs/packed-backend.c | 2 +-
refs/refs-internal.h | 2 +-
t/helper/test-ref-store.c | 2 +-
7 files changed, 35 insertions(+), 8 deletions(-)
diff --git a/builtin/pack-refs.c b/builtin/pack-refs.c
index cfbd5c36c7..7baced5788 100644
--- a/builtin/pack-refs.c
+++ b/builtin/pack-refs.c
@@ -9,16 +9,27 @@ static char const * const pack_refs_usage[] = {
NULL
};
+static const char *pack_loose_refs_expire = "now";
+
int cmd_pack_refs(int argc, const char **argv, const char *prefix)
{
unsigned int flags = PACK_REFS_PRUNE;
struct option opts[] = {
OPT_BIT(0, "all", &flags, N_("pack everything"), PACK_REFS_ALL),
OPT_BIT(0, "prune", &flags, N_("prune loose refs (default)"), PACK_REFS_PRUNE),
+ { OPTION_STRING, 0, "expire", &pack_loose_refs_expire, N_("date"),
+ N_("pack unactive loose refs"),
+ PARSE_OPT_OPTARG, NULL, (intptr_t)pack_loose_refs_expire },
OPT_END(),
};
+ static timestamp_t expire;
+
+ git_config_get_expiry("pack.looserefsexpire", &pack_loose_refs_expire);
git_config(git_default_config, NULL);
if (parse_options(argc, argv, prefix, opts, pack_refs_usage, 0))
usage_with_options(pack_refs_usage, opts);
- return refs_pack_refs(get_main_ref_store(the_repository), flags);
+
+ if (parse_expiry_date(pack_loose_refs_expire, &expire))
+ die(_("failed to parse '%s' value '%s'"), "pack.looserefsexpire", pack_loose_refs_expire);
+ return refs_pack_refs(get_main_ref_store(the_repository), flags, expire);
}
diff --git a/refs.c b/refs.c
index cd297ee4bd..e72f4b05dd 100644
--- a/refs.c
+++ b/refs.c
@@ -1947,9 +1947,9 @@ void base_ref_store_init(struct ref_store *refs,
}
/* backend functions */
-int refs_pack_refs(struct ref_store *refs, unsigned int flags)
+int refs_pack_refs(struct ref_store *refs, unsigned int flags, timestamp_t expire)
{
- return refs->be->pack_refs(refs, flags);
+ return refs->be->pack_refs(refs, flags, expire);
}
int refs_peel_ref(struct ref_store *refs, const char *refname,
diff --git a/refs.h b/refs.h
index 730d05ad91..0270aa9efc 100644
--- a/refs.h
+++ b/refs.h
@@ -378,7 +378,7 @@ void warn_dangling_symrefs(FILE *fp, const char *msg_fmt,
* Write a packed-refs file for the current repository.
* flags: Combination of the above PACK_REFS_* flags.
*/
-int refs_pack_refs(struct ref_store *refs, unsigned int flags);
+int refs_pack_refs(struct ref_store *refs, unsigned int flags, timestamp_t expire);
/*
* Setup reflog before using. Fill in err and return -1 on failure.
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 63e55e6773..7e6676cece 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1144,7 +1144,7 @@ static int should_pack_ref(const char *refname,
return 1;
}
-static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
+static int files_pack_refs(struct ref_store *ref_store, unsigned int flags, timestamp_t expire)
{
struct files_ref_store *refs =
files_downcast(ref_store, REF_STORE_WRITE | REF_STORE_ODB,
@@ -1152,13 +1152,19 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
struct ref_iterator *iter;
int ok;
struct ref_to_prune *refs_to_prune = NULL;
+ struct stat st;
+ struct strbuf path = STRBUF_INIT;
struct strbuf err = STRBUF_INIT;
struct ref_transaction *transaction;
+ size_t path_baselen;
transaction = ref_store_transaction_begin(refs->packed_ref_store, &err);
if (!transaction)
return -1;
+ files_ref_path(refs, &path, "");
+ path_baselen = path.len;
+
packed_refs_lock(refs->packed_ref_store, LOCK_DIE_ON_ERROR, &err);
iter = cache_ref_iterator_begin(get_loose_ref_cache(refs), NULL, 0);
@@ -1172,6 +1178,16 @@ static int files_pack_refs(struct ref_store *ref_store, unsigned int flags)
flags))
continue;
+ /*
+ * If the loose reference is active (not expired), do not pack it.
+ */
+ strbuf_setlen(&path, path_baselen);
+ strbuf_addstr(&path, iter->refname);
+ if (stat(path.buf, &st) == 0) {
+ if (st.st_mtime > expire)
+ continue;
+ }
+
/*
* Add a reference creation for this reference to the
* packed-refs transaction:
diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index c01c7f5901..2c6e6ee990 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -1550,7 +1550,7 @@ static int packed_delete_refs(struct ref_store *ref_store, const char *msg,
return ret;
}
-static int packed_pack_refs(struct ref_store *ref_store, unsigned int flags)
+static int packed_pack_refs(struct ref_store *ref_store, unsigned int flags, timestamp_t expire)
{
/*
* Packed refs are already packed. It might be that loose refs
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index f2d8c0123a..81f00e4c04 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -530,7 +530,7 @@ typedef int ref_transaction_commit_fn(struct ref_store *refs,
struct ref_transaction *transaction,
struct strbuf *err);
-typedef int pack_refs_fn(struct ref_store *ref_store, unsigned int flags);
+typedef int pack_refs_fn(struct ref_store *ref_store, unsigned int flags, timestamp_t expire);
typedef int create_symref_fn(struct ref_store *ref_store,
const char *ref_target,
const char *refs_heads_master,
diff --git a/t/helper/test-ref-store.c b/t/helper/test-ref-store.c
index 799fc00aa1..b2be2e311d 100644
--- a/t/helper/test-ref-store.c
+++ b/t/helper/test-ref-store.c
@@ -69,7 +69,7 @@ static int cmd_pack_refs(struct ref_store *refs, const char **argv)
{
unsigned int flags = arg_flags(*argv++, "flags");
- return refs_pack_refs(refs, flags);
+ return refs_pack_refs(refs, flags, TIME_MAX);
}
static int cmd_peel_ref(struct ref_store *refs, const char **argv)
--
2.22.0.214.g8dca754b1e
next prev parent reply other threads:[~2019-07-21 18:18 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-21 18:17 [PATCH v1 0/1] pack-refs: pack expired loose refs to packed_refs 16657101987
2019-07-21 18:17 ` 16657101987 [this message]
2019-07-30 6:36 ` [PATCH v1 1/1] " Jeff King
2019-07-31 18:35 ` [PATCH v2 0/1] pack-refs: always refreshing after take the lock file 16657101987
2019-07-31 18:35 ` [PATCH v2 1/1] " 16657101987
2019-08-16 20:49 ` [PATCH v2 0/1] " Jeff King
2019-08-19 17:36 ` Junio C Hamano
2019-08-20 15:14 ` 16657101987
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190721181739.81110-2-16657101987@163.com \
--to=16657101987@163.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=sunchao9@huawei.com \
--cc=worldhello.net@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).