git@vger.kernel.org mailing list mirror (one of many)
From: Tomas Mudrunka <mudrunka@spoje.net>
To: git@vger.kernel.org
Subject: "Garbage collect" old commits in git repository to free disk space
Date: Tue, 18 Feb 2020 01:29:24 +0100
Message-ID: <8a4001f7b0e23e7df3172deeb32e0553@spoje.net>

Hello,
is there a safe way to garbage collect old commits from a git repository?
Let's say that I want to always keep only the last 100 commits and throw
everything older away, to achieve a goal similar to git clone --depth=100,
but on the server side. I had partial success with doing a shallow clone
and then converting it to a bare repo while removing the shallow flag from
.git/config. But I didn't like that solution and wasn't really sure
what the consequences might be in terms of data integrity and forward
compatibility with newer git versions.
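
For illustration, the shallow-clone route described above might look roughly
like the sketch below. It is not a recommendation and does not settle the
safety question. The throwaway 12-commit repository, the depth of 5 (standing
in for 100), and all paths are assumptions made up for the demonstration.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Throwaway source repository with 12 empty commits (stand-in for real history).
git init -q "$work/full"
i=1
while [ "$i" -le 12 ]; do
    git -C "$work/full" -c user.name=demo -c user.email=demo@example.invalid \
        commit -q --allow-empty -m "commit $i"
    i=$((i + 1))
done

# A file:// URL is used because git ignores --depth for plain local paths.
git clone -q --bare --depth=5 "file://$work/full" "$work/trimmed.git"

full_count=$(git -C "$work/full" rev-list --count HEAD)
trimmed_count=$(git -C "$work/trimmed.git" rev-list --count HEAD)
echo "full: $full_count, trimmed: $trimmed_count"
```

The bare clone sees only the 5 newest commits. Git records the cut-off point
in a file named "shallow" inside the repository directory, so the result is
still a shallow repository as far as git is concerned.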

To tell you more about my USE CASE:

I want to create free open-source software similar to Dropbox, but based
on git. My idea is as follows:

1.) Automatically pull/commit/push changed files to/from several laptops
to a single git server (and forcefully resolve all conflicts; this will
work unless you plan to use it for software development)
2.) On the central server, maintain tags indicating the latest commits
synchronized to individual laptops.
3.) On the server, delete old commits that are no longer needed by any
laptop to sync its worktree. Once synced, delete these commits on the
laptops as well. (Optionally leave e.g. 1 month or 1 GB of old commits in
case you might need to roll back. Possibly keep the history only on the
server, while deleting it from clients.)
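
As a rough sketch of steps 2 and 3, the per-laptop tags could be used to find
the newest commit that every laptop already has; anything older would be a
candidate for removal. The sketch only identifies that cut-off and does not
delete anything. The tag names (synced/laptop-a, synced/laptop-b), the
6-commit linear history, and the use of git merge-base --octopus for this
purpose are assumptions for illustration.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
git init -q "$repo"

# Helper for the demo: create one empty commit with the given subject.
commit() {
    git -C "$repo" -c user.name=demo -c user.email=demo@example.invalid \
        commit -q --allow-empty -m "$1"
}

# Linear history of 6 commits; laptop A last synced at commit 2, laptop B at 4.
for n in 1 2 3 4 5 6; do
    commit "change $n"
    case $n in
        2) git -C "$repo" tag synced/laptop-a ;;
        4) git -C "$repo" tag synced/laptop-b ;;
    esac
done

# Best common ancestor of all sync tags: the newest commit every laptop has.
cutoff=$(git -C "$repo" merge-base --octopus $(git -C "$repo" tag --list 'synced/*'))
cutoff_subject=$(git -C "$repo" log -1 --format=%s "$cutoff")

# Commits strictly older than the cut-off are no longer needed for syncing.
prunable=$(git -C "$repo" rev-list --count "$cutoff^")
# The cut-off itself and everything newer must be kept.
kept=$(git -C "$repo" rev-list --count "$cutoff^..HEAD")

echo "cut-off: $cutoff_subject, prunable: $prunable, kept: $kept"
```

With the history above, the cut-off is the commit tagged for the laptop that
is furthest behind ("change 2"), leaving 1 older commit as prunable and 5
commits to keep.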

This way computers can stay in sync forever without running out of disk 
space, because old commits are removed.
E.g. if I accidentally add some very big file to the synced folder and then
delete it, it will eventually get deleted once everybody gets in sync
again.

I am aware that this is not something git was designed for, but to
me it seems like it should be more than doable. Could any of you
give me some hints on how to approach this problem, please?


These are some projects which inspired me to explore this route:

https://github.com/presslabs/gitfs
https://www.syncany.org/
https://www.cis.upenn.edu/~bcpierce/unison/
https://etckeeper.branchable.com/

-- 
S pozdravem
Best regards
      Tomáš Mudruňka - SPOJE.NET s.r.o.
