How to clone a git repository that got too big?

I am working with a git repo that is very large (> 10 GB). The repo contains many large binary files (> 100 MB each), with many versions of each. The reasons for this are beyond the scope of this question.

It is currently no longer possible to clone the repo successfully, because the server does not have enough memory (it has 12 GB) and returns a failure code. I would include the error here, but it takes more than an hour to reach the failure.

Is there any way to make a clone succeed? Even one that grabs only a partial copy of the repo? Or a way to clone in bite-sized chunks that won't choke the server?

+7
git
5 answers

Use rsync to copy the entire repo by pointing it at the top-level directory containing .git. Then change the remotes in .git/config to point back at the original.

The remote URL is the only key I can think of off the top of my head that needs to be changed in .git/config, but I would look over any other host-specific settings too. Most of them are fairly self-explanatory.
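
A rough sketch of the rsync approach, assuming you have SSH access to the server. The function names copy_repo and repoint_origin, and any host or path you pass to them, are placeholders of mine rather than anything from the question:

```shell
# Sketch only: host names, paths, and function names are placeholders.

# Copy a repository directory (working tree plus .git) with rsync.
# -a preserves permissions and timestamps; --partial keeps partially
# transferred files so a re-run after an interruption can build on them.
copy_repo() {
  # $1: source, e.g. user@server:/path/to/repo/ (trailing slash = contents)
  # $2: local destination directory
  rsync -a --partial "$1" "$2"
}

# Make "origin" in the copied repo point at the real repository URL.
repoint_origin() {
  # $1: local repo directory, $2: URL of the original repository
  if git -C "$1" remote get-url origin >/dev/null 2>&1; then
    git -C "$1" remote set-url origin "$2"
  else
    # The copied repo may have had no "origin" (it was the server's copy).
    git -C "$1" remote add origin "$2"
  fi
}
```

You would call copy_repo with the server path and a local directory, then repoint_origin with that directory and the URL you normally clone from. Using git remote set-url avoids editing .git/config by hand, though it is still worth reading the file for other host-specific entries.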

+4

One answer to the question "How to clone a git repository that has become too large?" is "Reduce its size by removing the large blobs."

(I should acknowledge that the question says fixing the repo is "beyond the scope of this question." However, a comment also says, "I'm looking for a quick fix so I can clone the repo right now," so I am posting this answer because a) they may not know about the BFG and may therefore be overestimating the difficulty of cleaning the repo, and b) it really is very fast.)

To clean up the repo easily and quickly, use the BFG:

 $ java -jar bfg.jar --strip-blobs-bigger-than 100M my-repo.git 

Any old files larger than 100 MB (that are not in your latest commit) will be removed from the repository's history. Then, from inside the repo, expire the reflog and run git gc to delete the dead data:

 $ git reflog expire --expire=now --all && git gc --prune=now --aggressive 

Once this is done, your repo will be much smaller and should clone without any problems.
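
To pick a sensible value for --strip-blobs-bigger-than, it can help to first see which blobs in the history are actually large. This is a minimal sketch using stock git plumbing, not part of the BFG. The function name list_large_blobs is made up for illustration, and it assumes you run it inside the repo on a machine that already has the full history (for example, the server):

```shell
# Sketch only: list_large_blobs is a hypothetical helper, not a git command.
# Prints "<size-in-bytes> <path>" for every blob in the history of the repo
# in the current directory that is larger than a threshold, biggest first.
list_large_blobs() {
  # $1: threshold in bytes (default 100 MiB)
  min=${1:-104857600}
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    # Keep blobs over the threshold and drop the leading "blob " (5 chars).
    awk -v min="$min" '$1 == "blob" && $2 > min { print substr($0, 6) }' |
    sort -rn
}
```

This only reads object headers, so it should be much cheaper than anything that has to build a pack.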

Full disclosure: I am the author of BFG Repo-Cleaner.

+8

You can try passing the --depth option to git clone. Or could you copy it using rsync or something similar?
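
One way the --depth idea might be turned into "bite-sized chunks" is to start with a shallow clone and then deepen it a few commits at a time. This is only a sketch: clone_in_steps is a hypothetical helper name, the step size is a guess you would need to tune, and it assumes git 2.15 or newer on the client:

```shell
# Sketch only: clone_in_steps is a hypothetical helper, not a git command.
# Needs git 2.15+ for `rev-parse --is-shallow-repository`.
clone_in_steps() {
  # $1: repository URL, $2: destination directory, $3: commits added per step
  url=$1 dest=$2 step=${3:-50}
  # Start with only the newest commit of the default branch.
  git clone --depth 1 "$url" "$dest" || return 1
  # Deepen a little at a time until the full history is present.
  while [ "$(git -C "$dest" rev-parse --is-shallow-repository)" = "true" ]; do
    before=$(git -C "$dest" rev-list --count --all)
    git -C "$dest" fetch --deepen="$step" || return 1
    after=$(git -C "$dest" rev-list --count --all)
    # Stop if a round brought nothing new, so the loop cannot spin forever.
    [ "$after" -gt "$before" ] || break
  done
}
```

Note that --depth implies a single-branch clone, so other branches would have to be fetched separately afterwards. Each round still makes the server build a pack, just a much smaller one, so whether this stays within the server's memory depends on how large the individual commits are.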

+5

If you have physical or shell access to the server, you can transfer the repo manually via an external hard drive or FTP. If the repo is bare, see How to Convert a Bare Git Repository to a Normal In Place.

+1

Try reconfiguring the pack-creation options on the serving repo. In particular, git defaults to no limit for pack.windowMemory.

I would start with

 git config pack.windowmemory 1g 

because by default it will use up to that much per core.
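
A slightly fuller sketch of server-side settings that might reduce memory use while packing. Only pack.windowMemory comes from the suggestion above; pack.threads and core.bigFileThreshold are my own additions, the values are untested guesses, and limit_pack_memory is a made-up helper name:

```shell
# Sketch only: the values are starting points to tune, not recommendations.
# $1: path to the repository on the server that clients clone from.
limit_pack_memory() {
  # Cap the delta-search window memory (the limit applies per thread).
  git -C "$1" config pack.windowMemory 1g
  # Assumption: one packing thread keeps peak memory lower, at a speed cost.
  git -C "$1" config pack.threads 1
  # Assumption: store files over 50 MiB without attempting delta compression.
  git -C "$1" config core.bigFileThreshold 50m
}
```

These are ordinary git config keys, so they can be reverted with git config --unset if they turn out not to help.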

+1
