Delete old .svn files from the repository

My SVN repository is getting quite large (5 GB), and we really don't need to go back that far in its history (maybe 6 months or a year).

I also have an 8 GB .svn folder in the root of the directory that is checked out from the repository.

I would even be fine with "starting over": keeping a copy of the old SVN repository for 6 months or a year and then ultimately deleting it, as described in "How to backup and restore all the source code in svn?"

+1
4 answers

What do you mean by your .svn repository?

The .svn folder is mainly used to manage the checked-out working copy and has absolutely nothing to do with the history stored on your repository server.

The .svn directory records which files on the client have changed, who did the checkout, and the repository URL. It also keeps a pristine copy of every checked-out file (in a .svn folder inside each directory before Subversion 1.7, and in a single .svn folder at the working-copy root since then). That is how you can run a diff to see your changes without talking to the server. It also means that if you checked out 100 MB of files, your .svn directory will contain about 100 MB as well.
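
To see how much of a working copy is administrative data, you can compare the size of its .svn directories with the size of the whole checkout. This is a minimal sketch, assuming a POSIX shell with find and du; the function name svn_admin_size and the path in the example are made up for illustration.

```shell
#!/bin/sh
# svn_admin_size (hypothetical helper): print the size in kilobytes of every
# .svn directory under the given working copy (default: current directory).
svn_admin_size() {
    # -prune stops find from descending into a .svn directory once found.
    find "${1:-.}" -type d -name .svn -prune -exec du -sk {} +
}

# Example: compare the administrative data with the whole checkout.
# svn_admin_size /path/to/working-copy
# du -sk /path/to/working-copy
```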

If you are talking about the client, you only need to check out the part of the repository that you need to work on. For example, say you have a standard Subversion repository layout:

  • http://%REPO_URL%/trunk
  • http://%REPO_URL%/tags
  • http://%REPO_URL%/branches

In the trunk section you have all your projects:

  • http://%REPO_URL%/trunk/project_foo
  • http://%REPO_URL%/trunk/project_bar
  • http://%REPO_URL%/trunk/project_fubar

I do not need to check out http://%REPO_URL%/trunk if I only work on project_foo. And I certainly do not want to check out http://%REPO_URL%, which would give me the entire repository, including every branch and tag fully checked out. (And I have seen people do this.)

The Subversion client does not check out the entire repository history, only one revision of the project. If you check out just what you need, the repository can be hundreds of terabytes in size while your working copy probably stays under a gigabyte.
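
Checking out a single project rather than the whole repository might look like the sketch below. The server address http://svn.example.com/repo and the helper name checkout_project are assumptions for illustration; substitute your own repository URL.

```shell
#!/bin/sh
# Hypothetical server address; substitute your own repository URL.
REPO_URL="${REPO_URL:-http://svn.example.com/repo}"

# checkout_project (hypothetical helper): check out a single project from
# trunk into a directory of the same name, instead of the whole repository.
checkout_project() {
    svn checkout "$REPO_URL/trunk/$1" "$1"
}

# Example: fetch only project_foo.
# checkout_project project_foo
```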

One of the problems I've seen is people checking in binary code, either third-party libraries or compiled code. This code should not be part of your repository. If you use Java, use Maven, Gradle, or Ant with Ivy to manage these third-party libraries and your own built artifacts that your project uses. If you are using .NET, use NuGet to do the same.

Subversion stores files as diffs. If one revision differs from the previous one by a single line, only that one-line change is stored. But although the source change may be a single line, it can change the built binary substantially, so each new binary takes up far more space. It is not unusual for binaries to occupy more than 90% of a Subversion repository's space. That is, a repository that would be about 500 megabytes can grow to 50 gigabytes because of binary files.

Worse, binaries quickly become obsolete, and Subversion has no easy way to remove an obsolete revision. In addition, none of Subversion's tools help you analyze binaries. A diff between two binary versions is meaningless. The recorded author tells you only who built and checked in that version, not necessarily the person to contact with questions (which is a polite way of saying blame).

Hope this answers your question. Check out only what you need, and your .svn directory will be much smaller. Do not store binaries in Subversion, and your .svn directory will not have to hold them. If this does not help, look into sparse checkouts, which let you avoid tracking files that you do not need.
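
A sparse checkout can be sketched as follows: check out only the top level of a URL, then deepen just the directories you work on. The helper name sparse_checkout, the URL, and the directory names are assumptions for illustration; the --depth and --set-depth options are standard Subversion client options.

```shell
#!/bin/sh
# sparse_checkout (hypothetical helper): check out only the top level of a
# URL, then fully populate just the named subdirectories.
sparse_checkout() {
    url=$1
    wc=$2
    shift 2
    # --depth immediates fetches top-level files and empty child directories.
    svn checkout --depth immediates "$url" "$wc" || return 1
    for dir in "$@"; do
        # --set-depth infinity pulls in everything below this one directory.
        svn update --set-depth infinity "$wc/$dir" || return 1
    done
}

# Example (URL and names are assumptions):
# sparse_checkout http://svn.example.com/repo/trunk trunk-wc project_foo
```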

+1

One option is to use the svnadmin dump command (as shown in your link), but give it a starting revision at the point where you are willing to cut off the history. The starting revision is then dumped as if it were adding a brand-new tree (i.e., all files in full as of that revision). This gives you a record of the last X months of changes. You can use the --deltas option to reduce the dump file size. See http://svnbook.red-bean.com/en/1.7/svn.ref.svnadmin.c.dump.html .

Then you can create a new repository and load this dump file into it with the svnadmin load command, giving you a new repository that contains only the recent history you want.
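
The dump-and-load steps could be scripted roughly like this. It is a sketch, not a tested migration procedure: the helper name trim_history, the paths, the dump file name, and the revision number are all assumptions, and you should try it on a copy of the repository first.

```shell
#!/bin/sh
# trim_history (hypothetical helper): copy only the revisions from START_REV
# to HEAD of OLD_REPO into a freshly created NEW_REPO.
trim_history() {
    old_repo=$1
    new_repo=$2
    start_rev=$3
    dump_file=${4:-recent-history.dump}

    # Dump the chosen range; the first revision is written as a full tree.
    svnadmin dump "$old_repo" -r "$start_rev:HEAD" --deltas > "$dump_file" || return 1
    # Create the empty target repository and replay the dump into it.
    svnadmin create "$new_repo" || return 1
    svnadmin load "$new_repo" < "$dump_file"
}

# Example (paths and revision number are assumptions):
# trim_history /path/to/old/repo /path/to/new/repo 12000
```

Revisions loaded into an empty repository are numbered from 1 again, so revision numbers in the new repository will generally not match the old ones.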

Personally, I do not recommend this, because you never know when old data can come in handy, but I don’t know your specific situation, and this is one way to achieve what I think you are asking for.

+1

It seems like you are confusing your working copy with the repository, so it is not clear what exactly you are asking.

If you are using a Subversion 1.7 or newer working copy, it should contain only one .svn directory, in its root. .svn is an administrative directory and must not be modified by hand. It does not actually contain the complete change history, as you seem to expect. Quoting SVNBook:

"The files in the administrative directory help Subversion recognize which of your versioned files contain unpublished changes, and which files are out of date with respect to others' work."

I assume that an 8 GB .svn directory means you checked out the entire repository. Is that right? And do you really need a working copy of the entire repository? Usually you should check out only the part or branch of the project that you work on, and such a working copy will be much smaller. @David gives a great summary of this in his answer.

0

If you just want to start over, I would do it like this:

  • Export the tip without the .svn files:

     $ svn export file:///path/to/current/repository old-trunk 
  • Delete anything from this export that you do not want in the new repository. As others commented, you probably have a lot of big binaries in the repo that don't actually belong there.

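    You may find my pigs script useful for this hunt:

      #!/bin/sh
      # List everything in the current directory by size in KB, smallest first.
      # Any arguments are passed through to du as extra options.
      du -skL "$@" -- * | sort -n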
  • Create a new repo from this cleaned-up export:

     $ svnadmin create /path/to/new/clean/repository
     $ svn import old-trunk file:///path/to/new/clean/repository \
           -m "Tip of old repo trunk as of 2015.04.14, r12345"
  • Temporarily move your old checkouts aside, and then do fresh checkouts from the new clean repository. Keep the old checkouts until you are sure that you have everything you need. Even if you keep the old repository, it's good to have at least one known-working checkout as well.

0
