What is the right way to back up ZODB blobs?

I use plone.app.blob to store large ZODB objects in the blobstorage directory. This reduces pressure on Data.fs, but I could not find any recommendations for backing up this data.

I am already backing up Data.fs by pointing a network backup tool at the repozo backup directory. Should I just point that tool at the blobstorage directory as well to back up my blobs?

What if the database gets packed, or blobs are added or deleted, while the copy is running? Are there files in the blobstorage directory that have to be copied in a particular order?
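For reference, a rough sketch of what I currently run (the paths are illustrative placeholders, not my real layout):

    # current approach: repozo backs up the FileStorage into a directory
    # that the network backup tool then picks up
    bin/repozo -B -r /backup/filestorage -f var/filestorage/Data.fs
    # the open question is whether pointing the same tool at
    # var/blobstorage is enough for the blobs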

+6
python zope zodb plone
4 answers

Backing up the blobstorage directory will do it. No special ordering or anything else is needed; it is very simple.

All operations in Plone are fully transactional, so taking a backup in the middle of a transaction should be fine. That is why live backups of the ZODB are possible at all. Without knowing which file system you are on, I would expect it to work as intended.
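For example, a plain copy of the blob directory would be enough (a sketch only; var/blobstorage is the common buildout default, adjust to your deployment):

    # straight copy of the live blob directory to the backup location
    rsync -a var/blobstorage/ /backup/blobstorage/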

+2

Doing a repozo backup of Data.fs followed by an rsync of the blobstorage directory should be fine, as long as the database does not get packed while those two operations are happening.

This is because, at least when using blobs with FileStorage, any modification to a blob always results in the creation of a new file, named after the object id and transaction id. So if new or updated blobs are written after Data.fs has been backed up, that is not a problem, since the files referenced by Data.fs should all still be around. Deleting a blob does not remove the file until the database is packed, so that should be fine too.

Doing the backup in the other order, or packing during the backup, can result in a backed-up Data.fs that references blobs that are not included in the backup.
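Put concretely, a minimal sketch of that order (all paths here are assumptions; adjust to your deployment):

    # 1. repozo backup of the FileStorage first
    bin/repozo -B -r /backup/filestorage -f var/filestorage/Data.fs
    # 2. only then rsync the blob directory; blobs written since step 1
    #    are harmless extras, and deleted blobs stay on disk until a pack
    rsync -a var/blobstorage/ /backup/blobstorage/
    # make sure no zeopack runs between or during these two steps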

+12

I have a script that keeps a month of blob copies using hard links (so you have the blobs' history, just as repozo gives you the history of Data.fs):

backup.sh

#!/bin/sh
# to make a full backup: ./cron_nocturn.sh full

ZEO_FOLDER=/var/plone/ZEO
# ZEO port
ZEO_PORT=8023
# Name of the DB
ZEO_DB=zodb1
BACKUP_FOLDER=/backup/plone
LOGBACKUP=/var/plone/ZEO/backup.log
BACKUPDIR=`date +%d`

echo "BACKUP START" >> $LOGBACKUP
echo `date` >> $LOGBACKUP

# pack the database first when doing a full backup
if [ "$1" = "full" ]; then
    $ZEO_FOLDER/bin/zeopack -S $ZEO_DB -p $ZEO_PORT -h 127.0.0.1
fi

echo "  Checking folders"
# check that the backup folder exists
if ! [ -x $BACKUP_FOLDER/$ZEO_DB ]; then
    mkdir $BACKUP_FOLDER/$ZEO_DB
fi
# check that today's blob backup folder exists
if ! [ -x $BACKUP_FOLDER/blobs/$BACKUPDIR/ ]; then
    mkdir $BACKUP_FOLDER/blobs/$BACKUPDIR/
fi

echo "  Backing up Data.fs"
if [ "$1" = "full" ]; then
    echo "    Copying Data.fs"
    $ZEO_FOLDER/bin/repozo -B -F -r $BACKUP_FOLDER/$ZEO_DB/ -f $ZEO_FOLDER/var/filestorage/Data_$ZEO_DB.fs
    echo "    Purging old backups"
    $ZEO_FOLDER/neteja.py -l $BACKUP_FOLDER/$ZEO_DB -k 2
else
    $ZEO_FOLDER/bin/repozo -B -r $BACKUP_FOLDER/$ZEO_DB/ -f $ZEO_FOLDER/var/filestorage/Data_$ZEO_DB.fs
fi

echo "  Copying blobs"
# rotate today's blob folder, then hard-link the current copy into it
rm -rf $BACKUP_FOLDER/blobs/$BACKUPDIR
cd $BACKUP_FOLDER/current-blobs && find . -print | cpio -dplm $BACKUP_FOLDER/blobs/$BACKUPDIR
# bring the current copy up to date with the live blob directory
rsync --force --ignore-errors --delete --update -a $ZEO_FOLDER/var/blobs/ $BACKUP_FOLDER/current-blobs/

echo "BACKUP END" >> $LOGBACKUP
echo `date` >> $LOGBACKUP
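The header comment shows the intended invocation: ./cron_nocturn.sh full packs the database and takes a full backup, while running it with no argument takes an incremental one. A hypothetical crontab entry for a nightly full run could look like:

    # run a full backup every night at 03:00 (illustrative schedule and path)
    0 3 * * * /var/plone/ZEO/cron_nocturn.sh full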

neteja.py

#!/usr/bin/python2.4
# neteja.py -l [target_directory] -k [number_of_full_backups_to_keep]
# Script that cleans a backup directory, keeping only the last N full
# backups you specify. It relies on collective.recipe.backup.
# Author: Victor Fernandez de Alba <sneridagh@gmail.com>
import sys, getopt

# make the buildout eggs importable
sys.path[0:0] = [
    '/var/plone/genwebupcZEO/produccio/eggs/collective.recipe.backup-1.3-py2.4.egg',
    '/var/plone/genwebupcZEO/produccio/eggs/zc.buildout-1.4.2-py2.4.egg',
    '/var/plone/genwebupcZEO/produccio/eggs/zc.recipe.egg-1.2.2-py2.4.egg',
    '/var/plone/genwebupcZEO/produccio/eggs/setuptools-0.6c11-py2.4.egg',
    ]

import collective.recipe.backup.repozorunner

argv = sys.argv[1:]
try:
    opts, args = getopt.getopt(argv, "l:k:", ["location=", "keep="])
except getopt.GetoptError:
    print "neteja.py -l [target_directory] -k [number_of_full_backups_to_keep]"
    sys.exit(2)

for opt, arg in opts:
    if opt in ("-l", "--location"):
        location = arg
    elif opt in ("-k", "--keep"):
        keep = arg

if len(opts) < 2:
    print "neteja.py -l [target_directory] -k [number_of_full_backups_to_keep]"
    sys.exit(2)

collective.recipe.backup.repozorunner.cleanup(location, keep)
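backup.sh above calls this cleanup script as, for example:

    # keep only the two most recent full backups in the zodb1 repozo directory
    /var/plone/ZEO/neteja.py -l /backup/plone/zodb1 -k 2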
+2

Your backup strategy for the FileStorage is fine. However, backing up any database that stores data across multiple files is never easy, since the copy has to happen with no writes going to the various files. For FileStorage a blind, stupid copy is fine, as it is just one file. (Using repozo is even better.)

In this case (FileStorage combined with a BlobStorage), I would point to the usual database backup advice:

  • take the database offline while making a file-system copy
  • use snapshot tools such as LVM to freeze the disk at a point in time (see the sketch after this list)
  • export the transactions (which is practically impossible)
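Of the three, the snapshot route is usually the most practical. A minimal sketch, assuming the data lives on a logical volume named plone in volume group vg0 (both names are assumptions):

    # freeze a point-in-time view of the volume holding Data.fs and blobstorage
    lvcreate --size 1G --snapshot --name plone-snap /dev/vg0/plone
    mount /dev/vg0/plone-snap /mnt/snapshot
    # copy from the frozen view, then discard the snapshot
    rsync -a /mnt/snapshot/ /backup/plone/
    umount /mnt/snapshot
    lvremove -f /dev/vg0/plone-snap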
+1
