FOSSology Backup and Restore Documentation – Gold-Files-Only Backup Solution

Introduction

This document describes how to back up and restore a FOSSology system using the gold-files-only backup solution, in which only the database and the gold files repository are backed up. Its intended audience is system administrators who want to back up and restore a FOSSology system quickly.

This document is arranged as follows:

Section 1 – Backup Instructions

Following the backup instructions backs up the database and the gold files repository to the directory BACKUPDIR on a remote backup server. Under BACKUPDIR there will be the following directory tree:

current  day1  day2  day3 ... 

Section 2 – Restore Instructions

Following the restore instructions restores the database and the gold files repository from the remote backup server. You can restore from the current, day1, day2, day3 ... subdirectory of BACKUPDIR.

Section 1 – Backup Instructions

  1. Before you start the FOSSology backup, confirm that the FOSSology, PostgreSQL, PHP and Apache configuration files are already included in your whole-system backup.
     * The FOSSology configuration files are Db.conf, Depth.conf, Hosts.conf, Proxy.conf, RepPath.conf and Scheduler.conf. They are located in /usr/local/etc/fossology if you installed from source code, or in /etc/fossology
    - Install rsync on both the FOSSology server and the Backup server
      * Note: rsync version 3 or later is recommended
    - rsync on the FOSSology server must be able to connect to the Backup server without being prompted for a password.
    - On the FOSSology server, as user “fossy”, run
      * ''$ ssh-keygen -t dsa''
      to generate a private and public key pair. The public key is stored in /srv/fossology/.ssh/id_dsa.pub
    - Add a system user and a system group, both named “fossy”, on the Backup server
    - Copy the contents of id_dsa.pub to the ~fossy/.ssh/authorized_keys file on the Backup server
    - Test that user “fossy” on the FOSSology server can now ssh to the Backup server without entering a password
    - Create a backup directory on the Backup server. Group “fossy” must have write permission to this directory:
    # mkdir $BACKUPDIR
    # chown root:fossy $BACKUPDIR
    # chmod g+w $BACKUPDIR
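
The passwordless login set up in the steps above can also be tested non-interactively. The following is a minimal sketch: the function name ''check_passwordless_login'' is hypothetical and not part of fo-backup. The ssh option BatchMode=yes makes ssh fail instead of prompting for a password.

```shell
# Hypothetical helper (not part of fo-backup): succeeds only if key-based login works.
# Usage: check_passwordless_login SSH_BINARY KEY_FILE USER@HOST
check_passwordless_login() {
    # BatchMode=yes disables password prompts, so a missing or rejected key
    # makes ssh exit with a non-zero status instead of waiting for input.
    "$1" -o BatchMode=yes -o ConnectTimeout=10 -i "$2" "$3" true
}

# Example call, run as user "fossy" (values taken from the user variables in this document):
# check_passwordless_login /usr/bin/ssh /srv/fossology/.ssh/id_dsa fossy@192.168.10.129 \
#     && echo "passwordless login OK"
```

If the call fails, check the contents and permissions of ~fossy/.ssh/authorized_keys on the Backup server.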

    - Run the shell script //fo-backup// as root to perform the backup steps:
    # sh fo-backup [options]
    * Before you run the script, edit the user variables for your environment:
      * BACKUPDIR=/home/fossy/backup # Directory on the backup server to store backup data, created in the previous step
      * KEY=/srv/fossology/.ssh/id_dsa # ssh private key of backup user 'fossy'
      * SOURCEDIR=/srv/fossology/ # Source directory to back up
      * PGBACKUPDIR=/srv/fossology/db # Directory to store database backup data
      * BACKUPUSER= # Backup user and hostname/ip
      * EXCLUDES=(./backup_include_a ./backup_include_b) # Exclude files that select which directories and files are backed up
    - You can use exclude files to control which repository directories are backed up. Edit an exclude file to include the files you want to back up and exclude the files you do not want to back up:
    + /repository/
    - /*
    + /repository/localhost/
    - /repository/*
    + /repository/localhost/gold/
    - /repository/localhost/*
    * In this solution, the exclude files include only the gold files repository
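
To see what these rules select, the following sketch applies them to a throw-away directory tree on the local machine. The temporary paths and file names are made up for the demonstration. Only the six filter rules come from this document.

```shell
# Demonstration on a scratch tree; all paths below are made up for the example.
WORK=$(mktemp -d)
mkdir -p "$WORK/src/repository/localhost/gold/ab" \
         "$WORK/src/repository/localhost/files/cd" \
         "$WORK/src/other"
echo gold > "$WORK/src/repository/localhost/gold/ab/file1"
echo work > "$WORK/src/repository/localhost/files/cd/file2"
echo misc > "$WORK/src/other/file3"

# The exclude file from this document: each "+" line opens one directory level,
# and the "-" line after it drops every other entry at that level.
cat > "$WORK/backup_include_a" <<'EOF'
+ /repository/
- /*
+ /repository/localhost/
- /repository/*
+ /repository/localhost/gold/
- /repository/localhost/*
EOF

COPIED=""
if command -v rsync >/dev/null 2>&1; then
    rsync -a --delete --exclude-from="$WORK/backup_include_a" "$WORK/src/" "$WORK/dst/"
    # List the files that were transferred
    COPIED=$(cd "$WORK/dst" && find . -type f | sort)
    echo "$COPIED"
else
    echo "rsync is not installed; nothing copied"
fi
```

Only the file under repository/localhost/gold should be listed. Any top-level entry other than repository is skipped by the "- /*" rule.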
    - You can also use exclude files to control how many rsync threads the backup uses:
      * With 1 exclude file in the directory, 1 rsync thread runs
      * With 2 exclude files in the directory, 2 rsync threads run
    * The steps of the shell script are explained below:
    * 5a. User variables
    # user variables
    BACKUPDIR=/home/fossy/backup               # Directory on the backup server to store backup data
    KEY=/srv/fossology/.ssh/id_dsa             # ssh private key of backup user 'fossy'
    SOURCEDIR=/srv/fossology/                  # Source directory to back up
    PGBACKUPDIR=/srv/fossology/db              # Directory to store database backup data
    BACKUPUSER=fossy@192.168.10.129            # Backup user and hostname/ip
    EXCLUDES=(./backup_include_a ./backup_include_b)     # Exclude files that select which directories and files are backed up
    # path variables
    CP=/bin/cp;
    MK=/bin/mkdir;
    SSH=/usr/bin/ssh;
    DATE=/bin/date;
    RM=/bin/rm;
    GREP=/bin/grep;
    RSYNC=/usr/bin/rsync;
    TOUCH=/bin/touch;
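
The relationship between exclude files and rsync threads can be sketched as a loop that starts one background job per exclude file and then waits for all of them. This is an illustration, not the fo-backup source: ''run_rsync_for'' is a hypothetical stand-in that only prints what it would do, and the list is held in a plain string (''EXCLUDE_LIST'', also hypothetical) so that the sketch runs in any POSIX shell.

```shell
# Hypothetical stand-in for the real rsync call; it only reports its argument.
run_rsync_for() {
    echo "would run rsync with --exclude-from=$1"
}

# One entry per exclude file, mirroring EXCLUDES in the user variables.
# Word splitting is used on purpose, so the paths must not contain spaces.
EXCLUDE_LIST="./backup_include_a ./backup_include_b"

PIDS=""
STARTED=0
for EXCLUDE in $EXCLUDE_LIST; do
    run_rsync_for "$EXCLUDE" &        # one background job per exclude file
    PIDS="$PIDS $!"
    STARTED=$((STARTED + 1))
done

# Wait for every job and count the ones that failed.
FAILED=0
for PID in $PIDS; do
    wait "$PID" || FAILED=$((FAILED + 1))
done
echo "started=$STARTED failed=$FAILED"
```

Waiting on each process ID separately makes it possible to notice a failed rsync before the later steps run.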
    
        * 5b. Before stopping the scheduler, run a first rsync of the repository while FOSSology is still running. When this first rsync has finished, continue with the other steps.
        su fossy -c "$RSYNC -a --delete --exclude-from=\"$EXCLUDE\" -e \"$SSH -i $KEY\" $SOURCEDIR $BACKUPUSER:/$BACKUPDIR/current" &
        * 5c. Stop the scheduler
        # /etc/init.d/fossology stop
        * 5d. Back up the database on the local FOSSology system into the $PGBACKUPDIR (/srv/fossology/db) directory. The backup file is called ''fo_dbbackup.gz''
        - Create the directory (# mkdir $PGBACKUPDIR) and make sure you have write permission to it
        - Run pg_dumpall as user “postgres”:
        su postgres -c 'pg_dumpall' | gzip > $PGBACKUPDIR/fo_dbbackup.gz
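
Because the dump is written through a pipe, a failing pg_dumpall can still leave a valid but empty gzip file behind. A minimal sketch of a sanity check follows. ''verify_dump'' is a hypothetical helper, not part of fo-backup, and the demonstration files are created in a scratch directory.

```shell
# Hypothetical helper: succeeds only if the file is an intact gzip archive
# that contains at least one non-empty line.
verify_dump() {
    gzip -t "$1" 2>/dev/null || return 1
    gzip -dc "$1" | grep -q .
}

# Demonstration on scratch files (not a real database dump).
DEMO=$(mktemp -d)
printf 'CREATE TABLE demo (id integer);\n' | gzip > "$DEMO/good.gz"
printf '' | gzip > "$DEMO/empty.gz"
printf 'not a gzip file' > "$DEMO/broken.gz"

for F in good.gz empty.gz broken.gz; do
    if verify_dump "$DEMO/$F"; then
        echo "$F: looks usable"
    else
        echo "$F: rejected"
    fi
done
```

In a real run the check would be applied to $PGBACKUPDIR/fo_dbbackup.gz before the file is copied to the Backup server.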
        
        * 5e. Back up the database file ''fo_dbbackup.gz'' and only the gold repository files to the Backup server
        * Make an incremental rsync of the files on your FOSSology server to the backup server. Everything is stored in the "current" folder. Afterwards a hard-link copy of it is made in the newly created "timestamp" folder. Edit the exclude file to include the files you want to back up and exclude the files you do not want to back up.
        NOW=`$DATE '+%Y-%m-%d_%H:%M'`
        MKDIR=$BACKUPDIR/$NOW/
        # Create the new timestamped backup directory
        $SSH -i $KEY $BACKUPUSER "$MK $MKDIR"
        # Run rsync ($EXCLUDE is one of the files listed in EXCLUDES)
        su fossy -c "$RSYNC -a --delete --exclude-from=\"$EXCLUDE\" -e \"$SSH -i $KEY\" $SOURCEDIR $BACKUPUSER:/$BACKUPDIR/current" &
        # Wait for the background rsync to finish before taking the snapshot
        wait
        # Update the mtime to reflect the snapshot time
        su fossy -c "$SSH -i $KEY $BACKUPUSER \"$TOUCH $BACKUPDIR/current\""
        # Make hardlink copy
        su fossy -c "$SSH -i $KEY $BACKUPUSER \"$CP -al $BACKUPDIR/current/* $MKDIR\""
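
The hard-link step is what keeps each snapshot cheap: ''cp -al'' recreates the directory tree but links the files instead of copying their data. The sketch below reproduces that step locally in a scratch directory. The file names are made up, and ''cp -l'' assumes GNU cp as found on Linux.

```shell
# Local demonstration of the snapshot step; paths are made up for the example.
DEMO=$(mktemp -d)
mkdir -p "$DEMO/current/repository/localhost/gold"
echo "gold data" > "$DEMO/current/repository/localhost/gold/file1"

# Timestamp in the form YYYY-MM-DD_HH:MM, used as the snapshot folder name
NOW=$(date '+%Y-%m-%d_%H:%M')
mkdir "$DEMO/$NOW"

# Mark the snapshot time, then hard-link the contents of "current" into the new folder.
touch "$DEMO/current"
cp -al "$DEMO/current/"* "$DEMO/$NOW/"

# -ef is true when both names refer to the same file (same device and inode).
SRC="$DEMO/current/repository/localhost/gold/file1"
DST="$DEMO/$NOW/repository/localhost/gold/file1"
if [ "$SRC" -ef "$DST" ]; then
    echo "snapshot shares its data with current"
fi
```

By default rsync writes a changed file to a new file and renames it into place, so a later run replaces changed files in "current" without altering the copies held by older snapshots.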
        * 5f. Restart the FOSSology scheduler after the backup
        - Add the shell script to /etc/crontab:
        0 2    * * *   root    fo-backup
        * This runs the script every day at 2:00 am
        - Restart the cron service:
        # /etc/init.d/crond restart

Section 2 – Restore Instructions

  1. Reinstall the FOSSology dependencies on the recovery system
  2. Reinstall the FOSSology system (including scheduler, database, agents and web UI)
  3. Recover the FOSSology configuration files from your system backup
  4. Run the shell script //fo-restore// as root to perform the restore steps. To restore a gold-files-only backup, use the --onlygold option:
    # sh fo-restore --onlygold
    * Before you run the script, edit the user variables for your environment:
      * BACKUPDIR=/home/fossy/backup # Directory on the backup server to store backup data
      * KEY=/srv/fossology/.ssh/id_dsa # ssh private key of backup user 'fossy'
      * SOURCEDIR=/srv/fossology/ # Source directory
      * PGBACKUPDIR=/srv/fossology/db # Directory to store database backup data
      * BACKUPUSER= # Backup user and hostname/ip
      * EXCLUDES=(./backup_include_a ./backup_include_b) # Exclude files that select which directories and files are restored
      * TEMPDIR=/tmp # Temporary directory to store the database file
      * RESTOREDIR=current # Subdirectory under $BACKUPDIR to restore from
    * These variables should be the same as in the fo-backup script
    * The steps of the shell script are explained below:
    - Make sure the FOSSology scheduler is not running
    - Restore the gold files repository data and the database backup file fo_dbbackup.gz from the RESTOREDIR subdirectory of BACKUPDIR:
     su fossy -c "$RSYNC -a --delete --exclude-from=\"$EXCLUDEDIR$EXCLUDE\" -e \"$SSH -i $KEY\" $BACKUPUSER:/$BACKUPDIR/$RESTOREDIR/ $SOURCEDIR" &
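
Since RESTOREDIR is typed by hand and the rsync call uses --delete, a small check of the name before restoring can catch typos. The sketch below assumes that snapshot folders carry the timestamp names created by fo-backup (YYYY-MM-DD_HH:MM). ''is_valid_restoredir'' is a hypothetical helper, not part of fo-restore.

```shell
# Hypothetical helper: accept "current" or a name in fo-backup's timestamp format.
is_valid_restoredir() {
    case $1 in
        current)
            return 0 ;;
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_[0-9][0-9]:[0-9][0-9])
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Try a few sample names (made up for the example).
for NAME in current 2012-01-31_02:00 2012-1-31 ../current; do
    if is_valid_restoredir "$NAME"; then
        echo "$NAME: accepted"
    else
        echo "$NAME: rejected"
    fi
done
```

The check only validates the form of the name. Whether the folder actually exists on the Backup server still has to be confirmed there.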

    - Restore the database
    - Create the directory (# mkdir $PGBACKUPDIR) and make sure you have write permission to it
    - Restart PostgreSQL and remove the old database:
    /etc/init.d/postgresql-8.3 restart
    su postgres -c "dropdb fossology"

    - Restore the database as user “postgres”:
       gunzip -c $PGBACKUPDIR/fo_dbbackup.gz > $TEMPDIR/fo_dbbackup
       su postgres -c "psql -f $TEMPDIR/fo_dbbackup" 
       rm -rf $TEMPDIR/fo_dbbackup
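
The temporary file can be avoided by streaming the decompressed dump straight into the restore command, since psql reads SQL from standard input when no -f option is given. A sketch under that assumption follows. ''restore_dump'' is a hypothetical helper, and the demonstration pipes into ''cat'' instead of psql so that it runs without a database.

```shell
# Hypothetical helper: restore_dump DUMP_FILE COMMAND [ARGS...]
# Checks the archive first, then streams its contents into COMMAND.
restore_dump() {
    dump=$1
    shift
    gzip -t "$dump" || return 1
    gunzip -c "$dump" | "$@"
}

# Demonstration with a scratch file; "cat" stands in for psql.
DEMO=$(mktemp -d)
printf 'SELECT 1;\n' | gzip > "$DEMO/fo_dbbackup.gz"
RESTORED=$(restore_dump "$DEMO/fo_dbbackup.gz" cat)
echo "$RESTORED"

# Possible real call (assumption, run as root):
# restore_dump "$PGBACKUPDIR/fo_dbbackup.gz" su postgres -c "psql"
```

Checking the archive with ''gzip -t'' first means a damaged backup file is rejected before anything is sent to the database.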

    - Check the database to find out whether any jobs (license analysis or other jobs that need repository files) were active at the backup point. If there were active jobs, a reunpack job is queued and added as a dependency of those jobs.
      * With the --onlygold option, the fo-restore script runs the fo-restore.php script
    - Start the scheduler