Re: block-level incremental backup - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: block-level incremental backup
Date
Msg-id CAFiTN-v7YmMu2Xy3_5iuh09BKRVjyhfczJqPV0_C8emRmGk3Jw@mail.gmail.com
Whole thread Raw
In response to Re: block-level incremental backup  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: block-level incremental backup
List pgsql-hackers
On Tue, Sep 3, 2019 at 12:11 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Aug 16, 2019 at 3:54 PM Jeevan Chalke
> <jeevan.chalke@enterprisedb.com> wrote:
> >
> 0003:
> +/*
> + * When to send the whole file, % blocks modified (90%)
> + */
> +#define WHOLE_FILE_THRESHOLD 0.9
>
> How this threshold is selected.  Is it by some test?
>
>
> - magic number, currently 0 (4 bytes)
> I think in the patch we are using  (#define INCREMENTAL_BACKUP_MAGIC
> 0x494E4352) as a magic number, not 0
>
>
> + Assert(statbuf->st_size <= (RELSEG_SIZE * BLCKSZ));
> +
> + buf = (char *) malloc(statbuf->st_size);
> + if (buf == NULL)
> + ereport(ERROR,
> + (errcode(ERRCODE_OUT_OF_MEMORY),
> + errmsg("out of memory")));
> +
> + if ((cnt = fread(buf, 1, statbuf->st_size, fp)) > 0)
> + {
> + Bitmapset  *mod_blocks = NULL;
> + int nmodblocks = 0;
> +
> + if (cnt % BLCKSZ != 0)
> + {
>
> It will be good to add some comments for the if block and also for the
> assert. Actully, it's not very clear from the code.
>
> 0004:
> +#include <time.h>
> +#include <sys/stat.h>
> +#include <unistd.h>
> Header file include order (sys/state.h should be before time.h)
>
>
>
> + printf(_("%s combines full backup with incremental backup.\n\n"), progname);
> /backup/backups
>
>
> + * scan_file
> + *
> + * Checks whether given file is partial file or not.  If partial, then combines
> + * it into a full backup file, else copies as is to the output directory.
> + */
>
> /If partial, then combines/ If partial, then combine
>
>
>
> +static void
> +combine_partial_files(const char *fn, char **IncrDirs, int nIncrDir,
> +   const char *subdirpath, const char *outfn)
> + /*
> + * Open all files from all incremental backup directories and create a file
> + * map.
> + */
> + basefilefound = false;
> + for (i = (nIncrDir - 1), fmindex = 0; i >= 0; i--, fmindex++)
> + {
> + fm = &filemaps[fmindex];
> +
> .....
> + }
> +
> +
> + /* Process all opened files. */
> + lastblkno = 0;
> + modifiedblockfound = false;
> + for (i = 0; i < fmindex; i++)
> + {
> + char    *buf;
> + int hsize;
> + int k;
> + int blkstartoffset;
> ......
> + }
> +
> + for (i = 0; i <= lastblkno; i++)
> + {
> + char blkdata[BLCKSZ];
> + FILE    *infp;
> + int offset;
> ...
> + }
> }
>
> Can we breakdown this function in 2-3 functions.  At least creating a
> file map can directly go to a separate function.
>
> I have read 0003 and 0004 patch and there are few cosmetic comments.
>
 I have not yet completed the review for 0004, but I have few more
comments.  Tomorrow I will try to complete the review and some testing
as well.

1. It seems that the output full backup generated with
pg_combinebackup also contains the "INCREMENTAL BACKUP REFERENCE WAL
LOCATION".  It seems confusing
because now this is a full backup, not the incremental backup.

2.
+ FILE    *outfp;
+ FileOffset outblocks[RELSEG_SIZE];
+ int i;
+ FileMap    *filemaps;
+ int fmindex;
+ bool basefilefound;
+ bool modifiedblockfound;
+ uint32 lastblkno;
+ FileMap    *fm;
+ struct stat statbuf;
+ uint32 nblocks;
+
+ memset(outblocks, 0, sizeof(FileOffset) * RELSEG_SIZE);

I don't think you need to memset this explicitly as you can initialize
the array itself no?
FileOffset outblocks[RELSEG_SIZE] = {{0}}

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Unexpected "shared memory block is still in use"
Next
From: Peter Eisentraut
Date:
Subject: Re: REINDEX filtering in the backend