Thread: [PoC PATCH] Parallel dump to /dev/null

[PoC PATCH] Parallel dump to /dev/null

From: Michael Banck
Hi,

Dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption.  However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.

I had a look at this, and it appears it would suffice to override the
few spots in pg_backup_directory.c that assemble filenames in the
target directory so that they resolve to '/dev/null' (or 'NUL' on
Windows) instead.
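
Conceptually, the override amounts to something like this self-contained
sketch (illustrative only, with made-up names; the actual patch hooks
into the path-assembly spots in pg_backup_directory.c):

#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#define NULLDEV "NUL"
#else
#define NULLDEV "/dev/null"
#endif

/*
 * Build the path of a dump file inside the target directory.  If the
 * "directory" is really the null device, return the null device itself
 * instead of assembling "<dir>/<relative name>".
 */
static void
set_file_path(char *buf, size_t buflen, const char *dir, const char *relname)
{
    if (strcmp(dir, NULLDEV) == 0)
        snprintf(buf, buflen, "%s", NULLDEV);
    else
        snprintf(buf, buflen, "%s/%s", dir, relname);
}

int
main(void)
{
    char path[1024];

    set_file_path(path, sizeof(path), "/dev/null", "3001.dat");
    printf("%s\n", path);       /* prints /dev/null */

    set_file_path(path, sizeof(path), "dump.dir", "3001.dat");
    printf("%s\n", path);       /* prints dump.dir/3001.dat */

    return 0;
}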

The attached proof-of-concept patch does that, and it seems to work; I'm
getting some nice speedups on a 170 GB test database:

$ time pg_dump -Fc -Z0 -f /dev/null TESTDB
real    32m45.120s
[...]
$ time pg_dump -Fd -j8 -Z0 -f /dev/null TESTDB
real    9m28.357s


Thoughts?

Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Paquier
On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:
> Dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
> way to check for corruption.  However, dumping to /dev/null is currently
> not supported in directory mode, which makes it impossible to dump to
> /dev/null in parallel.

FWIW, I use this trick as well in some test deployments.  Now, those
deployments happen in places where I want the checks to warn me, so the
time it takes is not really an issue generally.  Do folks here think
that speedups would be worth it?  Say to make the operation shorter on
production systems for example.  A lengthy pg_dump bloats data for a
longer time, so I can buy that shorter times may help, though I think
that hearing from other folks is necessary as well.
--
Michael

Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck
Hi,

On Friday, 2018-02-02 at 13:22 +0900, Michael Paquier wrote:
> On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:
> > Dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
> > way to check for corruption.  However, dumping to /dev/null is currently
> > not supported in directory mode, which makes it impossible to dump to
> > /dev/null in parallel.
> 
> FWIW, I use this trick as well in some test deployments.  Now, those
> deployments happen in places where I want the checks to warn me, so the
> time it takes is not really an issue generally.  Do folks here think
> that speedups would be worth it?  Say to make the operation shorter on
> production systems for example.  A lengthy pg_dump bloats data for a
> longer time, so I can buy that shorter times may help, though I think
> that hearing from other folks is necessary as well.

Another use case would be automated restores of basebackups, where you
might want to dump to /dev/null afterwards to check for issues, as
merely starting up the restored basebackup wouldn't tell you whether the
data is intact.  If you have lots of instances and lots of basebackups
to check, doing that in parallel might be helpful.


Michael



Re: [PoC PATCH] Parallel dump to /dev/null

From: Andrey Borodin
Hi!

Michael, thanks for the patch!

> On 2 Feb 2018, at 9:22, Michael Paquier <michael.paquier@gmail.com> wrote:
>
> On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:
>> Dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
>> way to check for corruption.  However, dumping to /dev/null is currently
>> not supported in directory mode, which makes it impossible to dump to
>> /dev/null in parallel.
>
> Do folks here think that speedups would be worth it?
I've seen backup verification via COPY TO /dev/null in production.  These
copies were parallelized on the side of the verification script.
Not sure whether this counts for or against the feature.

Best regards, Andrey Borodin.

Re: [PoC PATCH] Parallel dump to /dev/null

From: Vladimir Borodin
Hi.

On 2 Feb 2018, at 13:25, Andrey Borodin <x4mmm@yandex-team.ru> wrote:

>> Do folks here think that speedups would be worth it?
> I've seen backup verification via COPY TO /dev/null in production. These copies were parallelized on the side of the verification script.
> Not sure whether this counts for or against the feature.

This sounds like a point in favor of the feature, because resorting to COPY there is not by choice. Having this in pg_dump would make our scripts for checking backups simpler.

--
May the force be with you…
https://simply.name

Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck
Hi,

On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:
> Dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
> way to check for corruption.  However, dumping to /dev/null is currently
> not supported in directory mode, which makes it impossible to dump to
> /dev/null in parallel.
> 
> I had a look at this, and it appears it would suffice to override the
> few spots in pg_backup_directory.c that assemble filenames in the
> target directory so that they resolve to '/dev/null' (or 'NUL' on
> Windows) instead.
> 
> The attached proof-of-concept patch does that, and it seems to work; I'm
> getting some nice speedups on a 170 GB test database:
> 
> $ time pg_dump -Fc -Z0 -f /dev/null TESTDB
> real    32m45.120s
> [...]
> $ time pg_dump -Fd -j8 -Z0 -f /dev/null TESTDB
> real    9m28.357s

I have added this patch to the next commitfest:

https://commitfest.postgresql.org/17/1576/

I also tried to add a TAP test, but as this patch produces no dump
output, I had to exclude it from all restore tests and then got an
off-by-one between the number of tests that were expected and those that
were run.  I've attached it in case somebody wants to take a look, but I
hope that with Stephen's refactoring of the pg_dump TAP tests this will
become easier.


Michael

Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: Andres Freund
Hi,

On 2018-02-28 14:28:41 +0100, Michael Banck wrote:
> I have added this patch to the next commitfest:
> 
> https://commitfest.postgresql.org/17/1576/

Given this is a new patch, submitted for the last commitfest, and not
completely trivial, I'd argue this is too late for v11.

Does anybody disagree?

- Andres


Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck
Hi,

On Thursday, 2018-03-01 at 01:28 -0800, Andres Freund wrote:
> Hi,
> 
> On 2018-02-28 14:28:41 +0100, Michael Banck wrote:
> > I have added this patch to the next commitfest:
> > 
> > https://commitfest.postgresql.org/17/1576/
> 
> Given this is a new patch, submitted for the last commitfest, and not
> completely trivial, I'd argue this is too late for v11.

I was under the impression that this rule was rather about complicated and
invasive patches, not about any patch which isn't completely trivial.

I agree that the patch is not completely trivial, but it is rather small
(its diffstat is "3 files changed, 38 insertions(+), 9 deletions(-)"),
and it certainly is not highly invasive; it does not touch code all over
the place, but only a few spots in pg_backup_directory.c.


Michael



Re: [PoC PATCH] Parallel dump to /dev/null

From: Andres Freund
On 2018-03-01 14:21:24 +0100, Michael Banck wrote:
> Hi,
> 
> On Thursday, 2018-03-01 at 01:28 -0800, Andres Freund wrote:
> > Hi,
> > 
> > On 2018-02-28 14:28:41 +0100, Michael Banck wrote:
> > > I have added this patch to the next commitfest:
> > > 
> > > https://commitfest.postgresql.org/17/1576/
> > 
> > Given this is a new patch, submitted for the last commitfest, and not
> > completely trivial, I'd argue this is too late for v11.
> 
> I was under the impression that this rule was rather about complicated and
> invasive patches, not about any patch which isn't completely trivial.

I think you're right that this patch is a bit borderline for that.  But we
have ~220 pending CF entries, and fairly limited review and commit
capacity.  Something's gotta give.  A lot of those are patches that have
been waiting to be committed for a while, so patches first added to the
last CF the day before it starts are prime candidates for being pushed
out.

You can probably increase your chances by herding a few patches towards
being committable.

Greetings,

Andres Freund


Re: [PoC PATCH] Parallel dump to /dev/null

From: Alvaro Herrera
strdup -> pg_strdup()

I wonder if, instead of doing strcmp() all over the place, we should
give this behavior a separate boolean flag in lclContext.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck
Hi Alvaro,

On Thu, Mar 01, 2018 at 04:00:52PM -0300, Alvaro Herrera wrote:
> strdup -> pg_strdup()

Fixed.

> I wonder if, instead of doing strcmp() all over the place, we should
> give this behavior a separate boolean flag in lclContext.

I mused a bit about what to name that flag and came up with `discard',
but maybe somebody has a better idea?

In any case, new patch attached which does this.


Michael


Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: "Andreas 'ads' Scherbaum"


On Mon, Mar 5, 2018 at 1:49 PM, Michael Banck <michael.banck@credativ.de> wrote:
> In any case, new patch attached which does this.

The null device is already defined in port.h as DEVNULL; there is no need
to redefine it as NULL_DEVICE.  Alternatively, paths.h defines
_PATH_DEVNULL.
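
(From memory, the definition in port.h is roughly the following, so the
existing macro can be used directly:)

/* src/include/port.h (roughly) */
#ifndef WIN32
#define DEVNULL "/dev/null"
#else
#define DEVNULL "nul"
#endif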


Regards,

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project

Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck
Hi Andreas,

On Mon, Mar 05, 2018 at 03:15:19PM +0100, Andreas 'ads' Scherbaum wrote:
> The null device is already defined in port.h, as DEVNULL. No need to
> redefine it as NULL_DEVICE.

Thanks for pointing that out, a new patch removing NULL_DEVICE is
attached.


Michael


Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: Alvaro Herrera
Please make ctx->discard a bool, not an int.

You can initialize it like this:
    ctx->discard = strcmp(ctx->directory, DEVNULL) == 0;

Why do you change the mode used to create the directory from 0700 to 0711?

You have a cuddled "else" in the last hunk.  Please uncuddle.
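
For the first two points, roughly what that would look like as a
standalone sketch (illustrative only; the real lclContext has more
members, and DEVNULL comes from port.h):

#include <stdbool.h>
#include <string.h>

#ifndef DEVNULL                 /* provided by port.h in the real code */
#define DEVNULL "/dev/null"
#endif

typedef struct
{
    char   *directory;          /* target directory, possibly the null device */
    bool    discard;            /* true when dumping to the null device */
    /* ... other members elided ... */
} lclContext;

/* e.g. in InitArchiveFmt_Directory(), once ctx->directory is known: */
static void
set_discard_flag(lclContext *ctx)
{
    ctx->discard = (strcmp(ctx->directory, DEVNULL) == 0);
}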



-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck
Hi Alvaro,

On Monday, 2018-03-05 at 12:48 -0300, Alvaro Herrera wrote:
> Please make ctx->discard a bool, not an int.

OK, done now.  I forgot to mention it in the earlier resubmission, but I
had looked at the other pg_backup_*.c files, and isSpecialScript and
hasSeek are ints as well, which is why I had opted for an int.

> You can initialize it like this:
>     ctx->discard = strcmp(ctx->directory, DEVNULL) == 0;

Thanks, changed it thusly.

> Why do you change the mode used to create the directory from 0700 to 0711?

Oh wow, that was either a typo in vim or a very old test hack; thanks
for spotting it.

> You have a cuddled "else" in the last hunk.  Please uncuddle.

Done so now, hopefully.

Thanks again for the review, a new version is attached.


Michael

Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: Alvaro Herrera
I made a few amendments (here's v5) and was ready to push, when I
noticed that _StartBlobs() does not seem to be doing the right thing.
Did you test this with a few large objects?

The reason I noticed is that I wondered about the number of open() calls
(plus zlib function calls) we would save by keeping an FD open to
/dev/null, rather than opening it over and over for each object -- i.e.,
maybe not touch setFilePath() at all, if possible.  That looks perhaps
too invasive, so maybe not.  But do audit other callsites that may open
files.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck
Hi Alvaro,

On Monday, 2018-03-05 at 13:56 -0300, Alvaro Herrera wrote:
> I made a few amendments (here's v5) and was ready to push, when I
> noticed that _StartBlobs() does not seem to be doing the right thing.
> Did you test this with a few large objects?

I did test it on the regression database, which leaves a few large
objects behind:


mba@fock:~$ psql -h /tmp -p 65432 -d regression -c '\dl'
                Large objects
  ID   | Owner |         Description          
-------+-------+------------------------------
  3001 | mba   | testing comments
 36867 | mba   | I Wandered Lonely as a Cloud
 47229 | mba   | 
(3 rows)

What exactly is wrong with _StartBlobs()? It calls setFilePath() which
makes sure /dev/null is opened and not something else.

> The reason I noticed is that I wondered about the number of open() calls
> (plus zlib function calls) we would save by keeping an FD open to
> /dev/null, rather than opening it over and over for each object -- i.e.,
> maybe not touch setFilePath() at all, if possible.  That looks perhaps
> too invasive, so maybe not.  But do audit other callsites that may open
> files.

I see your point; I've hacked together the attached additional PoC
patch, which keeps the dataFH open.  The parallel case was tricky: I had
to add an additional flag to lclContext so that DEVNULL is only opened
once for data files, because I could not find a place to set it for
parallel workers and it crashed otherwise.

This cuts down the number of /dev/null opens from 352 to 6 for a -j 2
dump of the regression database to /dev/null.
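
The core of the idea is just to reuse one handle for the null device.
Stripped down to a standalone sketch (plain stdio instead of pg_dump's
compressed-file wrappers, and a made-up context struct):

#include <stdbool.h>
#include <stdio.h>

#ifndef DEVNULL                 /* provided by port.h in the real code */
#define DEVNULL "/dev/null"
#endif

/* made-up, trimmed-down context; the real patch extends lclContext */
typedef struct
{
    bool    discard;            /* dumping to the null device? */
    FILE   *nullFH;             /* shared handle, opened at most once */
} DiscardContext;

/*
 * Open an output "file".  When discarding, open the null device on first
 * use and hand back the same stream every time; otherwise open normally.
 */
static FILE *
open_output(DiscardContext *ctx, const char *path)
{
    if (ctx->discard)
    {
        if (ctx->nullFH == NULL)
            ctx->nullFH = fopen(DEVNULL, "wb");
        return ctx->nullFH;
    }
    return fopen(path, "wb");
}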

In my opinion, this is too involved and possibly error-prone for what is
just an optimization, so I'd drop that part for v11 for now and maybe
revisit it later.


Michael

Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: Christoph Berg
Re: Alvaro Herrera 2018-03-05 <20180305165609.kq5y7uzy64o45vsg@alvherre.pgsql>
> The reason I noticed is that I wondered about the number of open() calls
> (plus zlib function calls) we would save by keeping an FD open to
> /dev/null, rather than opening it over and over for each object -- i.e.,
> maybe not touch setFilePath() at all, if possible.  That looks perhaps
> too invasive, so maybe not.  But do audit other callsites that may open
> files.

In normal operation, the files are also opened individually, so
special-casing /dev/null seems overly specific to me (and I'd be very
surprised if it made any difference).

For the feature to be useful in practice, it needs documentation.
People won't expect -Fd -f /dev/null to work, because /dev/null is not
a directory.

Otherwise, +1 from me.
Christoph


Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Paquier
On Tue, Mar 20, 2018 at 09:54:07AM +0100, Christoph Berg wrote:
> Otherwise, +1 from me.

I have been thinking about this patch lately, and on the contrary I am
voting -1 for the concepts behind this patch.  pg_dump is by nature a
tool aimed at fetching data from the database, at putting it in a shape
wanted by the user, and optionally at saving the data and at making it
durable (since recently for the last part).
It is not a corruption detection tool.  Even if it were a tool to detect
corruption, it would be doing it wrong in two ways:
1) It scans tables using only sequential scans, so it basically never
checks any other AMs than heap.
2) It detects only one failure at a time and stops.  Hence, in order to
detect all corruption, one needs to run pg_dump, repair or zero the
pages, and then rinse and repeat until a successful run is achieved.
This is a costly process, particularly on large relations, where a run
of pg_dump can take minutes, and the more pages there are, the more time
it takes to do the whole cleanup before being able to save as much data
as possible.

Now, why are people using pg_dump > /dev/null?  Mainly the lack of
better tools, which would actually be able to detect all corrupted
pages in one run, and not only heap pages.  I honestly think that
amcheck is something that we should focus more on and that has more
potential on the matter, and that we are just complicating pg_dump to do
something it is not designed for, and would do it badly anyway.
--
Michael

Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: Christoph Berg
Re: Michael Paquier 2018-03-20 <20180320135720.GE13368@paquier.xyz>
> Now, why are people using pg_dump > /dev/null?  Mainly the lack of
> better tools, which would actually be able to detect all corrupted
> pages in one run, and not only heap pages.  I honestly think that
> amcheck is something that we should focus more on and that has more
> potential on the matter, and that we are just complicating pg_dump to do
> something it is not designed for, and would do it badly anyway.

Still, people are using it for that purpose now, and speeding it up
would just be a non-intrusive patch.

Also, if pg_dump is a backup tool, why does that mean it must not be
used for anything else?

Kind regards,
Christoph Berg
-- 
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970  87C6 4C5A 6BAB 12D2 A7AE


Re: [PoC PATCH] Parallel dump to /dev/null

From: Tom Lane
Christoph Berg <christoph.berg@credativ.de> writes:
> Re: Michael Paquier 2018-03-20 <20180320135720.GE13368@paquier.xyz>
>> Now, why are people using pg_dump > /dev/null?  Mainly the lack of
>> better tools, which would actually be able to detect all corrupted
>> pages in one run, and not only heap pages.  I honestly think that
>> amcheck is something that we should focus more on and that has more
>> potential on the matter, and that we are just complicating pg_dump to do
>> something it is not designed for, and would do it badly anyway.

FWIW, I've been wondering the same thing about this patch.  I hadn't
bothered to actually read the thread up to now, but the title was
enough to make me think "do we really need that?"

> Still, people are using it for that purpose now, and speeding it up
> would just be a non-intrusive patch.

If it were non-intrusive, that would be OK, but after a quick look
at the latest patch I can't say I find it so.  It seems to add a bunch
of poorly-defined additional states to each pg_dump format module.

It might help if the patch were less enthusiastic about trying to
"optimize" by avoiding extra file opens/closes in the case of output
to /dev/null.  That seems to account for a lot of the additional
complication, and I can't see a reason that it'd be worth it.

            regards, tom lane


Re: [PoC PATCH] Parallel dump to /dev/null

From: Christoph Berg
Re: Tom Lane 2018-03-20 <12960.1521557852@sss.pgh.pa.us>
> It might help if the patch were less enthusiastic about trying to
> "optimize" by avoiding extra file opens/closes in the case of output
> to /dev/null.  That seems to account for a lot of the additional
> complication, and I can't see a reason that it'd be worth it.

Note that the last patch was just a PoC to check if the extra
open/close could be avoided. The "real" patch is the 2nd last.

Kind regards,
Christoph Berg


Re: [PoC PATCH] Parallel dump to /dev/null

From: Stephen Frost
Greetings,

* Christoph Berg (christoph.berg@credativ.de) wrote:
> Re: Tom Lane 2018-03-20 <12960.1521557852@sss.pgh.pa.us>
> > It might help if the patch were less enthusiastic about trying to
> > "optimize" by avoiding extra file opens/closes in the case of output
> > to /dev/null.  That seems to account for a lot of the additional
> > complication, and I can't see a reason that it'd be worth it.
>
> Note that the last patch was just a PoC to check if the extra
> open/close could be avoided. The "real" patch is the 2nd last.

Even so, I'm really not a fan of this patch either.  If we could do this
in a general way where we supported parallel mode with output to stdout
or to a file and then that file could happen to be /dev/null, I'd be
more interested because it's at least reasonable for someone to want
that beyond using pg_dump to (poorly) check for corruption.

As it is, this is an extremely special case which may even end up being
confusing for users (I can run a parallel pg_dump to /dev/null, but not
to a regular file?!).

Instead of trying to use pg_dump for this, we should provide a way to
actually check for corruption across everything (instead of just the
heap..), and have all detected corruption reported in one pass.

One of the things that I really like about PostgreSQL is that we
typically try to provide appropriate tools for the job and avoid just
hacking things to give users a half-solution, which is what this seems
like to me.  Let's write a proper tool (perhaps as a background
worker..?) to scan the database (all of it...) which will find and
report corruption anywhere.  That'll take another release to do, but
hopefully pushing back on this will encourage that to happen, whereas
allowing this in would actively discourage someone from writing a proper
tool and we would be much worse off for that.

Thanks!

Stephen

Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck
Hi,

On Tuesday, 2018-03-20 at 19:19 -0400, Stephen Frost wrote:
> * Christoph Berg (christoph.berg@credativ.de) wrote:
> > Re: Tom Lane 2018-03-20 <12960.1521557852@sss.pgh.pa.us>
> > > It might help if the patch were less enthusiastic about trying to
> > > "optimize" by avoiding extra file opens/closes in the case of output
> > > to /dev/null.  That seems to account for a lot of the additional
> > > complication, and I can't see a reason that it'd be worth it.
> > 
> > Note that the last patch was just a PoC to check if the extra
> > open/close could be avoided. The "real" patch is the 2nd last.
> 
> Even so, I'm really not a fan of this patch either.  If we could do this
> in a general way where we supported parallel mode with output to stdout
> or to a file and then that file could happen to be /dev/null, I'd be
> more interested because it's at least reasonable for someone to want
> that beyond using pg_dump to (poorly) check for corruption.

What you are saying is that you want parallel dump for the custom format,
not just the directory format.  Sure, I want that too, but that is not
what this patch is about; it does not change the custom format in any way.

But I agree that this would be a useful feature.

> As it is, this is an extremely special case which may even end up being
> confusing for users (I can run a parallel pg_dump to /dev/null, but not
> to a regular file?!).

This patch makes it possible to treat the null device as a directory
(which it normally isn't) in the directory format.  So it addresses the
"I can 'pg_dump -Fc -f /dev/null', but not 'pg_dump -Fd -f /dev/null'?!"
question.

That it allows for a parallel dump to /dev/null is merely a side-effect
of this, so you could just as well name the patch "Support dumping to
/dev/null in directory format as well".

I honestly didn't know, before I wrote that patch, whether 'pg_dump -Fd
-f /dev/null' might not just work, and the error message you get ('could
not create directory "/dev/null": File exists') is a bit meh (but
logical from pg_dump's point of view, of course).

> Instead of trying to use pg_dump for this, we should provide a way to
> actually check for corruption across everything (instead of just the
> heap..), and have all detected corruption reported in one pass.

Well, I agree that we should provide a more general tool as well.

> That'll take another release to do, but hopefully pushing back on this
> will encourage that to happen, whereas allowing this in would actively
> discourage someone from writing a proper tool and we would be much
> worse off for that.

To be honest, I don't buy that argument.  Having log shipping and warm
standbys did not stop us from eventually implementing proper streaming
replication.  I would of course welcome the above tool, but I probably
won't work on it for v12.

I think what will happen is that everybody interested will just continue
to dump to /dev/null sequentially, or use external hacks with RAM disks,
nullfs FUSE drivers, or scripts like
https://github.com/cybertec-postgresql/scripts/blob/master/quick_verify.py



Michael



Re: [PoC PATCH] Parallel dump to /dev/null

From: Peter Geoghegan
On Tue, Mar 20, 2018 at 6:57 AM, Michael Paquier <michael@paquier.xyz> wrote:
> Now, why are people using pg_dump > /dev/null?  Mainly the lack of
> better tools, which would actually be able to detect all corrupted
> pages in one run, and not only heap pages.  I honestly think that
> amcheck is something that we should focus more on and that has more
> potential on the matter, and that we are just complicating pg_dump to do
> something it is not designed for, and would do it badly anyway.

+1.

There are a wide variety of things that pg_dump will not check when
used as a smoke-test for heap corruption. Things that could be checked
without doing any more I/O will not be checked. For example, we won't
test for sane xmin/xmax fields across update chains, even though the
additional cost of doing that is pretty modest. Also, we won't make
any attempt to validate NOT NULL constraints. I've noticed that if you
corrupt an ItemId entry in a heap page, that will tend to result in
the row seeming to consist of all-NULLs to expression evaluation.
Typically, the pg_dump trick only detects abject corruption, such as a
totally corrupt page header.

amcheck isn't there yet, of course. The heapallindexed enhancement
that's in the current CF will help, though. And, it won't be very hard
to adapt heapallindexed verification to take place in parallel, now
that IndexBuildHeapScan() can handle parallel heap scans.

I do still think that we need an amcheck function that specifically
targets a heap relation (not an index), and performs verification
fairly quickly, so my heapallindexed patch isn't enough. That wouldn't
share much with the existing amcheck verification functions. I hope
that someone else can pick that up soon.

-- 
Peter Geoghegan


Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Paquier
On Wed, Mar 21, 2018 at 09:07:41AM +0100, Michael Banck wrote:
> On Tuesday, 2018-03-20 at 19:19 -0400, Stephen Frost wrote:
>> Instead of trying to use pg_dump for this, we should provide a way to
>> actually check for corruption across everything (instead of just the
>> heap..), and have all detected corruption reported in one pass.
>
> Well, I agree that we should provide a more general tool as well.

A background worker is one piece of the puzzle, and while automatic scans
and checks would be nice, I don't think that this is mandatory, as one
could just as well have a script running as a cron job.  My point is: you
are going to need a generic tool able to do the sanity checks and the
scanning; the way to invoke or run the tool is just more sugar on top of
the cake.

Note that there is still a patch pending for amcheck to add support for
heap checks, which is based on a bloom filter method.  It is impossible
to say whether this is going to make it upstream for v11.  For the time
being, Peter G has worked on a more advanced tool, which basically uses
the submitted patch, and maintains it here:
https://github.com/petergeoghegan/amcheck

The module has been renamed to amcheck_next so as not to conflict with
what upstream has had since v10.

Hence, is there any objection to marking this patch as returned with
feedback?  Corruption tools are definitely wanted, but not in this shape,
per the feedback gathered on this thread.
--
Michael

Attachment

Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Banck
Hi,

On Thursday, 2018-03-22 at 10:33 +0900, Michael Paquier wrote:
> Hence, is there any objection to marking this patch as returned with
> feedback?

I've done so now.


Cheers,

Michael



Re: [PoC PATCH] Parallel dump to /dev/null

From: Michael Paquier
On Thu, Mar 22, 2018 at 08:42:05AM +0100, Michael Banck wrote:
> On Thursday, 2018-03-22 at 10:33 +0900, Michael Paquier wrote:
>> Hence, is there any objection to marking this patch as returned with
>> feedback?
>
> I've done so now.

Thanks Michael.
--
Michael

Attachment