Re: Something else about Redo Logs disappearing - Mailing list pgsql-general

From Peter
Subject Re: Something else about Redo Logs disappearing
Date
Msg-id 20200611093828.GA68382@gate.oper.dinoex.org
Whole thread Raw
In response to Re: Something else about Redo Logs disappearing  (Magnus Hagander <magnus@hagander.net>)
List pgsql-general
On Wed, Jun 10, 2020 at 01:10:36PM +0200, Magnus Hagander wrote:

! > Just having a look at their webpage, something seems to have been updated
! > recently, they now state that they have a new postgres adapter:
! >
! > https://www.bareos.com/en/company_news/postgres-plugin-en1.html
! > Enjoy reading, and tell us what You think.
! >
! 
! This one unfortunately rings of somebody who hasn't kept up with how to
! back up postgres for the past 10-15 years.
! 
! They are using an API that has been deprecated for years - in what's
! announced as a brand new product. They are advocating local archiving,
! which basically guarantees data loss in the event of a disaster.

Aye, thank You, that's exactly the impression I got. This is probably
still the old thing I was talking about, just made into a new product.
 
! That's from a 3 minute look, but that's definitely enough to suggest this
! is not something I'd consider using.

The thing is, that backup software (as a whole, not this postgres
component) offers lots of features exactly as I like them. It is a great
concept and a great implementation, but with poor code quality and a poor
maintenance policy. Then again, one can get it for free, and I know
of no other with such features. So I went through the effort of fixing
it up, so that it now serves my needs well - and I use my own scripting
for the add-ons.
 
! > Well, Your own docs show how to do it with a one-liner. So please
! > don't blame me for improving that to 20 lines.
! >
! 
! Yes, those docs are unfortunately "known bad" and should definitely be
! improved on. It does very clearly state that the example is just an
! example. But it doesn't clearly state *why* it shouldn't be used.

That's why I felt the ethical need to speak up and share my
consideration. Now it's up to those in charge and not my issue
anymore. ;)

! > In my understanding, backup is done via pg_dump. The archive logs are
! > for emergencies (data corruption, disaster), only. And emergencies
! > would usually be handled by some professional people who know what
! > they have to do.
! >
! 
! I'd say it's the exact opposite. backups are done via pg_basebackup or
! manual basebackups. Archive logs are for point in time recovery. pg_dump
! can be used as a secondary "backup to the backups" option, but it is most
! interesting for things that are not backups (such as inspecting data, or
! provisioning partial test systems).
! 
! Different for different scenarios of course, but that would be the base
! scenario. And pg_dump is definitely as far from good backups as you can
! get while still having something that can be called approximately backups.
! It might be enough for small databases, but even in those cases
! pg_basebackup (without archive logging) is easier...

It's easier to create - but to restore from? That depends on how many DBs
are in the cluster and how diverse their use is. Also, at any major
version upgrade these backups become worthless; one cannot use them for
the long term. (I suppose this is also true for pg_basebackup.)

I'm creating my longterm (and offsite) backups simply as clones of the
regular full backup. So what I came up with for now is: I run pg_dump
over all the present databases, plus the globals, chunk that up (similar
to HTTP chunked transfer encoding), feed it into a pipe and back up that
pipe. No need for interim storage, so it can get as large as the backup
software can take it. And that should work for the long term - I don't
currently see a better option.

(This one does not work as a 20-line shell script, because I didn't get
a reliable chunker running in shell.)
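
To make the idea concrete, such a chunker (length-prefixed pieces
terminated by a zero-length chunk, as in HTTP chunked transfer encoding)
can be sketched in a few lines of Python. The function names and the
chunk size are mine for illustration, not taken from the actual script:

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, pick to taste


def write_chunked(src, dst, chunk_size=CHUNK_SIZE):
    """Copy the stream src to dst as length-prefixed chunks, HTTP style:
    each chunk is '<hex length>\\r\\n<payload>\\r\\n', and a zero-length
    chunk marks the end of the stream."""
    while True:
        payload = src.read(chunk_size)
        dst.write(b"%x\r\n" % len(payload))
        dst.write(payload)
        dst.write(b"\r\n")
        if not payload:          # zero-length chunk terminates the stream
            return


def read_chunked(src):
    """Reassemble the original byte stream from its chunked form."""
    out = bytearray()
    while True:
        size = int(src.readline().strip(), 16)
        payload = src.read(size)
        src.read(2)              # consume the trailing CRLF
        if size == 0:
            return bytes(out)
        out.extend(payload)
```

The point of the length prefix is that the reader never has to guess
where a chunk ends, so the stream can be arbitrarily large and never
needs interim storage.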

! > And yes, I read that whole horrible discussion, and I could tear my
! > hair out, really, concerning the "deprecated API". I suppose You mean
! > the mentioning in the docs that the "exclusive low-level backup" is
! > somehow deprecated.
! >
! 
! Yes. There is no "somehow", it's deprecated.

Then let's not call it "somehow", as, more precisely, from my
understanding so far, that so-called "new API" is ill-conceived and
troublesome in more than one regard. With my current knowledge, I
would recommend avoiding it, or better, abandoning it.

Or, in other words: it is similar to what Boeing tried to do in
forcing things upon people via software, for safety reasons - and
now see where that got them.

! > But now, with the now recommended "non-exclusive low-level backup",
! > the task is different: now your before-hook needs to do two things
! > at the same time:
! >  1. keep a socket open in order to hold the connection to postgres
! >     (because postgres will terminate the backup when the socket is
! >     closed), and
! >  2. invoke exit(0) (because the actual backup will not start until
! >     the before-hook has properly delivered a successful exit code).
! > And, that is not only difficult, it is impossible.
!
! It is not impossible. It is harder if you limit your available tools yes,
! but it also *works*.

Within the constraints as I described them, I would think it is actually
impossible. Certainly there are other ways to achieve it. But I also
suppose that this much is true: with the "new API" it is necessary to
resort to (some kind of) threaded programming in order to use it.

And properly handling threaded programming is significantly more
error-prone than straight procedural code. I don't see why this
should be enforced in a case like this.
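
To make my point concrete, here is a sketch of the kind of coordination
the "new API" forces on a hook-based design: a separate thread has to
hold the session open while the hook itself reports success. Nothing
below talks to a real server - `FakeSession` is merely my stand-in for
the libpq connection that issued the backup-start call:

```python
import queue
import threading


class FakeSession:
    """Stand-in for the libpq connection holding a non-exclusive backup;
    closing it before the backup finishes would abort a real backup."""
    def __init__(self):
        self.open = True

    def close(self):
        self.open = False


def before_hook(ready, done):
    """Runs in its own thread: open the session, report success,
    then sit on the connection until the after-hook fires."""
    session = FakeSession()
    ready.put(session)   # the equivalent of the hook's exit(0)
    done.wait()          # keep the session open meanwhile
    session.close()


ready = queue.Queue()
done = threading.Event()
holder = threading.Thread(target=before_hook, args=(ready, done))
holder.start()

session = ready.get()    # "before-hook has returned success"
assert session.open      # connection stays alive during the backup
done.set()               # "after-hook": end the backup
holder.join()
```

The hook's two duties (return an exit code, keep a socket open) end up
in two threads of control, which is exactly the added complexity I am
objecting to.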

! It does not, no. It works in the simple cases, but it has multiple failure
! scenarios that *cannot* be fixed without changing those fundamentals.

Then please tell me at least something about these scenarios. Maybe
one could then think about some alternative approach that suits those
needs and is still enjoyable.

! But you can always go for the actual old way -- just stop postgres in the
! pre-job and start it again in the post-job. That's by far the easiest. And
! that *does* work and is fully supported.

What? You seem to like ill jokes. Even if I actually considered
that, it wouldn't work, because the backup software itself keeps some
database connections open during the backup. Not to mention all the
other apps that would need to be restarted.

No, this has to be done with proper engineering, and with some beauty.

After reading that deprecation notice in the docs, the first thing
I realized was that this does NOT work with my current before- and
after-hooks, and that making it work that way would require an ugly
amount of hackery and would probably end up unreliable.

Then I got the idea that I could run pg_basebackup directly and feed
it into a pipe in the same way as I do with the pg_dumps.
That should work, as a kind of last resort. But it is not sportsmanlike
- there is no fun in climbing the same mountain twice.

So currently I'm thinking about another option, which would effect
a base backup in the form of a simulated power loss (and would thus
be independent of the choice of API).

! > ! pg_probackup doesn't do row-level incremental backups, unless I've
! > ! missed some pretty serious change in its development, but it does
! > ! provide page-level,
! >
! > Ah, well, anyway that seems to be something significantly smaller
! > than the usual 1 gig table file at once.
! >
! 
! pg_probackup does page level incremental *if* you install a postgres
! extension that some people have questioned the wisdom of (disclaimer: I
! have not looked at this particular extension, so I cannot comment on said
! wisdom). I think it also has some ability to do page level incremental by
! scanning WAL. But the bottom line is it's always page level, it's never
! going to be row level, based on the fundamentals of how PostgreSQL works.

And a page is what I think it is - usually 8 kB? That would have an
effect of comparable magnitude, and would be nice, *if* it works properly.
So thanks, I got the message and will search for the old discussion
threads before looking closer into it.
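
For my own understanding: page-level incremental would mean shipping
only the changed 8 kB pages instead of whole 1 GB segment files. Reduced
to a toy hash comparison (my illustration of the idea, not
pg_probackup's actual mechanism), it looks like this:

```python
import hashlib

PAGE_SIZE = 8192  # PostgreSQL's default block size (BLCKSZ)


def page_hashes(data):
    """Hash each 8 kB page of a relation file."""
    return [hashlib.sha256(data[i:i + PAGE_SIZE]).digest()
            for i in range(0, len(data), PAGE_SIZE)]


def changed_pages(old, new):
    """Page numbers that differ between two versions of the file;
    pages appended since the old version count as changed."""
    old_h, new_h = page_hashes(old), page_hashes(new)
    return [n for n, h in enumerate(new_h)
            if n >= len(old_h) or h != old_h[n]]
```

An incremental backup would then store just those pages (plus their
numbers), which is where the size win over whole-file copies comes from.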

cheerio,
PMc


