Re: Something else about Redo Logs disappearing - Mailing list pgsql-general
From: Peter
Subject: Re: Something else about Redo Logs disappearing
Msg-id: 20200611093828.GA68382@gate.oper.dinoex.org
In response to: Re: Something else about Redo Logs disappearing (Magnus Hagander <magnus@hagander.net>)
List: pgsql-general
On Wed, Jun 10, 2020 at 01:10:36PM +0200, Magnus Hagander wrote:

! > Just having a look at their webpage, something seems to have been updated
! > recently, they now state that they have a new postgres adapter:
! >
! > https://www.bareos.com/en/company_news/postgres-plugin-en1.html
! > Enjoy reading, and tell us what You think.
! >
!
! This one unfortunately rings of somebody who doesn't know how to back
! up postgres - at least not as it's been done in the past 10-15 years.
!
! They are using an API that has been deprecated for years - in what's
! announced as a brand new product. They are advocating local archiving,
! which basically guarantees data loss in the event of a disaster.

Aye, thank You, that's exactly the impression I got. This is probably
still the old thing I was talking about, just made into a new product.

! That's from a 3 minute look, but that's definitely enough to suggest this
! is not something I'd consider using.

The matter is: that backup software (as a whole, not this postgres
component) offers lots of things exactly as I like them. It is a great
concept and a great implementation, but with bad code quality and a bad
maintenance policy. But then, one can get it for free, and I know of no
other with such features. So I went through the effort of fixing it up
so that it now serves my needs well - and I use my own scripting for the
add-ons.

! > Well, Your own docs show how to do it with a one-liner. So please
! > don't blame me for improving that to 20 lines.
! >
!
! Yes, those docs are unfortunately "known bad" and should definitely be
! improved on. It does very clearly state that the example is just an
! example. But it doesn't clearly state *why* it shouldn't be used.

That's why I felt the ethical need to speak up and share my
considerations. Now it's up to those in charge, and not my issue
anymore. ;)

! > In my understanding, backup is done via pg_dump. The archive logs are
! > for emergencies (data corruption, disaster), only. And emergencies
! > would usually be handled by some professional people who know what
! > they have to do.
! >
!
! I'd say it's the exact opposite. Backups are done via pg_basebackup or
! manual basebackups. Archive logs are for point-in-time recovery. pg_dump
! can be used as a secondary "backup to the backups" option, but it is most
! interesting for things that are not backups (such as inspecting data, or
! provisioning partial test systems).
!
! Different for different scenarios of course, but that would be the base
! scenario. And pg_dump is definitely as far from good backups as you can
! get while still having something that can be called approximately backups.
! It might be enough for small databases, but even in those cases
! pg_basebackup (without archive logging) is easier...

It's easier to create - but to apply? That depends on how many DBs are
in the cluster and how diverse their use is. Also, at any major version
switch these backups become worthless; one cannot use them for the long
term. (I suppose this is also true for pg_basebackup.) I'm creating my
long-term (and offsite) backups simply as clones from the regular full
backup.

So what I came up with for now is this: I run pg_dump over all the
present databases, plus the globals, chunk that up (in a similar way to
how chunked HTTP works), feed it onto a pipe, and back up that pipe. No
need for interim storage, so it can get as large as the backup software
can take it. And that should work for the long term - I don't currently
see a better option.
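Roughly, the scheme looks like this - a sketch only, not my actual
script; "chunker" stands in for the length-prefixing helper, and the
pipe path is made up:

    #!/bin/sh
    # Dump the cluster-wide objects plus every database onto one stream.
    # Each part goes through "chunker", which length-prefixes its input
    # (HTTP-chunked style) so the parts can be split apart again on
    # restore without any ambiguity about where binary data ends.
    set -e
    OUT=/var/backup/pgdump.pipe     # the pipe the backup software reads

    {
        # roles, tablespaces etc. first
        printf '== globals ==\n'
        pg_dumpall --globals-only | chunker

        # then every non-template database, custom format for pg_restore
        for db in $(psql -At -c \
          "SELECT datname FROM pg_database WHERE NOT datistemplate"); do
            printf '== database %s ==\n' "$db"
            pg_dump -Fc "$db" | chunker
        done
    } > "$OUT"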
(This one does not work as a 20-line shell script, because I didn't get
a reliable chunker running in shell.)

! > And yes, I read that whole horrible discussion, and I could tear my
! > hair out, really, concerning the "deprecated API". I suppose You mean
! > the mentioning in the docs that the "exclusive low-level backup" is
! > somehow deprecated.
! >
!
! Yes. There is no "somehow", it's deprecated.

Then let's not call it "somehow": more precisely, from my understanding
so far, that so-called "new API" is ill-conceived and troublesome in
more than one regard. I would, with my current knowledge, recommend
avoiding it - or better, abandoning it. Or, in other words: it is
similar to what Boeing tried to do in forcing things upon people via
software, for safety reasons - and now see where Boeing got with that.

! > But now, with the now recommended "non-exclusive low-level backup",
! > the task is different: now your before-hook needs to do two things
! > at the same time:
! > 1. keep a socket open in order to hold the connection to postgres
! >    (because postgres will terminate the backup when the socket is
! >    closed), and
! > 2. invoke exit(0) (because the actual backup will not start until
! >    the before-hook has properly delivered a successful exit code).
! > And that is not only difficult, it is impossible.
!
! It is not impossible. It is harder if you limit your available tools,
! yes, but it also *works*.

As I described it, I would think it is actually impossible. Certainly
there are other ways to achieve it (one commonly suggested workaround is
sketched further below). But I also suppose that this is true: with the
"new API" it is necessary to resort to (some kind of) threaded
programming in order to use it. And properly handling threaded
programming is significantly more error-prone than straight procedural
code. I don't see why this should be enforced in a case like this.

! It does not, no. It works in the simple cases, but it has multiple failure
! scenarios that *cannot* be fixed without changing those fundamentals.

Then please tell me at least something about these scenarios. Then maybe
one could think about some alternative approach that might suit these
needs and still be enjoyable.

! But you can always go for the actual old way -- just stop postgres in the
! pre-job and start it again in the post-job. That's by far the easiest. And
! that *does* work and is fully supported.

What? You seem to like ill jokes. Even if I actually considered that, it
wouldn't work, because the backup software itself has some database
connections open during the backup. Not to talk about all the other apps
that would need to be restarted. No, this has to be done with proper
engineering, and with some beauty.

After reading that deprecation message in the docs, the first thing I
recognized was that this does NOT work in my current way with the
before- and after-hooks, and that it would require an ugly amount of
hackery and probably become unreliable when trying to make it work that
way.

Then I got the idea that I could run pg_basebackup directly and feed it
onto a pipe in the same way as I do with the pg_dumps (also sketched
below). That one should work, as a kind of last resort. But it is not
sportsmanlike - there is no fun in climbing the same mountain twice. So
currently I'm thinking about another option that would actualize a base
backup in the form of a power loss (which would then be transparent to
the use of an API).
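For completeness, here is the workaround I mean - a sketch only,
assuming PostgreSQL 9.6-14 function names and made-up file paths: the
pre-hook hands the open session over to a detached psql that reads its
commands from a fifo, so the session survives the hook's exit(0).

    #!/bin/sh
    # Pre-hook sketch for the non-exclusive low-level API.
    # One long-lived psql session must span the whole backup, because
    # pg_start_backup() and pg_stop_backup() have to run on the *same*
    # connection in non-exclusive mode.
    CTL=/var/run/pgbackup.ctl       # command fifo (made-up path)
    OUT=/var/run/pgbackup.out       # session output; pg_stop_backup(false)
                                    # returns the backup_label content
                                    # here - it must be stored with the
                                    # backup afterwards!
    FLAG=/var/run/pgbackup.started

    rm -f "$FLAG" "$OUT"
    [ -p "$CTL" ] || mkfifo "$CTL"

    # "0<>" opens the fifo read-write, so psql itself counts as a writer
    # and never sees EOF when the hooks close their ends of it.
    nohup sh -c 'exec psql -X -At postgres 0<> '"$CTL"' > '"$OUT"' 2>&1' \
        >/dev/null 2>&1 &

    {
        printf '%s\n' "SELECT pg_start_backup('bareos', true, false);"
        printf '%s\n' '\! touch '"$FLAG"   # runs only after start returned
    } > "$CTL"

    # do not deliver exit(0) before pg_start_backup() has finished
    while [ ! -e "$FLAG" ]; do sleep 1; done
    exit 0

The after-hook then writes "SELECT * FROM pg_stop_backup(false);" and
"\q" into the same fifo and collects the label file from the output.
Note that this uses a detached helper process rather than threads - but
it is exactly the kind of hackery I mean.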
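And the last-resort variant: pg_basebackup can write the whole base
backup as a single tar stream to stdout (tar format only, single
tablespace, and no -X stream in that mode), which can then go onto a
pipe just like the dumps - again with a made-up pipe path:

    # full base backup as one tar stream, no interim storage;
    # -X fetch puts the required WAL into the same tar
    pg_basebackup -D - -Ft -X fetch -l "longterm" > /var/backup/pgbase.pipe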
! > ! pg_probackup doesn't do row-level incremental backups, unless I've
! > ! missed some pretty serious change in its development, but it does
! > ! provide page-level,
! >
! > Ah, well, anyway that seems to be something significantly smaller
! > than the usual 1-gig table file at once.
! >
!
! pg_probackup does page-level incremental *if* you install a postgres
! extension that some people have questioned the wisdom of (disclaimer: I
! have not looked at this particular extension, so I cannot comment on said
! wisdom). I think it also has some ability to do page-level incremental by
! scanning WAL. But the bottom line is it's always page level, it's never
! going to be row level, based on the fundamentals of how PostgreSQL works.

And a page is what I think it is - usually 8 kB? That would have an
effect of comparable magnitude, and would be nice, *if* it works
properly. So thanks, I got the message and will search for the old
discussion messages before looking closer into it.
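(For the record, the page size is a compile-time constant, 8 kB by
default, and easy to verify on a given cluster:

    $ psql -Atc 'SHOW block_size;'
    8192
)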
cheerio,
PMc