On 04/11/2018 11:15 AM, Don Seiler wrote:
> Let's say I have two DBs: main (9.6.6) and archive (10.2).
>
> I have a table in main where I want to archive data older than 60 days.
> For various reasons, the table is not partitioned, so for now we must use
> DELETE. The destination table in the archive DB is partitioned with the
> new Pg10 partitioning.
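>
> For concreteness, the destination is something like this (table and
> column names here are invented for illustration, not the real schema):
>
>     -- Parent table on the archive DB, range-partitioned by timestamp.
>     CREATE TABLE events (
>         id         bigint,
>         created_at timestamptz NOT NULL
>     ) PARTITION BY RANGE (created_at);
>
>     -- One monthly partition as an example.
>     CREATE TABLE events_2018_02 PARTITION OF events
>         FOR VALUES FROM ('2018-02-01') TO ('2018-03-01');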
>
> My initial plan was to have a stored procedure on the archive DB use
> postgres_fdw to do an INSERT INTO / SELECT, selecting the data remotely
> and inserting it into the local archive table. It would then issue a
> single DELETE command to remove the data from the main DB. However, I
> found that this resulted in the main DB executing thousands (perhaps
> millions, if it's one per row) of individual DELETE statements keyed on
> ctid. Aside from the WAL behavior concerns, this floods my PostgreSQL
> server logs, since I log all DML.
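>
> Roughly, the archive-side procedure runs something like this, where
> main_events is a postgres_fdw foreign table pointing at the table in
> main (names again invented):
>
>     -- Pull the expiring rows over the FDW into the local
>     -- partitioned table.
>     INSERT INTO events
>     SELECT * FROM main_events
>     WHERE created_at < now() - interval '60 days';
>
>     -- This is the single statement that showed up on main as
>     -- thousands of per-ctid DELETEs.
>     DELETE FROM main_events
>     WHERE created_at < now() - interval '60 days';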
>
> On top of that, I'm told that a remote DELETE wouldn't be transactional,
> so if I compared inserted rows against deleted rows and found a mismatch,
> I couldn't just roll back the DELETE. I plan to verify this with a small
> test case later, but for now I'll assume it to be true.
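>
> The test would be something along these lines (hypothetical row id),
> then checking on the main DB whether the row is still there:
>
>     BEGIN;
>     DELETE FROM main_events WHERE id = 12345;  -- goes through postgres_fdw
>     ROLLBACK;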
>
> Right now I'm thinking of falling back to the far-less-elegant method of
> dumping the data to a flat file via COPY, running psql to connect to the
> archive DB remotely and running a COPY to load the data (or maybe
> transferring the flat file to the archive DB and loading it there,
> offloading that part of the workload), then deleting the data from the
> main DB. I could record the rows dumped in a control table, compare the
> rows deleted against that, and roll back the DELETE if necessary.
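>
> The verification step on the main DB could look roughly like this
> (control table, cutoff, and names are placeholders):
>
>     BEGIN;
>     -- Count the rows as they are deleted.
>     WITH gone AS (
>         DELETE FROM events
>         WHERE created_at < '2018-02-10'
>         RETURNING 1
>     )
>     SELECT count(*) FROM gone;
>     -- Compare that count to the row count recorded in the control
>     -- table at dump time, then COMMIT on a match or ROLLBACK on a
>     -- mismatch.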
>
> Like I said, not elegant, but I don't want to risk losing data that wasn't
> successfully archived to the archive DB. I'm very interested to hear what
> others might be doing for tasks like this.
It might not be elegant, but a COPY / LOAD / DELETE sequence is granular, so
you can restart at any point.
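
For instance, a driver script can pin the cutoff once and guard each step
on its own artifact, so a failed run resumes where it stopped (hosts,
paths, and table names here are made up; a fuller version would also
record which steps completed):

    #!/bin/sh
    set -e
    CUTOFF='2018-02-10'            # computed once so steps 1 and 3 see the same rows
    DUMP="/tmp/archive_$CUTOFF.csv"

    # 1. Dump: skipped on restart if the flat file already exists.
    [ -s "$DUMP" ] || psql -h mainhost -d maindb -c "\copy (SELECT * FROM events WHERE created_at < '$CUTOFF') TO '$DUMP' CSV"

    # 2. Load: COPY runs in one transaction, so a failed load leaves
    #    nothing behind and can simply be rerun.
    psql -h archivehost -d archivedb -c "\copy events FROM '$DUMP' CSV"

    # 3. Delete from main only after the load has succeeded.
    psql -h mainhost -d maindb -c "DELETE FROM events WHERE created_at < '$CUTOFF'"
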
--
Angular momentum makes the world go 'round.