RE: Add column with default value in big table - splitting of updates can help? - Mailing list pgsql-general

From Kevin Brannen
Subject RE: Add column with default value in big table - splitting of updates can help?
Date
Msg-id SN6PR19MB2272970C02CD4F5ED535A7BFA4030@SN6PR19MB2272.namprd19.prod.outlook.com
In response to Re: Add column with default value in big table - splitting of updates can help?  (Durumdara <durumdara@gmail.com>)
List pgsql-general
>From: Durumdara <durumdara@gmail.com>
>
>a.)
>PG version is mainly 9.6, but some of the servers are 10.x or 11.x.
>b.)
>We have a semi-automatic tool which collects the new modifications for the databases and executes them at once, one
>SQL script per database, under one transaction - all or nothing. If something fails, we know where to start again by
>hand. It is impossible to execute only the first half, and we don't know which statements executed and which did not.

Unless you have some special requirement, you don't have to do it all or
nothing. As Depesz points out, you can do it in multiple transactions just
fine. We do it that way here all the time. :)

What it means is that you have to write guards or checks into your upgrade script.
In some places that's easy, because the statement itself supports IF NOT EXISTS.
For the places where running the same statement twice would cause an error, put a
check around it: a plpgsql DO block allows conditional logic, so you do the work
only when the check shows it hasn't been done yet (the information_schema and
pg_catalog tables are your friends for this).
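
For example, here's a minimal sketch of such a guard (the table and column
names are placeholders; on 9.6+ you could use ALTER TABLE ... ADD COLUMN IF
NOT EXISTS for this particular case, but the same pattern works for statements
that have no IF NOT EXISTS form):

    -- Add the column only if it is not already there, so reruns are harmless.
    DO $$
    BEGIN
        IF NOT EXISTS (
            SELECT 1
            FROM information_schema.columns
            WHERE table_schema = 'public'
              AND table_name   = 'my_table'
              AND column_name  = 'new_col'
        ) THEN
            ALTER TABLE public.my_table ADD COLUMN new_col integer;
        END IF;
    END
    $$;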

The point is that you can run your upgrade script as many times as needed should
it stop partway for any reason. Rerunning the upgrade script should never
cause an error if you've coded it correctly.

>
>The main problem is that sometimes we have to modify tables which have too many records in some customer databases.

I'm going to go with everyone else here, because it works. We tend to do updates
in blocks of 10K records at a time. Do some tests and figure out what works best
for your setup (maybe you have enough memory to do 100K chunks). Whatever you
do, make sure that the column you use to divide the work has an index on it! Use
the primary key if it's an INT. If you have no suitable index, find a column you can
create an index on for this work and drop it at the end; that will be far faster
than doing a bunch of full table scans.
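
As a rough sketch of that chunking applied to your original problem (the table,
column, and key names here are placeholder assumptions): add the column without
a default so it's instant, backfill in chunks run outside the single wrapping
transaction so each chunk commits on its own, then attach the default.

    -- Instant on any version: no default means no table rewrite.
    ALTER TABLE my_table ADD COLUMN new_col integer;

    -- Backfill in 10K-row chunks keyed on the indexed INT primary key "id".
    -- Each UPDATE commits separately, and the IS NULL guard makes reruns safe.
    UPDATE my_table SET new_col = 0 WHERE id >= 1     AND id < 10001 AND new_col IS NULL;
    UPDATE my_table SET new_col = 0 WHERE id >= 10001 AND id < 20001 AND new_col IS NULL;
    -- ...repeat up to max(id), then:

    ALTER TABLE my_table ALTER COLUMN new_col SET DEFAULT 0;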

Yes, it's probably a very good idea to upgrade to a newer version if you can,
as performance improvements come with each new version. In particular, PostgreSQL 11
can add a column with a constant default without rewriting the whole table.

HTH,
Kevin
