Thread: question on error during COPY FROM
Hello,
in the documentation I read
COPY stops operation at the first error. This should not lead to problems in the event of a COPY TO, but the target table will already have received earlier rows in a COPY FROM. These rows will not be visible or accessible, but they still occupy disk space. This might amount to a considerable amount of wasted disk space if the failure happened well into a large copy operation. You might wish to invoke VACUUM to recover the wasted space.
does that mean that I should always execute a VACUUM to recover the wasted space when an error is triggered or will the auto-vacuum mechanism do the job by itself ?
Thanks
Jerome
Hi. >does that mean that I should always execute a VACUUM to recover the >wasted space when an error is triggered or will the auto-vacuum mechanism >do the job by itself ? If you have autovacuum enabled it will clean up tablespace. However, space will not be returned to filesystem but will bereused by database. You may run VACUUM FULL manually to return it to filesystem. https://www.postgresql.org/docs/9.1/static/routine-vacuuming.html Ilya Kazakevich JetBrains http://www.jetbrains.com The Drive to Develop
On Tue, Aug 23, 2016 at 2:32 PM, Ilya Kazakevich <Ilya.Kazakevich@jetbrains.com> wrote: >>does that mean that I should always execute a VACUUM to recover the >>wasted space when an error is triggered or will the auto-vacuum mechanism >>do the job by itself ? > If you have autovacuum enabled it will clean up tablespace. However, space will not be returned to filesystem but willbe reused by database. > You may run VACUUM FULL manually to return it to filesystem. A normal vacuum may also return some space, specially after a big bulk load, see second paragraph of 23.1.2 the URL you posted: > https://www.postgresql.org/docs/9.1/static/routine-vacuuming.html Where it says "However, it will not return the space to the operating system, except in the special case where one or more pages at the end of a table become entirely free and an exclusive table lock can be easily obtained.". A big aborted bulk load may just fit the case, as it may put a lot of tuples at new pages at the end and be executed in a low-load period where the lock is easier to acquire. Francisco Olarte.
Is it true that one datafile in PG can only belong to one object (table/index) On Tue, Aug 23, 2016 at 9:55 AM, Francisco Olarte <folarte@peoplecall.com> wrote: > On Tue, Aug 23, 2016 at 2:32 PM, Ilya Kazakevich > <Ilya.Kazakevich@jetbrains.com> wrote: >>>does that mean that I should always execute a VACUUM to recover the >>>wasted space when an error is triggered or will the auto-vacuum mechanism >>>do the job by itself ? >> If you have autovacuum enabled it will clean up tablespace. However, space will not be returned to filesystem but willbe reused by database. >> You may run VACUUM FULL manually to return it to filesystem. > > A normal vacuum may also return some space, specially after a big bulk > load, see second paragraph of 23.1.2 the URL you posted: >> https://www.postgresql.org/docs/9.1/static/routine-vacuuming.html > > Where it says "However, it will not return the space to the operating > system, except in the special case where one or more pages at the end > of a table become entirely free and an exclusive table lock can be > easily obtained.". A big aborted bulk load may just fit the case, as > it may put a lot of tuples at new pages at the end and be executed in > a low-load period where the lock is easier to acquire. > > > Francisco Olarte. > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
On 08/23/2016 07:06 AM, Rakesh Kumar wrote: > Is it true that one datafile in PG can only belong to one object (table/index) Yes, assuming by datafile you mean an on disk file. Though one object may have many in disk files associated with it: https://www.postgresql.org/docs/9.5/static/storage-file-layout.html Details start just under Table 63-1. Contents of PGDATA > > On Tue, Aug 23, 2016 at 9:55 AM, Francisco Olarte > <folarte@peoplecall.com> wrote: >> On Tue, Aug 23, 2016 at 2:32 PM, Ilya Kazakevich >> <Ilya.Kazakevich@jetbrains.com> wrote: >>>> does that mean that I should always execute a VACUUM to recover the >>>> wasted space when an error is triggered or will the auto-vacuum mechanism >>>> do the job by itself ? >>> If you have autovacuum enabled it will clean up tablespace. However, space will not be returned to filesystem but willbe reused by database. >>> You may run VACUUM FULL manually to return it to filesystem. >> >> A normal vacuum may also return some space, specially after a big bulk >> load, see second paragraph of 23.1.2 the URL you posted: >>> https://www.postgresql.org/docs/9.1/static/routine-vacuuming.html >> >> Where it says "However, it will not return the space to the operating >> system, except in the special case where one or more pages at the end >> of a table become entirely free and an exclusive table lock can be >> easily obtained.". A big aborted bulk load may just fit the case, as >> it may put a lot of tuples at new pages at the end and be executed in >> a low-load period where the lock is easier to acquire. >> >> >> Francisco Olarte. >> >> >> -- >> Sent via pgsql-general mailing list (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > -- Adrian Klaver adrian.klaver@aklaver.com
On Tue, Aug 23, 2016 at 4:06 PM, Rakesh Kumar <rakeshkumar464a3@gmail.com> wrote: > Is it true that one datafile in PG can only belong to one object (table/index) If this is a question, yes, AFAIK ( in fact they are split in 1G chunks to prevent problems with quirky filesystems ). Search for "Each table and index is stored in a separate file" in https://www.postgresql.org/docs/9.5/static/storage-file-layout.html A full periodic read, even if superficial, of the postgres manual, is quite beneficial. I feel poetic today ;-> . Seriously, is quite well written and full of interesting information, even if skipping large chunks knowing where the info is can sava you a lot of work and mails. AAMOF, it's one of the main reasons I've been using postgres all this years. Francisco Olarte.
On Tue, Aug 23, 2016 at 6:04 AM, Jerome Wagner <jerome.wagner@laposte.net> wrote:
Hello,in the documentation I readCOPY stops operation at the first error. This should not lead to problems in the event of a COPY TO, but the target table will already have received earlier rows in a COPY FROM. These rows will not be visible or accessible, but they still occupy disk space. This might amount to a considerable amount of wasted disk space if the failure happened well into a large copy operation. You might wish to invoke VACUUM to recover the wasted space.does that mean that I should always execute a VACUUM to recover the wasted space when an error is triggered or will the auto-vacuum mechanism do the job by itself ?ThanksJerome
>but the target table will already have received earlier rows in a COPY FROM
No, that is not what happens. The entire copy is not permanently written to disk until a COMMIT or the entire statement is completed (;). On first error, a ROLLBACK is done so there is no wasted space. That being said, it is always good practice to schedule a VACUUM ANALYZE (but not FULL) in the off hours to insure a cleanup and fresh statistics. --
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.