Re: COPY TO and VACUUM - Mailing list pgsql-performance

From Roberto Grandi
Subject Re: COPY TO and VACUUM
Date
Msg-id 519968828.69982.1378966451438.JavaMail.root@trovaprezzi.it
In response to Re: COPY TO and VACUUM  (Jeff Janes <jeff.janes@gmail.com>)
Responses Re: COPY TO and VACUUM  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance

Hi Guys,

we found a suitable solution for our process: every 5-6 hours we run a CLUSTER statement on our big table. This "locks"
activities but allows us to recover all available space.
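
For reference, the maintenance step looks roughly like the sketch below (the table and index names here are just
placeholders to illustrate the command):

    CLUSTER products USING products_pkey;  -- rewrites the table, taking an ACCESS EXCLUSIVE lock
    ANALYZE products;                      -- refresh planner statistics after the rewrite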

While testing this task we discovered another issue, and that's why I'm coming back to you for your experience:

during our process we run multiple simultaneous "COPY ... FROM" statements in order to load data into our table, but at
the same time we also run "COPY ... TO" statements in parallel to export data for other clients (see the sketch below).
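
Roughly, the two workloads look like this (table, column and file names are just placeholders):

    -- loader sessions, several running in parallel:
    COPY products FROM '/data/vendor_upload.csv' WITH (FORMAT csv);

    -- export sessions running at the same time:
    COPY (SELECT * FROM products WHERE vendor_id = 42)
        TO '/data/export_vendor_42.csv' WITH (FORMAT csv);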

We found that COPY ... TO queries sometimes remain pending for more than 100 minutes while the destination file stays
at 0 KB. Can you advise me how to solve this issue?
Is there a better way to bulk export data that avoids any kind of blocking when running in parallel?

Many thanks in advance


----- Original message -----
From: "Jeff Janes" <jeff.janes@gmail.com>
To: "Roberto Grandi" <roberto.grandi@trovaprezzi.it>
Cc: "Kevin Grittner" <kgrittn@ymail.com>, pgsql-performance@postgresql.org
Sent: Thursday, 5 September 2013 20:14:26
Subject: Re: [PERFORM] COPY TO and VACUUM

On Thu, Sep 5, 2013 at 9:05 AM, Roberto Grandi
<roberto.grandi@trovaprezzi.it> wrote:
> Hi Jeff,
>
> the problem is that when continuously uploading vendor listings to our "big" table, autovacuum is not able to free
> space as quickly as we would like.

It might not be able to free it (to be reused) as fast as you need it
to, but it should be freeing it eventually.

> Secondly, if we launch a VACUUM after each "upload" we collide with other uploads that are running in parallel.

I wouldn't do a manual vacuum after *each* upload.  Doing one after
every Nth upload, where N is estimated to make up about 1/5 of the
table, should be good.  You are probably IO limited, so you probably
don't gain much by running these uploads in parallel, I would try to
avoid that.  But in any case, there shouldn't be a collision between
manual vacuum and a concurrent upload.  There would be one between two
manual vacuums but you could code around that by explicitly locking
the table in the correct mode nowait or with a timeout, and skipping
the vacuum if it can't get the lock.
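
As a rough sketch (the table name is a placeholder; note that VACUUM cannot run
inside a transaction block, so the lock probe and the vacuum are two separate
steps and there is a small window between them):

    BEGIN;
    -- fails immediately if another manual VACUUM already holds this lock:
    LOCK TABLE products IN SHARE UPDATE EXCLUSIVE MODE NOWAIT;
    COMMIT;           -- release the probe lock
    VACUUM products;  -- run only if the probe succeeded, otherwise skip this round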

>
> Is it possible, from your point of view, to work with isolation levels or table partitioning to minimize table space
growth?

Partitioning by vendor might work well for that purpose.
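
For instance, with inheritance-based partitioning (names below are just
placeholders), each vendor refresh can become a TRUNCATE plus a fresh COPY on
that vendor's child table, which returns the space immediately instead of
leaving dead rows behind for vacuum to reclaim:

    CREATE TABLE listings (
        vendor_id  integer NOT NULL,
        product_id integer NOT NULL,
        price      numeric
    );

    CREATE TABLE listings_vendor_42 (
        CHECK (vendor_id = 42)
    ) INHERITS (listings);

    -- reload one vendor without bloating the parent:
    TRUNCATE listings_vendor_42;
    COPY listings_vendor_42 FROM '/data/vendor_42.csv' WITH (FORMAT csv);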

Cheers,

Jeff

