On 4/2/20 9:42 PM, James Brauman wrote:
> I am using the COPY command to insert 10 million rows from a CSV file
> into a database table and I am finding the performance is
> unacceptable. When the COPY command is executed the disk I/O is
> extremely high which leads to degraded query performance for other
> queries being executed on the database.
>
> I have tried removing the indexes from the database table and this
> dramatically improved performance (both reducing the execution time
> and reducing disk I/O).
Is there the option to use tablespaces to move the index(s) to another
disk?:
https://www.postgresql.org/docs/12/sql-alterindex.html
"
SET TABLESPACE
This form changes the index's tablespace to the specified
tablespace and moves the data file(s) associated with the index to the
new tablespace. ..."
>
> The constraints I am working with are:
> - Indexes must be in place to ensure that other queries that are
> executed on the table while the COPY command is executing have
> acceptable performance.
> - The table should not be locked for reads during the COPY (it is
> fine to lock for writes).
>
> I don't know how COPY is implemented, but it seems that the indexes
> are updated as data is inserted into the table. Ideally I would like
> to delay updating the indexes until the COPY command has completed and
> just update the indexes a single time, I think this would lead to much
> better performance. Is this possible?
>
> Another idea that I had was to start a transaction, drop the indexes,
> run the COPY command and then recreate the indexes. The problem with
> this approach is that DROP INDEX acquires an exclusive lock on the
> table, which is not acceptable in this system (otherwise other queries
> won't be able to execute). I tried using DROP INDEX CONCURRENTLY, but
> that cannot be executed inside a transaction.
>
> What are my options in this scenario?
>
> Thanks
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com