Re: speed up full table scan using psql - Mailing list pgsql-general

From David Rowley
Subject Re: speed up full table scan using psql
Date
Msg-id CAApHDvp7p_DL4+Gqc+MqKNUvG6E1LrKJ6koD5s=QTjbeTSKqsg@mail.gmail.com
Whole thread Raw
In response to speed up full table scan using psql  (Lian Jiang <jiangok2006@gmail.com>)
Responses Re: speed up full table scan using psql
List pgsql-general
On Wed, 31 May 2023 at 16:26, Lian Jiang <jiangok2006@gmail.com> wrote:
> I am using psql to periodically dump the postgres tables into json files which are imported into snowflake. For large
tables(e.g. 70M rows), it takes hours for psql to complete. Using spark to read the postgres table seems not to work as
thepostgres read only replication is the bottleneck so spark cluster never uses >1 worker node and the working node
timeoutor out of memory. 
>
> Will vertical scaling the postgres db speed up psql? Or any thread related parameter of psql can help? Thanks for any
hints.

This is probably slow due to psql's alignment.  It needs to read the
entire result to know how much to whitespace to pad columns with
before it can output anything. Naturally, that requires quite a bit of
memory when large tables are queried and also lots of length checking.

As mentioned by Rob, you'll probably want to use COPY, or you could
look at using pg_dump.  Something like:

pg_dump --inserts --rows-per-insert=100 --table=<name of your table to
export> --data-only <database name>

should give you something that you might have some luck directly
importing to Snowflake without any transformation tool required.

However, if you do need to do some transformation before loading, then
you might want to do that in PostgreSQL.  For that, something like:

psql -c "copy (select col1,col2,col3+col4 from your_table) to stdout"
<name of database>

would allow you to run a query, which you could maybe do your
transformations in before importing into Snowflake

David



pgsql-general by date:

Previous
From: Rob Sargent
Date:
Subject: Re: speed up full table scan using psql
Next
From: Lian Jiang
Date:
Subject: Re: speed up full table scan using psql