Re: [PERFORM] performance problem on big tables - Mailing list pgsql-performance

From MichaelDBA
Subject Re: [PERFORM] performance problem on big tables
Date
Msg-id 599AE654.9010406@sqlexec.com
In response to Re: [PERFORM] performance problem on big tables  (Mariel Cherkassky <mariel.cherkassky@gmail.com>)
Responses Re: [PERFORM] performance problem on big tables
List pgsql-performance
Maybe I missed it in this continuous thread activity, but have you tried "ora2pg"?  You can export from Oracle and import to Postgres in parallel jobs.  The import commands use the efficient COPY command by default (unless you override it in the ora2pg configuration file).  You can do the export and subsequent import in memory, but I would suggest an actual file export and import so you can take advantage of the parallel feature.
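
A rough sketch of what that could look like (host, credentials, database names and the output directory are placeholders, and the directive names should be checked against your installed ora2pg version): export the data as COPY statements using parallel jobs, then load the generated file into Postgres.

cat > ora2pg.conf <<'EOF'
ORACLE_DSN      dbi:Oracle:host=oracle_host;sid=ORCL;port=1521
ORACLE_USER     scott
ORACLE_PWD      secret
TYPE            COPY
OUTPUT          data.sql
JOBS            4
PARALLEL_TABLES 4
EOF
ora2pg -c ora2pg.conf -b ./export      # export the Oracle data as COPY statements
psql -d mydb -f ./export/data.sql      # load the exported data into Postgres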

Regards,
Michael Vitale

Monday, August 21, 2017 8:22 AM
I'm searching for a way to improve the current performance. I'm not interested in using a different tool or writing something new, because I'm trying to migrate a system from an Oracle database to a PostgreSQL database.


Monday, August 21, 2017 4:37 AM

On 21 Aug 2017, at 10:00, Mariel Cherkassky <mariel.cherkassky@gmail.com> wrote:

To summarize, I still have performance problems. My current situation:

I'm trying to copy the data of many tables in the Oracle database into my PostgreSQL tables. I'm doing so by running insert into local_postgresql_temp select * from remote_oracle_table. The performance of this operation is very slow, so I tried to find the reason for that and maybe choose a different alternative.

1) First method - insert into local_postgresql_table select * from remote_oracle_table. This generated a total disk write of 7 MB/s and an actual disk write of 4 MB/s (iotop). For a 32 GB table it took me 2 hours and 30 minutes.

2) Second method - copy (select * from oracle_remote_table) to /tmp/dump. This generates a total disk write of 4 MB/s and an actual disk write of 100 KB/s. The COPY utility is supposed to be very fast, but it seems very slow.




Are you using an FDW to access the Oracle server and then dumping it using COPY? That is going to be slow; the FDW isn't fast.


- When I run COPY from the local dump, reading is very fast: 300 MB/s.



You reported it was slow before. What has changed? How long does it take to load the 32 GB table then?


- I created a 32 GB file on the Oracle server and used scp to copy it, and it took only a few minutes.


- The WAL directory is located on a different file system. The parameters I assigned:

min_parallel_relation_size = 200MB
max_parallel_workers_per_gather = 5 
max_worker_processes = 8 
effective_cache_size = 12GB
work_mem = 128MB
maintenance_work_mem = 4GB
shared_buffers = 2000MB
RAM : 16G
CPU CORES : 8

How can I increase the write throughput? How can I get the data from the Oracle database into my PostgreSQL database faster?




Extract the table to a file on the Oracle server in a format that the COPY utility can read, then copy it to the Postgres server and load it. You can even pipe the commands and do it in a single step.
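
A minimal sketch of that single-step pipeline: the host names, credentials and table names are placeholders, and dump_mytable.sql is a hypothetical script that simply selects the table with CSV-friendly sqlplus settings.

ssh oracle_host 'sqlplus -s scott/secret@ORCL @dump_mytable.sql' \
  | psql -d mydb -c "COPY local_table FROM STDIN WITH (FORMAT csv)"

COPY FROM STDIN streams the rows straight into the table, so no intermediate file is written on either server.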

This is what I meant when I said that COPY is much faster than anything else. To make it even faster, if I/O is not your bottleneck, you can chop the table into chunks and load them in parallel, as I told you before; I have done this many times when migrating data from Oracle to Postgres. ora2pg uses this method to migrate data from Oracle to Postgres too.
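
For example, once the table has been exported as several chunk files, something like this sketch (file paths, table name and job count are placeholders) loads them with four parallel COPY jobs:

ls /data/export/local_table_part_*.csv | xargs -P 4 -I{} \
  psql -d mydb -c "\copy local_table FROM '{}' CSV"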


2017-08-20 14:00 GMT+03:00 Mariel Cherkassky <mariel.cherkassky@gmail.com>:
I noticed something weird. When I'm using PostgreSQL's COPY utility to create a dump from a local table in my PostgreSQL db, it takes 20 minutes for a 32 GB table. When I try to use COPY on a foreign table (on the Oracle database), it takes more than 2 hours. During the COPY operation from the foreign table I don't see a lot of write activity; with iotop I see that it writes 3 MB/s. What else can I check?

2017-08-20 9:39 GMT+03:00 Mariel Cherkassky <mariel.cherkassky@gmail.com>:
This server is dedicated to being a PostgreSQL production database, so PostgreSQL is the only thing that runs on the server. The file system I'm using is xfs. I'll add two different disks - one for the WAL and one for the temp tablespace. Regarding the disks, what size should they be, considering that the database size is about 250 GB? Is 16 GB of RAM considered little? I installed iotop and I see that the PostgreSQL writer is writing most of the time, more than anything else.

I mentioned that I perform a lot of insert into table select * from table. Before that I remove indexes and constraints and truncate the table. Should I run vacuum before or after the operation?
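
For reference, the load sequence being described looks roughly like the sketch below (table, index and constraint names are placeholders); ANALYZE at the end refreshes the planner statistics on the freshly loaded data.

psql -d mydb <<'SQL'
TRUNCATE local_table;
DROP INDEX IF EXISTS local_table_idx;                             -- drop indexes before the bulk load
ALTER TABLE local_table DROP CONSTRAINT IF EXISTS local_table_fk;
INSERT INTO local_table SELECT * FROM remote_oracle_table;        -- bulk load through the foreign table
CREATE INDEX local_table_idx ON local_table (id);                 -- recreate indexes afterwards
ANALYZE local_table;                                              -- refresh statistics on the new data
SQL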

2017-08-17 19:37 GMT+03:00 Claudio Freire <klaussfreire@gmail.com>:
On Thu, Aug 17, 2017 at 6:00 AM, Mariel Cherkassky
<mariel.cherkassky@gmail.com> wrote:
> I checked with the storage team in the company and they saw that I have a lot
> of I/O on the server. How should I reduce the I/O that PostgreSQL uses?

Do you have concurrent activity on that server?

What filesystem are you using wherever the data is sitting?

If you've got concurrent fsyncs happening, some filesystems handle
that poorly. When you've got WAL and data mixed in a single disk, or
worse, filesystem, it happens often that the filesystem won't handle
the write barriers for the WAL efficiently. I/O gets intermingled with
bulk operations, and even small fsyncs will have to flush writes from
bulk operations, which makes a mess of things.

It is a very good idea, and in fact a recommended practice, to put WAL
on its own disk for that reason mainly.
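
A sketch of one common way to do that, assuming a dedicated disk is already mounted at a placeholder path (the directory is pg_xlog on the 9.x series, pg_wal on 10+), with the server stopped:

systemctl stop postgresql
mv /var/lib/pgsql/data/pg_xlog /mnt/waldisk/pg_xlog       # move WAL onto the dedicated disk
ln -s /mnt/waldisk/pg_xlog /var/lib/pgsql/data/pg_xlog    # leave a symlink in the data directory
systemctl start postgresql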

With that little RAM, you'll also probably cause a lot of I/O in temp
files, so I'd also recommend setting aside another disk for a temp
tablespace so that I/O doesn't block other transactions as well.
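
A minimal sketch of that, assuming the mount point, tablespace name and database name below are placeholders and the extra disk is already mounted:

mkdir -p /mnt/tempdisk/pg_temp && chown postgres:postgres /mnt/tempdisk/pg_temp
psql -d mydb <<'SQL'
CREATE TABLESPACE temp_space LOCATION '/mnt/tempdisk/pg_temp';
ALTER SYSTEM SET temp_tablespaces = 'temp_space';   -- send temp files to the new tablespace
SELECT pg_reload_conf();
SQL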

This is all assuming you've got concurrent activity on the server. If
not, install iotop and try to see who's causing that much I/O.

