Re: Performance on Bulk Insert to Partitioned Table - Mailing list pgsql-performance

From Jeff Janes
Subject Re: Performance on Bulk Insert to Partitioned Table
Date
Msg-id CAMkU=1x0PpAjCe-C7nG=pxDWeVs3=9wLLQtKdsheOxmaJzK_7Q@mail.gmail.com
Whole thread Raw
In response to Re: Performance on Bulk Insert to Partitioned Table  (Charles Gomes <charlesrg@outlook.com>)
Responses Re: Performance on Bulk Insert to Partitioned Table  (Charles Gomes <charlesrg@outlook.com>)
List pgsql-performance


On Thursday, December 20, 2012, Charles Gomes wrote:
Jeff,

The 8288 writes are fine, as the array has a BBU, it's fine. You see about 4% of the utilization.

BBU is great for latency, but it doesn't do much for throughput, unless it is doing write combining behind the scenes.  Is it HDD or SSD behind the BBU?  Have you bench-marked it on randomly scattered 8k writes?

I've seen %util reports that were low while watching a strace showed obvious IO freezes.  So I don't know how much faith to put into low %util.  

 

To target directly instead of doing :
INSERT INTO TABLE VALUES ()
I use:
INSERT INTO TABLE_PARTITION_01 VALUES()

But how is it deciding what partition to use?  Does it have to re-decide for every row, or does each thread serve only one partition throughout its life and so makes the decision only once?

 

By targeting it I see a huge performance increase.

But is that because by targeting you are by-passing the the over-head of triggers, or is it because you are loading the rows in an order which leads to more efficient index maintenance?

 
I haven't tested using 1Billion rows in a single table. The issue is that in the future it will grow to more than 1 billion rows, it will get to about 4Billion rows and that's when I believe partition would be a major improvement.

The way that partitioning gives you performance improvements is by you embracing the partitioning, for example by targeting the loading to just one partition without any indexes, creating indexes, and then atomically attaching it to the table.  If you wish to have partitions, but want to use triggers to hide that partitioning from you, then I don't think you can expect to get much of a speed up through using partitions.

Any way, the way I would approach it would be to load to a single un-partitioned table, and also load to a single dummy-partitioned table which uses a trigger that looks like the one you want to use for real, but directs all rows to a single partition.  If these loads take the same time, you know it is not the trigger which is limiting.

Cheers,

Jeff

pgsql-performance by date:

Previous
From: Richard Neill
Date:
Subject: Re: Why does the query planner use two full indexes, when a dedicated partial index exists?
Next
From: Jeff Janes
Date:
Subject: Re: Why does the query planner use two full indexes, when a dedicated partial index exists?