Thread: Query Plan Performance on Partitioned Table

Query Plan Performance on Partitioned Table

From

Rural Hunter

Date:

11 August 2015, 11:46:40

Platform: pg 9.2.9 on Ubuntu 12.04.4 LTS.
I have a table which is partitioned to about 80 children. There are usualy several dozens of connections accessing these tables concurrently. I found sometimes the query planing time is very long if I query against the parent table with partition key. The connections are shown with status 'BIND' by ps command.

In normal condition, the plan time of the query is about several hundred of million seconds while the same query accessing child table directly is less than 1 million seconds:
# explain select 1 from article where cid=729 and url_hash='6851f596f55a994b2df417b53523fe45';
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
Result (cost=0.00..8.68 rows=2 width=0)
   -> Append (cost=0.00..8.68 rows=2 width=0)
         -> Seq Scan on article (cost=0.00..0.00 rows=1 width=0)
               Filter: ((cid = 729) AND (url_hash = '6851f596f55a994b2df417b53523fe45'::bpchar))
         -> Index Scan using article_729_url_hash on article_729 article (cost=0.00..8.68 rows=1 width=0)
               Index Cond: (url_hash = '6851f596f55a994b2df417b53523fe45'::bpchar)
               Filter: (cid = 729)
(7 rows)

Time: 361.401 ms

# explain select 1 from article_729 where url_hash='6851f596f55a994b2df417b53523fe45';
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
Index Only Scan using article_729_url_hash on article_729 (cost=0.00..8.67 rows=1 width=0)
   Index Cond: (url_hash = '6851f596f55a994b2df417b53523fe45'::bpchar)
(2 rows)

Time: 0.898 ms

This is only in normal condition. In extreme condition, the planing time could take several minutes. There seems some locking issue in query planing. How can I increase the plan performance? Or is it bad to partition table to 80 children in PostgreSQL?

Re: Query Plan Performance on Partitioned Table

From

Maxim Boguk

Date:

11 August 2015, 14:44:25

On Tue, Aug 11, 2015 at 6:46 PM, Rural Hunter <ruralhunter@gmail.com> wrote:

Platform: pg 9.2.9 on Ubuntu 12.04.4 LTS.
I have a table which is partitioned to about 80 children. There are usualy several dozens of connections accessing these tables concurrently. I found sometimes the query planing time is very long if I query against the parent table with partition key. The connections are shown with status 'BIND' by ps command.

In normal condition, the plan time of the query is about several hundred of million seconds while the same query accessing child table directly is less than 1 million seconds:
# explain select 1 from article where cid=729 and url_hash='6851f596f55a994b2df417b53523fe45';
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
Result (cost=0.00..8.68 rows=2 width=0)
   -> Append (cost=0.00..8.68 rows=2 width=0)
         -> Seq Scan on article (cost=0.00..0.00 rows=1 width=0)
               Filter: ((cid = 729) AND (url_hash = '6851f596f55a994b2df417b53523fe45'::bpchar))
         -> Index Scan using article_729_url_hash on

article_729 article (cost=0.00..8.68 rows=1 width=0)
               Index Cond: (url_hash = '6851f596f55a994b2df417b53523fe45'::bpchar)
               Filter: (cid = 729)
(7 rows)

Time: 361.401 ms

# explain select 1 from article_729 where url_hash='6851f596f55a994b2df417b53523fe45';
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
Index Only Scan using article_729_url_hash on article_729 (cost=0.00..8.67 rows=1 width=0)
   Index Cond: (url_hash = '6851f596f55a994b2df417b53523fe45'::bpchar)
(2 rows)

Time: 0.898 ms

This is only in normal condition. In extreme condition, the planing time could take several minutes. There seems some locking issue in query planing. How can I increase the plan performance? Or is it bad to partition table to 80 children in PostgreSQL?

​Hi,

Could you provide full definition of ​​article_729 table (\dt+ ​article_729)?

80 partitions is adequate amount of partitions for the PostgreSQL, so there are going something unusual (I suspect it may be related to used partitioning schema).

Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Re: Query Plan Performance on Partitioned Table

From

Rural Hunter

Date:

11 August 2015, 16:44:20

2015-08-11 19:43 GMT+08:00 Maxim Boguk <maxim.boguk@gmail.com>:

On Tue, Aug 11, 2015 at 6:46 PM, Rural Hunter <ruralhunter@gmail.com> wrote:
Platform: pg 9.2.9 on Ubuntu 12.04.4 LTS.
I have a table which is partitioned to about 80 children. There are usualy several dozens of connections accessing these tables concurrently. I found sometimes the query planing time is very long if I query against the parent table with partition key. The connections are shown with status 'BIND' by ps command.

In normal condition, the plan time of the query is about several hundred of million seconds while the same query accessing child table directly is less than 1 million seconds:
# explain select 1 from article where cid=729 and url_hash='6851f596f55a994b2df417b53523fe45';
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
Result (cost=0.00..8.68 rows=2 width=0)
   -> Append (cost=0.00..8.68 rows=2 width=0)
         -> Seq Scan on article (cost=0.00..0.00 rows=1 width=0)
               Filter: ((cid = 729) AND (url_hash = '6851f596f55a994b2df417b53523fe45'::bpchar))
         -> Index Scan using article_729_url_hash on

article_729 article (cost=0.00..8.68 rows=1 width=0)
               Index Cond: (url_hash = '6851f596f55a994b2df417b53523fe45'::bpchar)
               Filter: (cid = 729)
(7 rows)

Time: 361.401 ms

# explain select 1 from article_729 where url_hash='6851f596f55a994b2df417b53523fe45';
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
Index Only Scan using article_729_url_hash on article_729 (cost=0.00..8.67 rows=1 width=0)
   Index Cond: (url_hash = '6851f596f55a994b2df417b53523fe45'::bpchar)
(2 rows)

Time: 0.898 ms

This is only in normal condition. In extreme condition, the planing time could take several minutes. There seems some locking issue in query planing. How can I increase the plan performance? Or is it bad to partition table to 80 children in PostgreSQL?

Hi,

Could you provide full definition of article_729 table (\dt+ article_729)?
80 partitions is adequate amount of partitions for the PostgreSQL, so there are going something unusual (I suspect it may be related to used partitioning schema).

--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Re: Query Plan Performance on Partitioned Table

From

Maxim Boguk

Date:

11 August 2015, 16:53:56

On Tue, Aug 11, 2015 at 11:44 PM, Rural Hunter <ruralhunter@gmail.com> wrote:

# \dt+

article_729
List of relations
Schema | Name | Type | Owner | Size | Description
--------+-------------+-------+--------+--------+-------------
public | article_729 | table | omuser1 | 655 MB |
(1 row)
The problem exists on not only this specific child table, but with all of them.

​Oops sorry, оf course I mean "\d+​

article_729

​" (to see criteria used for partitioning).​

Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Re: Query Plan Performance on Partitioned Table

From

Rural Hunter

Date:

11 August 2015, 17:01:01

2015-08-11 21:53 GMT+08:00 Maxim Boguk <maxim.boguk@gmail.com>:

On Tue, Aug 11, 2015 at 11:44 PM, Rural Hunter <ruralhunter@gmail.com> wrote:
# \dt+

article_729
List of relations
Schema | Name | Type | Owner | Size | Description
--------+-------------+-------+--------+--------+-------------
public | article_729 | table | omuser1 | 655 MB |
(1 row)
The problem exists on not only this specific child table, but with all of them.

Oops sorry, оf course I mean "\d+
article_729
" (to see criteria used for partitioning).

--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Re: Query Plan Performance on Partitioned Table

From

Rural Hunter

Date:

11 August 2015, 17:01:50

2015-08-11 22:00 GMT+08:00 Rural Hunter <ruralhunter@gmail.com>:

# \d article_729
                                     Table "public.article_729"
    Column    |            Type             |                       Modifiers
--------------+-----------------------------+-------------------------------------------------------
aid          | bigint                      | not null default nextval('article_aid_seq'::regclass)
style        | smallint                    | not null default 0
oaid         | bigint                      | default 0
fid          | integer                     |
bid          | integer                     | default 0
cid          | integer                     |
tid          | integer                     |
url          | text                        | default NULL::bpchar
tm_post      | timestamp without time zone |
tm_last_rply | timestamp without time zone |
author       | character varying(100)      | default NULL::bpchar
title        | character varying(255)      | default NULL::bpchar
content      | text                        |
ab_content   | text                        |
rply_cnt     | integer                     |
read_cnt     | integer                     |
url_hash     | character(32)               | not null
hash_plain   | text                        | default NULL::bpchar
title_hash   | character(32)               | default NULL::bpchar
guid         | character(32)               | default NULL::bpchar
neg_pos      | smallint                    | not null default 0
match_code   | character(32)               | default NULL::bpchar
tm_spider    | timestamp without time zone |
tm_update    | timestamp without time zone |
stage        | smallint                    | not null default 0
rply_cut     | integer                     | not null default 0
read_cut     | integer                     | not null default 0
src          | integer                     | default 0
rfid         | integer                     |
labels       | integer[]                   |
kwds         | integer[]                   |
like_cnt     | integer                     |
Indexes:
    "article_729_pkey" PRIMARY KEY, btree (aid), tablespace "indextbs"
    "article_729_url_hash" UNIQUE CONSTRAINT, btree (url_hash), tablespace "indextbs"
    "article_729_bid_titlehash_idx" btree (bid, title_hash), tablespace "indextbs"
    "article_729_fid_idx" btree (fid), tablespace "indextbs"
    "article_729_guid_idx" btree (guid), tablespace "indextbs"
    "article_729_labels_idx" gin (labels), tablespace "data1tbs"
    "article_729_mtcode_idx" btree (match_code), tablespace "indextbs"
    "article_729_rfid_author_idx" btree (rfid, author), tablespace "indextbs"
    "article_729_stage_idx" btree (stage), tablespace "data1tbs"
    "article_729_time_style_idx" btree (tm_post DESC, style), tablespace "data1tbs"
    "article_729_tm_spider_idx" btree (tm_spider), tablespace "indextbs"
    "article_729_tm_update_idx" btree (tm_update), tablespace "data1tbs"
Check constraints:
    "article_729_cid_check" CHECK (cid = 729)
Foreign-key constraints:
    "article_729_cid_fk" FOREIGN KEY (cid) REFERENCES company(cid) ON DELETE CASCADE
Triggers:
    trg_article_729_delete AFTER DELETE ON article_729 FOR EACH ROW EXECUTE PROCEDURE fn_article_delete()
    trg_article_729_insert AFTER INSERT ON article_729 FOR EACH ROW EXECUTE PROCEDURE fn_article_insert()
    trg_article_729_update AFTER UPDATE ON article_729 FOR EACH ROW EXECUTE PROCEDURE fn_article_update()
Inherits: article

2015-08-11 21:53 GMT+08:00 Maxim Boguk <maxim.boguk@gmail.com>:

On Tue, Aug 11, 2015 at 11:44 PM, Rural Hunter <ruralhunter@gmail.com> wrote:
# \dt+

article_729
                      List of relations
Schema |    Name     | Type | Owner | Size | Description
--------+-------------+-------+--------+--------+-------------
public | article_729 | table | omuser1 | 655 MB |
(1 row)
The problem exists on not only this specific child table, but with all of them.

Oops sorry, оf course I mean "\d+
article_729
" (to see criteria used for partitioning).

--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Re: Query Plan Performance on Partitioned Table

From

Maxim Boguk

Date:

11 August 2015, 17:43:06

Check constraints:
"article_729_cid_check" CHECK (cid = 729)

Used partition schema looks very simple and straightforward, and should have no issues with 80 partitions.
Are you sure that you have only 80 partitions but not (lets say) 800?

Are every other partition of the article table use the same general idea of partition check (cid=something)?

Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."

Re: Query Plan Performance on Partitioned Table

From

Rural Hunter

Date:

11 August 2015, 17:51:08

yes i'm very sure. from what i observed, it has something to do with the concurrent query planing. if i disconnect other connections, the plan is very quick.

2015-08-11 22:42 GMT+08:00 Maxim Boguk <maxim.boguk@gmail.com>:

Check constraints:
"article_729_cid_check" CHECK (cid = 729)

Used partition schema looks very simple and straightforward, and should have no issues with 80 partitions.
Are you sure that you have only 80 partitions but not (lets say) 800?
Are every other partition of the article table use the same general idea of partition check (cid=something)?

Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."

Re: Query Plan Performance on Partitioned Table

From

Pietro Pugni

Date:

11 August 2015, 20:03:31

Hi Rural Hunter,
Try to create an index on cid attribute.
How many rows has article_729?

Pietro Pugni

Il 11/ago/2015 16:51, "Rural Hunter" <ruralhunter@gmail.com> ha scritto:

yes i'm very sure. from what i observed, it has something to do with the concurrent query planing. if i disconnect other connections, the plan is very quick.

2015-08-11 22:42 GMT+08:00 Maxim Boguk <maxim.boguk@gmail.com>:

Check constraints:
"article_729_cid_check" CHECK (cid = 729)

Used partition schema looks very simple and straightforward, and should have no issues with 80 partitions.
Are you sure that you have only 80 partitions but not (lets say) 800?
Are every other partition of the article table use the same general idea of partition check (cid=something)?

Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."

Re: Query Plan Performance on Partitioned Table

From

Rural Hunter

Date:

12 August 2015, 04:49:49

article_729 has about 0.8 million rows. The rows of the children tables are variance from several thousands to dozens of millions. How can it help to create index on the partition key?

2015-08-12 1:03 GMT+08:00 Pietro Pugni <pietro.pugni@gmail.com>:

Hi Rural Hunter,
Try to create an index on cid attribute.
How many rows has article_729?

Pietro Pugni
Il 11/ago/2015 16:51, "Rural Hunter" <ruralhunter@gmail.com> ha scritto:
yes i'm very sure. from what i observed, it has something to do with the concurrent query planing. if i disconnect other connections, the plan is very quick.

2015-08-11 22:42 GMT+08:00 Maxim Boguk <maxim.boguk@gmail.com>:

Check constraints:
"article_729_cid_check" CHECK (cid = 729)

Used partition schema looks very simple and straightforward, and should have no issues with 80 partitions.
Are you sure that you have only 80 partitions but not (lets say) 800?
Are every other partition of the article table use the same general idea of partition check (cid=something)?

Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."

Re: Query Plan Performance on Partitioned Table

From

Pietro Pugni

Date:

12 August 2015, 10:00:53

You can give it a try only on that partition just to see if your query plan gets better. I prefer defining partitioning over ranging attributes like, for example: cid between 123 and 456. It makes more sense, especially when there are attributes which value strictly depends on the check attribute. Btw, dozens of millions is not a problem on modern systems. I remember of reading about a recommended 20 millions per partition but I usually work with 60 millions per partition without any problem.

Do you autovacuum? How frequently do the updates and insert operations occur?
Give us your configuration about work_mem, shared_buffers, max_connections etc. Kernel version? If possible avoid 3.2 and 3.8-3.13. Also think to upgrade your OS version.

From today I'm on vacancy, so others could help :)

Pietro Pugni

Il 12/ago/2015 03:49, "Rural Hunter" <ruralhunter@gmail.com> ha scritto:

article_729 has about 0.8 million rows. The rows of the children tables are variance from several thousands to dozens of millions. How can it help to create index on the partition key?

2015-08-12 1:03 GMT+08:00 Pietro Pugni <pietro.pugni@gmail.com>:
Hi Rural Hunter,
Try to create an index on cid attribute.
How many rows has article_729?

Pietro Pugni
Il 11/ago/2015 16:51, "Rural Hunter" <ruralhunter@gmail.com> ha scritto:
yes i'm very sure. from what i observed, it has something to do with the concurrent query planing. if i disconnect other connections, the plan is very quick.

2015-08-11 22:42 GMT+08:00 Maxim Boguk <maxim.boguk@gmail.com>:

Check constraints:
"article_729_cid_check" CHECK (cid = 729)

Used partition schema looks very simple and straightforward, and should have no issues with 80 partitions.
Are you sure that you have only 80 partitions but not (lets say) 800?
Are every other partition of the article table use the same general idea of partition check (cid=something)?

Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."

Re: Query Plan Performance on Partitioned Table

From

Rural Hunter

Date:

12 August 2015, 11:25:46

I tried to add index on partition key and it didn't help. we have autovacuum running. The updates and inserts are very frequent on these tables. The server kernel version is 3.5.0-22-generic. It has 376G memory.

max_connections = 2500            # (change requires restart)
shared_buffers = 32GB            # min 128kB
work_mem = 8MB                # min 64kB
maintenance_work_mem = 20GB        # min 1MB

We usually have around 400 active connections on the db. Most of them are idle. There are about 100 connections are in active status and I can see most of the time they are in 'BIND' status in ps command.

We have heavy IO load on the disk of the default tablespace where I believe table statistics tables are in. Will that impact the query planing greatly?

2015-08-12 15:00 GMT+08:00 Pietro Pugni <pietro.pugni@gmail.com>:

You can give it a try only on that partition just to see if your query plan gets better. I prefer defining partitioning over ranging attributes like, for example: cid between 123 and 456. It makes more sense, especially when there are attributes which value strictly depends on the check attribute. Btw, dozens of millions is not a problem on modern systems. I remember of reading about a recommended 20 millions per partition but I usually work with 60 millions per partition without any problem.
Do you autovacuum? How frequently do the updates and insert operations occur?
Give us your configuration about work_mem, shared_buffers, max_connections etc. Kernel version? If possible avoid 3.2 and 3.8-3.13. Also think to upgrade your OS version.
From today I'm on vacancy, so others could help :)
Pietro Pugni
Il 12/ago/2015 03:49, "Rural Hunter" <ruralhunter@gmail.com> ha scritto:
article_729 has about 0.8 million rows. The rows of the children tables are variance from several thousands to dozens of millions. How can it help to create index on the partition key?

2015-08-12 1:03 GMT+08:00 Pietro Pugni <pietro.pugni@gmail.com>:
Hi Rural Hunter,
Try to create an index on cid attribute.
How many rows has article_729?

Pietro Pugni
Il 11/ago/2015 16:51, "Rural Hunter" <ruralhunter@gmail.com> ha scritto:
yes i'm very sure. from what i observed, it has something to do with the concurrent query planing. if i disconnect other connections, the plan is very quick.

2015-08-11 22:42 GMT+08:00 Maxim Boguk <maxim.boguk@gmail.com>:

Check constraints:
"article_729_cid_check" CHECK (cid = 729)

Used partition schema looks very simple and straightforward, and should have no issues with 80 partitions.
Are you sure that you have only 80 partitions but not (lets say) 800?
Are every other partition of the article table use the same general idea of partition check (cid=something)?

Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim.boguk@gmail.com
МойКруг: http://mboguk.moikrug.ru/

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."