Thread: partitioned table set and indexes
I'm using PostgreSQL 9.5 Beta 2.
I am working with a partitioned table set.
The first thing I noticed, when creating indexes on the 20 or so partitions, was that if I create them too fast they don't all succeed. I have to do a few at a time, let them breathe for a few seconds, and then do a few more. I had been simply generating all of the create index commands in a text editor, and then cutting and pasting the lot of them into psql all at once or running them by using psql '-f'. Most would get created, but not all. It seems almost random. There were no obvious error messages. When I do a few at a time, it is never an issue.
This tripped me up because I couldn't figure out why some of the child tables were sequence scanning and some were not. It turned out that some of the partitions were missing some of the indexes. I'm mentioning it here just in case someone else is observing strange behaviour where some children are scanning and some aren't. You might not have all of your indexes deployed correctly.
--
Anyway, the issue I am trying to figure out at the moment:
If I do a simple query with a where clause on a specific column from the parent table, I can see it index scan each of the children. This is what I want it to do, so no complaints there.
However, if I try to (inner) join another table with that column, the planner sequence scans each of the children instead of using the indexes. I saw someone had posted a similar question to this list back in January, however I didn't see the answer.
What should I look at to try to figure out why a join doesn't use the indexes while a straight query on the same column for the table does?
FWIW, the column in question is a UUID column and is the primary key for each of the child tables.
--
Rick.
Rick Otten <rottenwindfish@gmail.com> wrote: > I'm using PostgreSQL 9.5 Beta 2. > > I am working with a partitioned table set. > > The first thing I noticed, when creating indexes on the 20 or so partitions, > was that if I create them too fast they don't all succeed. I have to do a few > at a time, let them breathe for a few seconds, and then do a few more. I had > been simply generating all of the create index commands in a text editor, and > then cutting and pasting the lot of them into psql all at once or running them > by using psql '-f'. Most would get created, but not all. It seems almost > random. There were no obvious error messages. When I do a few at a time, it > is never an issue. Sure? Have you checked that? > If I do a simple query with a where clause on a specific column from the parent > table, I can see it index scan each of the children. This is what I want it to > do, so no complaints there. > > However, if I try to (inner) join another table with that column, the planner > sequence scans each of the children instead of using the indexes. I saw > someone had posted a similar question to this list back in January, however I > didn't see the answer. Show us the output from explain analyse <your query> > FWIW, the column in question is a UUID column and is the primary key for each > of the child tables. PostgreSQL using a cost-modell, so maybe there are not enough rows in the table. That's just a guess, you can see that with explain analyse ... Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknown) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
I do not know why if I blast a new index creation on the 20 or so children all at once some of them fail, but then if I go back and do a few at a time they all work. It has happened to me 3 times now, so I'm pretty sure I'm not imagining it.
Why would the select statement use the index, but not the join?
There used to be an explain output anonymizer tool, if I can find that again, I'll send along the output. It has been a few years since I posted a question to this list so I don't think I have a bookmark for it any more.... Hmmm. I'll look around.
Meanwhile:
--
select
*
from
my_parent_table
where
mypk = 'something';
Uses an index scan on each of my_parent_table's children except for a couple of them that don't have a lot of rows, and those are sequence scanned. (which is ok)
--
select
*
from
some_other_table sot
join my_parent_table mpt on sot.some_column = mpt.mypk
where
sot.another_column = 'q'
Sequence scans each of my_parent_table's children. (It doesn't matter which order I put the join.)
--
select
*
from
some_other_table sot
join my_parent_table mpt on sot.some_column = mpt.mypk
where
mpt.column_3 = 'z'
and
sot.another_column = 'q'
Index scans my_parent_table's children on column_3 (except for the couple with only a few rows), and doesn't sequence scan for the mypk column at all.
On Fri, Dec 11, 2015 at 2:44 PM, Andreas Kretschmer <akretschmer@spamfence.net> wrote:
Rick Otten <rottenwindfish@gmail.com> wrote:
> I'm using PostgreSQL 9.5 Beta 2.
>
> I am working with a partitioned table set.
>
> The first thing I noticed, when creating indexes on the 20 or so partitions,
> was that if I create them too fast they don't all succeed. I have to do a few
> at a time, let them breathe for a few seconds, and then do a few more. I had
> been simply generating all of the create index commands in a text editor, and
> then cutting and pasting the lot of them into psql all at once or running them
> by using psql '-f'. Most would get created, but not all. It seems almost
> random. There were no obvious error messages. When I do a few at a time, it
> is never an issue.
Sure? Have you checked that?
> If I do a simple query with a where clause on a specific column from the parent
> table, I can see it index scan each of the children. This is what I want it to
> do, so no complaints there.
>
> However, if I try to (inner) join another table with that column, the planner
> sequence scans each of the children instead of using the indexes. I saw
> someone had posted a similar question to this list back in January, however I
> didn't see the answer.
Show us the output from explain analyse <your query>
> FWIW, the column in question is a UUID column and is the primary key for each
> of the child tables.
PostgreSQL using a cost-modell, so maybe there are not enough rows in
the table. That's just a guess, you can see that with explain analyse
...
Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
On Fri, Dec 11, 2015 at 1:01 PM, Rick Otten <rottenwindfish@gmail.com> wrote: > The first thing I noticed, when creating indexes on the 20 or so partitions, > was that if I create them too fast they don't all succeed. I have to do a > few at a time, let them breathe for a few seconds, and then do a few more. > I had been simply generating all of the create index commands in a text > editor, and then cutting and pasting the lot of them into psql all at once I have seen problems with OS "paste" functionality dropping chunks of pasted text if it was too big. > or running them by using psql '-f'. ... but I would be surprised if that happened when reading from a file. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
> Rick Otten <rottenwindfish@gmail.com> hat am 11. Dezember 2015 um 21:40 > geschrieben: > > > I do not know why if I blast a new index creation on the 20 or so children > all at once some of them fail, but then if I go back and do a few at a time > they all work. It has happened to me 3 times now, so I'm pretty sure I'm > not imagining it. don't believe that, sorry. > > What specifically in the explain analyze output tells you that it is using > a sequence scan instead of an index scan _because_ there are too few rows? > I can see where it chooses a sequence scan over an index and I know there > are only a few rows in those tables, but I'm not sure how the explain > output tells you that it made that choice on purpose. a sequentiell scan over a small table are cheaper than an index-scan. Imageine a small table, only 3 rows. Fits in one page. It's cheaper to read just this page than read the index plus read the table to put out the result row. Why are you using partitioning? That's make only sense with large child-tables (more than 1 million rows or so) and if you have a proper partitioning schema. > > Why would the select statement use the index, but not the join? > > There used to be an explain output anonymizer tool, if I can find that > again, I'll send along the output. It has been a few years since I posted > a question to this list so I don't think I have a bookmark for it any > more.... Hmmm. I'll look around. http://explain.depesz.com/
Ok, here is the first case where I select on the column:
Here is the second case where I try a join:
And here is the third case where I add a filter on the parent table:
The primary use case for partitioning is for performance gains when working with very large tables. I agree these are not that large and by itself it does not justify the extra complexity of working with partitioning.
However there are other use cases for the partitioning model. In our case we have legacy business processes that swap out the child tables and operate on them independently from each other. They could be refactored to work together within one big table, but that is a project for another day. The segmentation of the data into structurally consistent but related separate tables is the first step in that direction. (previously they were all different from each other, but similar too) Some of these children tables will hit 1M rows by the end of 2016, but it will take a while for them to grow to that size.
I do have another table with many millions of rows that could use partitioning, and eventually I'll split that one up - probably around the time I merge this one into a single table. First I have to finish getting everything off of MySQL...
The query performance hit for sequence scanning isn't all that terrible, but I'd rather understand and get rid of the issue if I can, now, before I run into it again in a situation where it is crippling.
Thank you for your help with this!
--
ps: You don't have to believe me about the bulk index adding thing. I hardly believe it myself. It is just something to keep an eye out for. If it is a real issue, I ought to be able to build a reproducible test case to share - at that time I'll see if I can open it up as a real bug. For now I'd rather focus on understanding why my select uses an index and a join won't.
On Fri, Dec 11, 2015 at 4:44 PM, Andreas Kretschmer <andreas@a-kretschmer.de> wrote:
> Rick Otten <rottenwindfish@gmail.com> hat am 11. Dezember 2015 um 21:40
> geschrieben:
>
>
> I do not know why if I blast a new index creation on the 20 or so children
> all at once some of them fail, but then if I go back and do a few at a time
> they all work. It has happened to me 3 times now, so I'm pretty sure I'm
> not imagining it.
don't believe that, sorry.
>
> What specifically in the explain analyze output tells you that it is using
> a sequence scan instead of an index scan _because_ there are too few rows?
> I can see where it chooses a sequence scan over an index and I know there
> are only a few rows in those tables, but I'm not sure how the explain
> output tells you that it made that choice on purpose.
a sequentiell scan over a small table are cheaper than an index-scan. Imageine a
small table,
only 3 rows. Fits in one page. It's cheaper to read just this page than read the
index
plus read the table to put out the result row.
Why are you using partitioning? That's make only sense with large child-tables
(more than 1 million rows or so)
and if you have a proper partitioning schema.
>
> Why would the select statement use the index, but not the join?
>
> There used to be an explain output anonymizer tool, if I can find that
> again, I'll send along the output. It has been a few years since I posted
> a question to this list so I don't think I have a bookmark for it any
> more.... Hmmm. I'll look around.
http://explain.depesz.com/
> Rick Otten <rottenwindfish@gmail.com> hat am 11. Dezember 2015 um 23:09 > geschrieben: > > The query performance hit for sequence scanning isn't all that terrible, > but I'd rather understand and get rid of the issue if I can, now, before I > run into it again in a situation where it is crippling. i think, you should try to understand how the planner works. a simple example: test=# create table foo (id serial primary key, val text); CREATE TABLE test=*# insert into foo (val) select repeat(md5(1::text), 5); INSERT 0 1 test=*# analyse foo; ANALYZE test=*# explain analyse select val from foo where id=1; QUERY PLAN ----------------------------------------------------------------------------------------------- Seq Scan on foo (cost=0.00..1.02 rows=1 width=164) (actual time=0.006..0.007 rows=1 loops=1) Filter: (id = 1) Rows Removed by Filter: 1 Planning time: 0.118 ms Execution time: 0.021 ms (5 rows) As you can see a seq-scan. It's a small table, costs ..1.02. Adding one row: test=*# insert into foo (val) select val from foo; INSERT 0 1 test=*# analyse foo; ANALYZE test=*# explain analyse select val from foo where id=1; QUERY PLAN ----------------------------------------------------------------------------------------------- Seq Scan on foo (cost=0.00..1.02 rows=1 width=164) (actual time=0.006..0.007 rows=1 loops=1) Filter: (id = 1) Rows Removed by Filter: 1 Planning time: 0.118 ms Execution time: 0.021 ms (5 rows) The same plan. Adding 2 rows: test=*# insert into foo (val) select val from foo; INSERT 0 2 test=*# analyse foo; ANALYZE test=*# explain analyse select val from foo where id=1; QUERY PLAN ----------------------------------------------------------------------------------------------- Seq Scan on foo (cost=0.00..1.05 rows=1 width=164) (actual time=0.220..0.277 rows=1 loops=1) Filter: (id = 1) Rows Removed by Filter: 3 Planning time: 0.149 ms Execution time: 0.453 ms (5 rows) The same plan. Adding more rows: test=*# insert into foo (val) select val from foo; INSERT 0 4 test=*# insert into foo (val) select val from foo; INSERT 0 8 test=*# insert into foo (val) select val from foo; INSERT 0 16 test=*# insert into foo (val) select val from foo; INSERT 0 32 test=*# insert into foo (val) select val from foo; INSERT 0 64 test=*# insert into foo (val) select val from foo; INSERT 0 128 test=*# insert into foo (val) select val from foo; INSERT 0 256 test=*# insert into foo (val) select val from foo; INSERT 0 512 test=*# insert into foo (val) select val from foo; INSERT 0 1024 test=*# insert into foo (val) select val from foo; INSERT 0 2048 test=*# insert into foo (val) select val from foo; INSERT 0 4096 test=*# analyse foo; ANALYZE test=*# explain analyse select val from foo where id=1; QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Index Scan using foo_pkey on foo (cost=0.28..8.30 rows=1 width=164) (actual time=0.007..0.008 rows=1 loops=1) Index Cond: (id = 1) Planning time: 0.120 ms Execution time: 0.024 ms (4 rows) We got a new plan! Index-Scan now. We are looking now in pg_class to see how many rows and pages we have: test=*# select relpages, reltuples from pg_class where relname = 'foo'; relpages | reltuples ----------+----------- 200 | 8192 (1 row) How large ist the Index? test=*# select relpages, reltuples from pg_class where relname = 'foo_pkey'; relpages | reltuples ----------+----------- 25 | 8192 (1 row) So, now it's cheaper to read the index and than do an index-scan on the heap to read one record (our where-condition is on the primary key, so only one row expected, one page have to read with random access) It's simple math! If you want to learn more you can find a lot about that via google: https://www.google.de/?gws_rd=ssl#q=explaining+explain
Why does it index scan when I use where, but not when I do a join?
On Fri, Dec 11, 2015 at 7:20 PM, Andreas Kretschmer <andreas@a-kretschmer.de> wrote:
> Rick Otten <rottenwindfish@gmail.com> hat am 11. Dezember 2015 um 23:09
> geschrieben:
>
> The query performance hit for sequence scanning isn't all that terrible,
> but I'd rather understand and get rid of the issue if I can, now, before I
> run into it again in a situation where it is crippling.
i think, you should try to understand how the planner works.
a simple example:
test=# create table foo (id serial primary key, val text);
CREATE TABLE
test=*# insert into foo (val) select repeat(md5(1::text), 5);
INSERT 0 1
test=*# analyse foo;
ANALYZE
test=*# explain analyse select val from foo where id=1;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Seq Scan on foo (cost=0.00..1.02 rows=1 width=164) (actual time=0.006..0.007
rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 1
Planning time: 0.118 ms
Execution time: 0.021 ms
(5 rows)
As you can see a seq-scan. It's a small table, costs ..1.02.
Adding one row:
test=*# insert into foo (val) select val from foo;
INSERT 0 1
test=*# analyse foo;
ANALYZE
test=*# explain analyse select val from foo where id=1;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Seq Scan on foo (cost=0.00..1.02 rows=1 width=164) (actual time=0.006..0.007
rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 1
Planning time: 0.118 ms
Execution time: 0.021 ms
(5 rows)
The same plan. Adding 2 rows:
test=*# insert into foo (val) select val from foo;
INSERT 0 2
test=*# analyse foo;
ANALYZE
test=*# explain analyse select val from foo where id=1;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Seq Scan on foo (cost=0.00..1.05 rows=1 width=164) (actual time=0.220..0.277
rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 3
Planning time: 0.149 ms
Execution time: 0.453 ms
(5 rows)
The same plan. Adding more rows:
test=*# insert into foo (val) select val from foo;
INSERT 0 4
test=*# insert into foo (val) select val from foo;
INSERT 0 8
test=*# insert into foo (val) select val from foo;
INSERT 0 16
test=*# insert into foo (val) select val from foo;
INSERT 0 32
test=*# insert into foo (val) select val from foo;
INSERT 0 64
test=*# insert into foo (val) select val from foo;
INSERT 0 128
test=*# insert into foo (val) select val from foo;
INSERT 0 256
test=*# insert into foo (val) select val from foo;
INSERT 0 512
test=*# insert into foo (val) select val from foo;
INSERT 0 1024
test=*# insert into foo (val) select val from foo;
INSERT 0 2048
test=*# insert into foo (val) select val from foo;
INSERT 0 4096
test=*# analyse foo;
ANALYZE
test=*# explain analyse select val from foo where id=1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Index Scan using foo_pkey on foo (cost=0.28..8.30 rows=1 width=164) (actual
time=0.007..0.008 rows=1 loops=1)
Index Cond: (id = 1)
Planning time: 0.120 ms
Execution time: 0.024 ms
(4 rows)
We got a new plan! Index-Scan now. We are looking now in pg_class to see how
many rows and pages we have:
test=*# select relpages, reltuples from pg_class where relname = 'foo';
relpages | reltuples
----------+-----------
200 | 8192
(1 row)
How large ist the Index?
test=*# select relpages, reltuples from pg_class where relname = 'foo_pkey';
relpages | reltuples
----------+-----------
25 | 8192
(1 row)
So, now it's cheaper to read the index and than do an index-scan on the heap to
read one record (our where-condition is on the primary key, so only one row
expected, one page have to read with random access)
It's simple math! If you want to learn more you can find a lot about that via
google:
https://www.google.de/?gws_rd=ssl#q=explaining+explain
> Rick Otten <rottenwindfish@gmail.com> hat am 12. Dezember 2015 um 01:55 > geschrieben: > > > Why does it index scan when I use where, but not when I do a join? difficult to say/guess because of anonymized names and not knowing the real query. This one? http://explain.depesz.com/s/1es ? All seqscans are fast, a seqscan isn't evil per se.