Re: Performance regression with PostgreSQL 11 and partitioning - Mailing list pgsql-hackers
From: David Rowley
Subject: Re: Performance regression with PostgreSQL 11 and partitioning
Date:
Msg-id: CAKJS1f8qkcwr2DULd+04rBmubHkKzp4abuFykgoPUsVM-4-38g@mail.gmail.com
In response to: Re: Performance regression with PostgreSQL 11 and partitioning (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: Performance regression with PostgreSQL 11 and partitioning
           Re: Performance regression with PostgreSQL 11 and partitioning
List: pgsql-hackers
On 12 June 2018 at 01:49, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Jun 8, 2018 at 3:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> That being said, I don't mind a bit if you want to look for further
>>> speedups here, but if you do, keep in mind that a lot of queries won't
>>> even use partition-wise join, so all of the arrays will be of length
>>> 1.  Even when partition-wise join is used, it is quite likely not to
>>> be used for every table in the query, in which case it will still be
>>> of length 1 in some cases.  So pessimizing nappinfos = 1 even slightly
>>> is probably a regression overall.
>>
>> TBH, I am way more concerned about the pessimization introduced for
>> every pre-existing usage of these functions by putting search loops
>> into them at all.  I'd like very much to revert that.  If we can
>> replace those with something along the line of root->index_array[varno]
>> we'll be better off across the board.
>
> I think David's analysis is correct -- this doesn't quite work.  We're
> trying to identify whether a given varno is one of the ones to be
> translated, and it's hard to come up with a faster solution than
> iterating over a (very short) array of those values.  One thing we
> could do is have two versions of each function, or else an optimized
> path for the very common nappinfos=1 case.  I'm just not sure it would
> be worthwhile.  Traversing a short array isn't free, but it's pretty
> cheap.

So this is the current situation as far as I see it:

We could add new versions of adjust_appendrel_attrs() and
adjust_appendrel_attrs_mutator() that accept a Relids for the child
rels rather than an array of AppendRelInfos.  However, that's quite a
lot of code duplication.

We could perhaps cut down on the duplication by storing a callback
function inside adjust_appendrel_attrs_context which searches for the
correct AppendRelInfo to use.  However, that does not seem in line
with simplifying the code.

We've not yet heard back from Tom with more details about his
root->index_array[varno] idea.  I can't quite see how it would work,
and for the moment I assume Tom misunderstood: it's the parent's varno
that's known here, not the child's.

I've attached a patch which cleans up my earlier version and moves the
setup of the append_rel_array into its own function instead of
sneaking code into setup_simple_rel_arrays().  I've also updated the
comment above find_childrel_appendrelinfo(), which is now an unused
function.

I tried the following test case:

CREATE TABLE partbench (date TIMESTAMP NOT NULL, i1 INT NOT NULL,
  i2 INT NOT NULL, i3 INT NOT NULL, i4 INT NOT NULL, i5 INT NOT NULL)
  PARTITION BY RANGE (date);

\o /dev/null
select 'CREATE TABLE partbench' || x::text || ' PARTITION OF partbench FOR VALUES FROM (''' ||
       '2017-03-06'::date + (x::text || ' hours')::interval || ''') TO (''' ||
       '2017-03-06'::date + ((x+1)::text || ' hours')::interval || ''');'
from generate_Series(0,9999) x;
\gexec
\o

SELECT * FROM partbench WHERE date = '2018-04-23 00:00:00';

Patched
Time: 190.989 ms
Time: 187.313 ms
Time: 190.239 ms

Unpatched
Time: 775.771 ms
Time: 781.631 ms
Time: 762.777 ms

Is this good enough for v11?

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
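
For readers without the PostgreSQL source at hand, the sketch below
illustrates the lookup pattern the thread is weighing: the mutator scans a
short array of AppendRelInfo-like entries to decide whether a given varno is
one of the parent rels being translated.  This is a simplified, hypothetical
stand-in, not the actual adjust_appendrel_attrs_mutator() code; the struct
and find_child_for_varno() are invented for illustration, and only the
parent_relid/child_relid field names echo the real AppendRelInfo.

/*
 * Illustrative sketch only -- not PostgreSQL source.  It shows the
 * scan-a-short-array lookup discussed in the thread.
 */
#include <stdio.h>

typedef unsigned int Index;

typedef struct AppendRelInfoSketch
{
    Index       parent_relid;   /* varno of the parent rel */
    Index       child_relid;    /* varno of the child rel it maps to */
} AppendRelInfoSketch;

/*
 * Return the child relid for 'varno' if it is one of the parents being
 * translated, or 0 if it is not.  The array is usually of length 1, so
 * this is typically a single comparison per Var visited.
 */
static Index
find_child_for_varno(const AppendRelInfoSketch *appinfos, int nappinfos,
                     Index varno)
{
    int         i;

    for (i = 0; i < nappinfos; i++)
    {
        if (appinfos[i].parent_relid == varno)
            return appinfos[i].child_relid;
    }
    return 0;
}

int
main(void)
{
    /* the common nappinfos = 1 case: one parent (relid 1) -> child (relid 4) */
    AppendRelInfoSketch appinfos[] = {{1, 4}};

    printf("varno 1 -> %u\n", find_child_for_varno(appinfos, 1, 1));    /* 4 */
    printf("varno 2 -> %u\n", find_child_for_varno(appinfos, 1, 2));    /* 0 */
    return 0;
}

As Robert notes in the quoted text, nappinfos is almost always 1, so the loop
amounts to a single comparison per Var; that is the cost being weighed against
the extra complexity of a special-cased or directly indexed lookup.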