Thread: Temp table's effect on performance
I'd like to understand better why manually using a temp table can improve performance so much. I had one complicated query that performed well. I replaced a table in it with a reference to a view, which was really just the table with an inner join, and performance worsened by 2000x. Literally. I then modified it to first manually SELECT the view into a temp table, and performance returned to close to the original query. The temp table had the same indexes as the original one. How is that? What does the temp table do that the planner can't do itself? Don't planner uses temp structures too? In other words: Since my query is 100% identical algebraicly to not using a temp table, why is it so much faster? Why can't the planner work in the exact same order?
Hello 2013/1/18 Robert James <srobertjames@gmail.com>: > I'd like to understand better why manually using a temp table can > improve performance so much. one possible effect - there should be different statistic did you look on EXPLAIN ANALYZE? Regards Pavel Stehule > > I had one complicated query that performed well. I replaced a table > in it with a reference to a view, which was really just the table with > an inner join, and performance worsened by 2000x. Literally. > > I then modified it to first manually SELECT the view into a temp > table, and performance returned to close to the original query. The > temp table had the same indexes as the original one. > > How is that? What does the temp table do that the planner can't do > itself? Don't planner uses temp structures too? > > In other words: Since my query is 100% identical algebraicly to not > using a temp table, why is it so much faster? Why can't the planner > work in the exact same order? > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
On Fri, Jan 18, 2013 at 9:29 AM, Robert James <srobertjames@gmail.com> wrote: > I'd like to understand better why manually using a temp table can > improve performance so much. > > I had one complicated query that performed well. I replaced a table > in it with a reference to a view, which was really just the table with > an inner join, and performance worsened by 2000x. Literally. I don't see how this can be answered at a general level. Something very specific happened. Without knowing what that specific thing is, what can we say about it? > I then modified it to first manually SELECT the view into a temp > table, and performance returned to close to the original query. The > temp table had the same indexes as the original one. > > How is that? What does the temp table do that the planner can't do > itself? Don't planner uses temp structures too? Most likely the temp table just got lucky. If you re-wrote hundreds of other queries to use a temp table, would all of them systematically get faster? If so, that would certainly be a thing to wonder about. But a single example, with no details....well, what can be said about that? When you choose how to make the temp table, you are probably imposing some intuitive knowledge you have (but the planner does not) about the nature of your data. > In other words: Since my query is 100% identical algebraicly to not > using a temp table, why is it so much faster? Why can't the planner > work in the exact same order? Unless you are doing ANALYZE on your temp table, then the planner has to make some guesses about the size and selectivity and correlations involved. Those guesses probably just got lucky at being better in this particular case than the real statistics. Without having any details, it is hard to say why. It is easy to spot the one time the planner gets it wrong, while the 9,999 times it got it right go unnoticed. Cheers, Jeff
Jeff Janes <jeff.janes@gmail.com> writes: > On Fri, Jan 18, 2013 at 9:29 AM, Robert James <srobertjames@gmail.com> wrote: >> In other words: Since my query is 100% identical algebraicly to not >> using a temp table, why is it so much faster? Why can't the planner >> work in the exact same order? > Unless you are doing ANALYZE on your temp table, then the planner has > to make some guesses about the size and selectivity and correlations > involved. Those guesses probably just got lucky at being better in > this particular case than the real statistics. Whether you've done ANALYZE or not, the planner can see the physical size of the temp table, which allows it to make a rowcount estimate based on a guess as to the average row width (which it can make, in a pretty squishy way, given only the column datatypes). Now an estimate gotten that way can be pretty far off, but it might still be much better than what we can come up with for a sub-select (view). Of course if you *have* done an ANALYZE on the temp table then the planner is far better informed than when considering a view. Whether that's the explanation is of course impossible to know from the given (lack of) information. regards, tom lane
On 1/18/13, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Jeff Janes <jeff.janes@gmail.com> writes: >> On Fri, Jan 18, 2013 at 9:29 AM, Robert James <srobertjames@gmail.com> >> wrote: >>> In other words: Since my query is 100% identical algebraicly to not >>> using a temp table, why is it so much faster? Why can't the planner >>> work in the exact same order? > >> Unless you are doing ANALYZE on your temp table, then the planner has >> to make some guesses about the size and selectivity and correlations >> involved. Those guesses probably just got lucky at being better in >> this particular case than the real statistics. > > Whether you've done ANALYZE or not, the planner can see the physical > size of the temp table, which allows it to make a rowcount estimate > based on a guess as to the average row width (which it can make, in > a pretty squishy way, given only the column datatypes). Now an > estimate gotten that way can be pretty far off, but it might still > be much better than what we can come up with for a sub-select (view). > Of course if you *have* done an ANALYZE on the temp table then the > planner is far better informed than when considering a view. > > Whether that's the explanation is of course impossible to know from > the given (lack of) information. > > regards, tom lane > What information would be helpful to post?
Robert James <srobertjames@gmail.com> writes: > On 1/18/13, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Whether that's the explanation is of course impossible to know from >> the given (lack of) information. > What information would be helpful to post? Both forms of the query, EXPLAIN ANALYZE output for both, along with the underlying table definitions. regards, tom lane
Robert James wrote: > What information would be helpful to post? That question comes up so often we have a page to help answer it. :-) http://wiki.postgresql.org/wiki/SlowQueryQuestions -Kevin