Thread: autovacuum and temp tables support
Hi there, our client complained about slow query, which involves temporary tables. Analyzing them manually solved the problem. I don't remember arguments against temporary tables support by autovacuum. I'd appreciate any pointers. Also, it's worth to add autovacuum_enable_temp_tables variable to control autovacuum behaviour ? Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Oleg Bartunov wrote: > our client complained about slow query, which involves temporary tables. > Analyzing them manually solved the problem. I don't remember > arguments against temporary tables support by autovacuum. I'd > appreciate any > pointers. Autovacuum can't process temp tables; they could reside in a backend's private temp buffers (local memory, not shared). -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Thu, Apr 8, 2010 at 2:53 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Oleg Bartunov wrote: > >> our client complained about slow query, which involves temporary tables. >> Analyzing them manually solved the problem. I don't remember >> arguments against temporary tables support by autovacuum. I'd >> appreciate any >> pointers. > > Autovacuum can't process temp tables; they could reside in a backend's > private temp buffers (local memory, not shared). On general thought I've had is that it would be nice if the first attempt to SELECT against a table with no statistics would trigger an automatic ANALYZE by the backend on which the query was executed. It's pretty common to populate a table using INSERT, or CTAS, or COPY and then try to immediately run a query against it, and I've often found that it's necessary to insert manual analyze statements in there to get decent query plans. ...Robert
Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Apr 8, 2010 at 2:53 PM, Alvaro Herrera >> Autovacuum can't process temp tables; they could reside in a >> backend's private temp buffers (local memory, not shared). > > it would be nice if the first attempt to SELECT against a table > with no statistics would trigger an automatic ANALYZE by the > backend on which the query was executed. +1 as an RFE -Kevin
On Thu, 8 Apr 2010, Robert Haas wrote: > On Thu, Apr 8, 2010 at 2:53 PM, Alvaro Herrera > <alvherre@commandprompt.com> wrote: >> Oleg Bartunov wrote: >> >>> our client complained about slow query, which involves temporary tables. >>> Analyzing them manually solved the problem. I don't remember >>> arguments against temporary tables support by autovacuum. I'd >>> appreciate any >>> pointers. >> >> Autovacuum can't process temp tables; they could reside in a backend's >> private temp buffers (local memory, not shared). > > On general thought I've had is that it would be nice if the first > attempt to SELECT against a table with no statistics would trigger an > automatic ANALYZE by the backend on which the query was executed. > It's pretty common to populate a table using INSERT, or CTAS, or COPY > and then try to immediately run a query against it, and I've often > found that it's necessary to insert manual analyze statements in there > to get decent query plans. Oracle does this. So, is't worth to add support (configurable, like Oracle's optimizer_dynamic_sampling) ? Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
On Thu, Apr 8, 2010 at 3:22 PM, Oleg Bartunov <oleg@sai.msu.su> wrote: >> On general thought I've had is that it would be nice if the first >> attempt to SELECT against a table with no statistics would trigger an >> automatic ANALYZE by the backend on which the query was executed. >> It's pretty common to populate a table using INSERT, or CTAS, or COPY >> and then try to immediately run a query against it, and I've often >> found that it's necessary to insert manual analyze statements in there >> to get decent query plans. > > Oracle does this. So, is't worth to add support (configurable, like > Oracle's optimizer_dynamic_sampling) ? Well, dynamic sampling is considerably more complicated than what I proposed, which is just to force an ordinary ANALYZE before the first query against the table. It would be a very powerful feature if we could use it to ameliorate, for example, the gross statistical errors that sometimes occur when multiple, correlated filter conditions are applied to the same base table; but I don't think it's in the direct path of solving the present complaint. ...Robert
Robert Haas wrote: > On Thu, Apr 8, 2010 at 3:22 PM, Oleg Bartunov <oleg@sai.msu.su> wrote: > >> On general thought I've had is that it would be nice if the first > >> attempt to SELECT against a table with no statistics would trigger an > >> automatic ANALYZE by the backend on which the query was executed. > >> It's pretty common to populate a table using INSERT, or CTAS, or COPY > >> and then try to immediately run a query against it, and I've often > >> found that it's necessary to insert manual analyze statements in there > >> to get decent query plans. > > > > Oracle does this. So, is't worth to add support (configurable, like > > Oracle's optimizer_dynamic_sampling) ? > > Well, dynamic sampling is considerably more complicated than what I > proposed, which is just to force an ordinary ANALYZE before the first > query against the table. It would be a very powerful feature if we > could use it to ameliorate, for example, the gross statistical errors > that sometimes occur when multiple, correlated filter conditions are > applied to the same base table; but I don't think it's in the direct > path of solving the present complaint. I have added this TODO: Consider analyzing temporary tables when they are first used in a query Autovacuum cannot analyze or vacuum temporary tables. * http://archives.postgresql.org/pgsql-hackers/2010-04/msg00416.php I have also applied the following documentation patch to document this behavior. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com Index: doc/src/sgml/maintenance.sgml =================================================================== RCS file: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v retrieving revision 1.102 diff -c -c -r1.102 maintenance.sgml *** doc/src/sgml/maintenance.sgml 3 Apr 2010 07:22:55 -0000 1.102 --- doc/src/sgml/maintenance.sgml 16 Apr 2010 02:21:23 -0000 *************** *** 643,648 **** --- 643,654 ---- </para> <para> + Temporary tables cannot be accessed by autovacuum. Therefore, + appropriate vacuum and analyze operations should be performed via + session SQL commands. + </para> + + <para> The default thresholds and scale factors are taken from <filename>postgresql.conf</filename>, but it is possible to override them on a table-by-table basis; see Index: doc/src/sgml/ref/create_table.sgml =================================================================== RCS file: /cvsroot/pgsql/doc/src/sgml/ref/create_table.sgml,v retrieving revision 1.125 diff -c -c -r1.125 create_table.sgml *** doc/src/sgml/ref/create_table.sgml 3 Apr 2010 07:23:00 -0000 1.125 --- doc/src/sgml/ref/create_table.sgml 16 Apr 2010 02:21:27 -0000 *************** *** 145,150 **** --- 145,159 ---- </para> <para> + The <link linkend="autovacuum">autovacuum daemon</link> cannot + access and therefore cannot vacuum or analyze temporary tables. + For this reason, appropriate vacuum and analyze operations should be + performed via session SQL commands. For example, if a temporary + table is going to be used in complex queries, it is wise to run + <command>ANALYZE</> on the temporary table after it is populated. + </para> + + <para> Optionally, <literal>GLOBAL</literal> or <literal>LOCAL</literal> can be written before <literal>TEMPORARY</> or <literal>TEMP</>. This makes no difference in <productname>PostgreSQL</>, but see