Re: Trying to understand Stats/Query planner issue - Mailing list pgsql-performance

From Tom Lane
Subject Re: Trying to understand Stats/Query planner issue
Date
Msg-id 6469.1321201042@sss.pgh.pa.us
Whole thread Raw
In response to Trying to understand Stats/Query planner issue  ("Strange, John W" <john.w.strange@jpmorgan.com>)
List pgsql-performance
"Strange, John W" <john.w.strange@jpmorgan.com> writes:
> I have a question on how the analyzer works in this type of scenario.
> We calculate some results and COPY INTO some partitioned tables, which we use some selects to aggregate the data back
outinto reports.  Everyone once in a while the aggregation step picks a bad plan due to stats on the tables that were
justpopulated.   Updating the stats and rerunning the query seems to solve the problem, this only happens if we enable
nestedloop query plans. 

Well, even if auto-analyze launches instantly after you commit the
insertions (which it won't), it's going to take time to scan the table
and then commit the updates to pg_statistic.  So there is always going
to be some window where queries will get planned with obsolete
information.  If you're inserting enough data to materially change the
statistics of a table, and you need to query that table right away,
doing a manual ANALYZE rather than waiting for auto-analyze is
recommended.

> The other option is just to analyze each table involved in the query after the insert, but that seems a bit
counterproductive.

Why would you think that?  This type of scenario is exactly why ANALYZE
isn't deprecated as a user command.

            regards, tom lane

pgsql-performance by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Heavy contgnous load
Next
From: Venkat Balaji
Date:
Subject: Re: : bg_writer overloaded ?