Re: Queue table that quickly grows causes query planner to choose poor plan - Mailing list pgsql-performance

From Tom Lane
Subject Re: Queue table that quickly grows causes query planner to choose poor plan
Date
Msg-id 24443.1530124053@sss.pgh.pa.us
Whole thread Raw
In response to Queue table that quickly grows causes query planner to choose poor plan  (David Wheeler <dwheeler@dgitsystems.com>)
Responses Re: Queue table that quickly grows causes query planner to choose poor plan
List pgsql-performance
David Wheeler <dwheeler@dgitsystems.com> writes:
> I'm having performance trouble with a particular set of queries. It goes a bit like this

> 1) queue table is initially empty, and very narrow (1 bigint column)
> 2) we insert ~30 million rows into queue table
> 3) we do a join with queue table to delete from another table (delete from a using queue where a.id<http://a.id> =
queue.id<http://queue.id>),but postgres stats say that queue table is empty, so it uses a nested loop over all 30
millionrows, taking forever 

Although there's no way to have any useful pg_statistic stats if you won't
do an ANALYZE, the planner nonetheless can see the table's current
physical size, and what it normally does is to multiply the last-reported
tuple density (reltuples/relpages) by the current size.  So if you're
getting an "empty table" estimate anyway, I have to suppose that the
table's state involves reltuples = 0 and relpages > 0.  That's not a
good place to be in; it constrains the planner to believe that the table
is in fact devoid of tuples, because that's what the last ANALYZE saw.

Now, the initial state for a freshly-created or freshly-truncated table
is *not* that.  It is reltuples = 0 and relpages = 0, representing an
undefined tuple density.  Given that, the planner will make some guess
about average tuple size --- which is likely to be a very good guess,
for a table with only fixed-width columns --- and then compute a rowcount
estimate using that plus the observed physical size.

So I think your problem comes from oscillating between really-empty
and not-at-all-empty, and not using an idiomatic way of going back
to the empty state.  Have you tried using TRUNCATE instead of DELETE?

> This queue table is empty 99% of the time, and the query in 3 runs immediately after step 2. Is there any likelyhood
thattweaking the autoanalyze params would help in this case? I don't want to explicitly analyze the table between steps
2and three either as there are other patterns of use where for example 0 rows are inserted in step 2 and this is
expectedto run very very quickly. Do I have any other options? 

I am not following your aversion to sticking an ANALYZE in there,
either.  It's not like inserting 30 million rows would be free.

            regards, tom lane


pgsql-performance by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Queue table that quickly grows causes query planner to choosepoor plan
Next
From: David Wheeler
Date:
Subject: Re: Queue table that quickly grows causes query planner to choose poor plan