Re: PostgreSQL slow after VACUUM - Mailing list pgsql-general
From: Pierre-Frédéric Caillaud
Subject: Re: PostgreSQL slow after VACUUM
Msg-id: opsh2o3mtacq72hf@musicbox
In response to: Re: PostgreSQL slow after VACUUM (Arjen van der Meijden <acmmailing@vulcanus.its.tudelft.nl>)
List: pgsql-general
It seems this issue has been mentioned several times lately... I'd propose something to avoid it:

* When TRUNCATE'ing a table:
- the row count is reset to 0 (of course!)
- however, the column stats are kept, on the basis that the data which will later be inserted into the table will most likely have at least some correlation to the data that was in it before. Of course, VACUUM ANALYZE will update the statistics with real ones when it is run. This would make TRUNCATE behave more like DELETE FROM...

* When INSERT'ing into a table:
The OP's problem was that the query was planned as if the table were almost empty (hence the seq scan). It is in fact empty at plan time, but many rows are inserted. Of course, the query can't be replanned in the middle of its execution (although that would be a nice feature to add to the wishlist for PostgreSQL version 2020)...

When planning an insert, the planner currently uses the estimated number of rows in the target table, taken from the statistics. However, at that time, the planner already has an estimate of how many rows it's going to insert:
- INSERT ... SELECT, or CREATE TABLE ... AS SELECT: the SELECT from which the rows will come is planned, so there should be an estimated number of rows readily available.
- COPY FROM...: the number of rows could be estimated by peeking at the first N rows in the file (not on COPY FROM stdin, though), or an additional parameter to COPY could be given.

Then the INSERT can be planned as if the target table contained (estimated number of rows in the target table) + (estimated number of rows to be inserted), and it will choose a good plan for populating empty tables...

What do you think?

> The problem is, that while doing the vacuum full ANALYZE the table was
> empty. It therefore gathered statistics of a situation which isn't there
> anymore when you fill up the table. In an empty or small table, it is
> normal to do sequential scans. Which you most of the time don't want in
> a large table.
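To make the failure mode concrete, here is a minimal sketch of the scenario being discussed; the table, column, and index names are made up for illustration:

```sql
-- Hypothetical illustration (table/column names are invented).
TRUNCATE orders;
VACUUM FULL ANALYZE orders;      -- statistics now describe an empty table

INSERT INTO orders SELECT * FROM orders_staging;   -- many rows arrive

-- The planner still believes "orders" is nearly empty,
-- so it may prefer a sequential scan over the index:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

ANALYZE orders;                  -- refresh the statistics

-- With accurate row counts, an index scan becomes attractive again:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
```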
> My suggestion is to VACUUM (FULL) the table after you've deleted the
> data. Then fill up the table and do a ANALYZE when you're done filling
> it.
>
> That way, the analysis of the data will be much more accurate. My guess
> is, it'll use indexes much sooner and be much faster. Have a look at the
> output of EXPLAIN ANALYZE yourstatement; before doing ANALYZE and after.
>
> Best regards,
>
> Arjen
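Arjen's suggested workflow might look like the following session; the object names and file path are invented for the example:

```sql
-- Hypothetical session following the quoted advice (names are made up).
DELETE FROM orders;
VACUUM FULL orders;                  -- reclaim space while the table is empty

COPY orders FROM '/tmp/orders.csv';  -- refill the table

ANALYZE orders;                      -- gather statistics on the *full* table

-- Compare the plan before and after the ANALYZE:
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
```

The key point is that ANALYZE runs after the data is loaded, so the statistics describe the table the queries will actually see.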