Re: VACUUM and ANALYZE Follow-Up - Mailing list pgsql-general

From Pierre-Frédéric Caillaud
Subject Re: VACUUM and ANALYZE Follow-Up
Date
Msg-id opsh86mqz3cq72hf@musicbox
Whole thread Raw
In response to Re: VACUUM and ANALYZE Follow-Up  ("Mark Dexter" <MDEXTER@dexterchaney.com>)
List pgsql-general
> 4. Isn't ANALYZE on a totally empty table really a special case?  The
> presumption should be that the table will not remain empty.  To optimize
> the performance assuming that there will be zero (or close to zero) rows
> seems somewhat pointless.  However, there are valid reasons why a table
> might be empty at the moment in time when the ANALYZE is run.  (In our
> case, we use "work" tables that get cleared at the end of an application
> process.)  And, as mentioned above, it is easier to VACUUM ANALYZE an
> entire database than it is to list tables individually.

    Well, for instance I have a few tables which contain just a few rows, for
instance a list of states in which an object in another table may be, or a
list of tax rates... for these kinds of tables with like, 10 rows, or just
a few pages, you don't want index scans, so VACUUM and ANALYZE are doing
their job.

    If you were going to insert 5 rows in an empty table, you would also want
this behaviour. The problems start when you make a large INSERT in an
empty or almost empty table.

    So, how to solve your problem without slowing the other requests (ie.
selecting and inserting a few rows into a very small table) ?

    Nobody responded to my suggestion that the planner take into account the
number of rows to be inserted into the table in its plan... so I'll repost
it :
    - INSERT ... SELECT :
    Planner has an estimate of how many rows the SELECT will yield. So it
could plan the queries involving SELECTs on the target table (like, UNIQUE
checks et al) using the number of rows in the table + number of rows to be
inserted. This solves your problem.

    Problems with this approach :
    - This only gives a number of rows, not more precise statistics
    It's the only information available so why not use it ? And it's enough
to solve the OP's problem.

    - Can get recursive
    What if there is a self-join ? I guess, just fall back to the previous
behaviour...

    - Does not work for COPY
    argument : COPY should act like it's going to insert many rows. Most of
the time, that's how it's used.

    - When the estimated number of rows to insert is imprecise
    (for instance a SELECT with UNION's or DISTINCT or a huge join), the
outcome would be incertain.

    What do you think ?








pgsql-general by date:

Previous
From: Mage
Date:
Subject: Re: VACUUM and ANALYZE Follow-Up
Next
From: "Joshua D. Drake"
Date:
Subject: Re: VACUUM and ANALYZE Follow-Up