
From: Tom Lane
Subject: Re: Statistics Injection
Msg-id: 30138.1467470148@sss.pgh.pa.us
In response to: Statistics Injection (Victor Giannakouris - Salalidis <victorasgs@gmail.com>)
List: pgsql-hackers

Victor Giannakouris - Salalidis <victorasgs@gmail.com> writes:
> For some research purposes, I am trying to modify the existing statistics
> of some tables in the catalogs in order to change the execution plan,
> experiment with the EXPLAIN call, etc.

> Concretely, what I'd like to do is to create a "fake" table with a schema
> of my choice (that's the easy part) and then modify its statistics
> (particularly, the number of tuples and the number of pages).

> Firstly, I create an empty table (CREATE TABLE newTable(....)) and then I
> update the pg_class table as well (UPDATE pg_class SET relpages = #pages
> WHERE relname='newTable').

> The problem is that, even if I set reltuples and relpages to values of my
> choice, when I run EXPLAIN for a query in which 'newTable' is involved
> (e.g. EXPLAIN SELECT * FROM newTable), I get the same cost and row
> estimates.

You can't really do it like that, because the planner always looks at
the true relation size (RelationGetNumberOfBlocks()).  It uses
reltuples/relpages as an estimate of tuple density, not as hard numbers.
The reason for this is to cope with any table growth that may have
occurred since the last VACUUM/ANALYZE.
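
In essence (see estimate_rel_size() in plancat.c), the row estimate is

    tuples_est = RelationGetNumberOfBlocks(rel) * (reltuples / relpages)

so with, say, relpages = 1000 and reltuples = 100000 in pg_class (a
density of 100 tuples/page) but a physically empty heap, the current
block count is zero and the estimate stays at zero no matter what you
write into pg_class.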

You could modify the code in plancat.c to change that, or you could
plug into the get_relation_info_hook to tweak the constructed
RelOptInfo before anything is done with it.
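
An untested sketch of the latter approach follows; the module name
(fake_stats), the target relation name, and the numbers are all
stand-ins, but the hook signature is the one declared in
optimizer/plancat.h:

    /* fake_stats.c -- sketch: override pages/tuples for one relation */
    #include "postgres.h"
    #include "fmgr.h"

    #include "optimizer/plancat.h"  /* get_relation_info_hook */
    #include "utils/lsyscache.h"    /* get_rel_name */

    PG_MODULE_MAGIC;

    static get_relation_info_hook_type prev_get_relation_info_hook = NULL;

    static void
    fake_stats_relation_info(PlannerInfo *root, Oid relationObjectId,
                             bool inhparent, RelOptInfo *rel)
    {
        char   *relname;

        /* chain to any previously installed hook */
        if (prev_get_relation_info_hook)
            prev_get_relation_info_hook(root, relationObjectId,
                                        inhparent, rel);

        /* "newtable" and the numbers below are placeholders */
        relname = get_rel_name(relationObjectId);
        if (relname && strcmp(relname, "newtable") == 0)
        {
            rel->pages = 1000;    /* pretend the heap has 1000 blocks */
            rel->tuples = 100000; /* ... holding 100000 rows */
        }
    }

    void
    _PG_init(void)
    {
        prev_get_relation_info_hook = get_relation_info_hook;
        get_relation_info_hook = fake_stats_relation_info;
    }

Build it as a regular loadable module and LOAD 'fake_stats' (or list it
in shared_preload_libraries); EXPLAIN should then cost scans of that
table using the faked numbers rather than the true relation size.
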
        regards, tom lane


