Re: Loose Index Scans by Planner? - Mailing list pgsql-performance

From Kevin Grittner
Subject Re: Loose Index Scans by Planner?
Date
Msg-id 503792790200002500049B56@gw.wicourts.gov
Whole thread Raw
In response to Loose Index Scans by Planner?  (Shaun Thomas <sthomas@optionshouse.com>)
Responses Re: Loose Index Scans by Planner?
List pgsql-performance
Shaun Thomas <sthomas@optionshouse.com> wrote:

> So, we know we have a way of doing a loose index scan with CTEs:
>
> http://wiki.postgresql.org/wiki/Loose_indexscan

I tried this on a table in production with 23 million rows for a
column with 45 distinct values which is the high-order column of a
four-column index.  This ran in 445 ms first time and 2 ms on the
second and subsequent tries.  The equivalent SELECT DISTINCT ran in
30 seconds first time, and got down to 11.5 seconds after a few
runs.  So roughly two orders of magnitude faster with a cold cache
and three orders of magnitude faster with a warm cache.

That sure would be a nice optimization to have in the planner.

> But that got me wondering. The planner knows from pg_stats that
> col1 could have low cardinality. If that's the case, and a WHERE
> clause uses a two column index, and col2 is specified, why can't
> it walk each individual bucket in the two-column index, and use
> col2? So I forced such a beast with a CTE:
>
> WITH RECURSIVE t AS (
>    SELECT min(col1) AS col1
>      FROM tablename
>    UNION ALL
>    SELECT (SELECT min(col1)
>              FROM tablename
>             WHERE col1 > t.col1)
>      FROM t
>     WHERE t.col1 IS NOT NULL
> )
> SELECT p.*
>    FROM t
>    JOIN tablename p USING (col1)
>   where p.col2 = 12345
>
> I ask, because while the long-term fix would be to re-order the
> index to (col2, col1), this seems like a situation the planner
> could easily detect and compensate for. In our particular example,
> execution time went from 160ms to 2ms with the CTE rewrite.

Well, that'd be the icing on the cake.  I'd be overjoyed to get the
cake.  :-)

-Kevin


pgsql-performance by date:

Previous
From: Merlin Moncure
Date:
Subject: Re: NOTIFY performance
Next
From: Artur Zając
Date:
Subject: Re: NOTIFY performance