Re: Faster distinct query? - Mailing list pgsql-general

From	Tom Lane
Subject	Re: Faster distinct query?
Date	September 22, 2021 23:48:39
Msg-id	2245342.1632354519@sss.pgh.pa.us
In response to	Re: Faster distinct query? (Michael Lewis <mlewis@entrata.com>)
List	pgsql-general

Michael Lewis <mlewis@entrata.com> writes:
> On Wed, Sep 22, 2021 at 2:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The "index-only" scan is reported to do 86m heap fetches along the
>> way to returning 812m rows, so the data is apparently pretty dirty.

> Do you say that because you would expect many more than 10 tuples per page?

No, I say that because if the table were entirely all-visible, there
would have been *zero* heap fetches.  As it stands, it's reasonable
to suspect that a pretty sizable fraction of the index-only scan's
runtime went into random-access heap fetches made to verify
visibility of individual rows.

(You will, of course, never get to exactly zero heap fetches in an
index-only scan unless the table data is quite static.  But one dirty page
out of every ten seems like there were a lot of recent changes.
A VACUUM to clean that up might be well worthwhile.)

            regards, tom lane
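
The "one out of every ten" estimate in the message above can be checked with a line of arithmetic on the two figures quoted from the plan (86m heap fetches, 812m rows returned). The sketch below is only an illustration; the function name is hypothetical, and it treats the rounded figures from the thread as exact counts. Strictly speaking it gives a per-row ratio, which the message uses as a rough stand-in for the share of pages not marked all-visible.

```python
def heap_fetch_fraction(heap_fetches: int, rows_returned: int) -> float:
    """Fraction of returned rows that needed a heap visit to verify visibility.

    Hypothetical helper for illustration; inputs are the "Heap Fetches" and
    row counts reported for an index-only scan node.
    """
    if rows_returned <= 0:
        raise ValueError("rows_returned must be positive")
    return heap_fetches / rows_returned


# Rounded figures quoted in the thread: 86m heap fetches, 812m rows.
fraction = heap_fetch_fraction(86_000_000, 812_000_000)
print(f"{fraction:.1%}")  # prints 10.6%
```

On a fully all-visible table the same ratio would be zero, which is the baseline the message compares against.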
