Re: Faster distinct query? - Mailing list pgsql-general

From Tom Lane
Subject Re: Faster distinct query?
Msg-id 2245342.1632354519@sss.pgh.pa.us
In response to Re: Faster distinct query?  (Michael Lewis <mlewis@entrata.com>)
List pgsql-general
Michael Lewis <mlewis@entrata.com> writes:
> On Wed, Sep 22, 2021 at 2:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The "index-only" scan is reported to do 86m heap fetches along the
>> way to returning 812m rows, so the data is apparently pretty dirty.

> Do you say that because you would expect many more than 10 tuples per page?

No, I say that because if the table were entirely all-visible, there
would have been *zero* heap fetches.  As it stands, it's reasonable
to suspect that a pretty sizable fraction of the index-only scan's
runtime went into random-access heap fetches made to verify
visibility of individual rows.
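
To make the mechanism concrete, here is a toy Python model of the
visibility check. It is a sketch, not PostgreSQL's actual code: the
function name, the (value, heap_page) pairs, and the set standing in
for the visibility map are all invented for illustration, and it
assumes every tuple it has to check turns out to be visible.

```python
def index_only_scan(index_entries, all_visible_pages):
    """Toy model of an index-only scan (hypothetical, for illustration).

    index_entries:     iterable of (value, heap_page) pairs in index order
    all_visible_pages: set of heap page numbers whose all-visible bit is set
    Returns (rows, heap_fetches).
    """
    rows = []
    heap_fetches = 0
    for value, heap_page in index_entries:
        if heap_page not in all_visible_pages:
            # The visibility map cannot vouch for this page, so the scan
            # has to visit the heap to check the tuple's visibility.
            heap_fetches += 1
        # Simplifying assumption: the tuple is always visible.
        rows.append(value)
    return rows, heap_fetches


entries = [(1, 0), (2, 0), (3, 1), (4, 2)]

# Page 1 is not all-visible, so one entry needs a heap visit.
print(index_only_scan(entries, {0, 2}))     # ([1, 2, 3, 4], 1)

# Every page all-visible: no heap fetches at all.
print(index_only_scan(entries, {0, 1, 2}))  # ([1, 2, 3, 4], 0)
```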

(You will, of course, never get to exactly zero heap fetches in an
index-only scan unless the table data is quite static.  But one dirty
page out of every ten suggests there were a lot of recent changes.
A VACUUM to clean that up might be well worthwhile.)
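
As a quick check on the "one in ten" figure, the ratio can be worked
out from the two rounded numbers quoted from the plan. The function and
variable names below are illustrative only. Note that the ratio is per
returned row, so reading it as a fraction of pages assumes rows are
spread roughly evenly across pages.

```python
def heap_fetch_fraction(heap_fetches, rows_returned):
    """Fraction of returned rows that needed a heap visit."""
    return heap_fetches / rows_returned


# Rounded figures quoted from the plan: 86m heap fetches, 812m rows.
fraction = heap_fetch_fraction(86_000_000, 812_000_000)

print(f"{fraction:.1%}")                   # 10.6%
print(f"about 1 row in {1 / fraction:.1f}")  # about 1 row in 9.4
```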

            regards, tom lane


