Re: Does auto-analyze work on dirty writes? (was: Re: [HACKERS] Slow count(*) again...) - Mailing list pgsql-performance

From Tom Lane
Subject Re: Does auto-analyze work on dirty writes? (was: Re: [HACKERS] Slow count(*) again...)
Date
Msg-id 22836.1296834064@sss.pgh.pa.us
In response to Does auto-analyze work on dirty writes? (was: Re: [HACKERS] Slow count(*) again...)  (Mark Mielke <mark@mark.mielke.cc>)
List pgsql-performance
Mark Mielke <mark@mark.mielke.cc> writes:
> My understanding is:

> 1) Background daemon wakes up and checks whether a number of changes
> have happened to the database, irrespective of transaction boundaries.

> 2) Background daemon analyzes a percentage of rows in the database for
> statistical data, irrespective of row visibility.

> 3) Analyze is important for both visible rows and invisible rows, as
> plan execution is impacted by invisible rows. As long as they are part
> of the table, they may impact the queries performed against the table.

> 4) It doesn't matter if the invisible rows are invisible because they
> are not yet committed, or because they are not yet vacuumed.

> Would somebody in the know please confirm the above understanding for my
> own peace of mind?

No.

1. Autovacuum fires when the stats collector's insert/update/delete
counts have reached appropriate thresholds.  Those counts are
accumulated from messages sent by backends at transaction commit or
rollback, so they take no account of what's been done by transactions
still in progress.
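
(Illustrative sketch, not actual PostgreSQL source: the point above can be
modelled roughly as follows. The threshold formula and the defaults of 50
and 0.1 follow the documented autovacuum_analyze_threshold and
autovacuum_analyze_scale_factor settings; the class, function, and
parameter names are hypothetical, and the real bookkeeping is more
involved.)

```python
# Rough model of point 1: change counts reach the stats collector only when
# a transaction ends, so work done by in-progress transactions is invisible
# to the auto-analyze trigger. All names here are hypothetical.

def analyze_threshold(reltuples, base_threshold=50, scale_factor=0.1):
    """Documented formula: base threshold + scale factor * table row count."""
    return base_threshold + scale_factor * reltuples


class TableStats:
    """What the stats collector knows about one table (simplified)."""

    def __init__(self, reltuples):
        self.reltuples = reltuples
        self.changes_since_analyze = 0

    def needs_analyze(self):
        return self.changes_since_analyze > analyze_threshold(self.reltuples)


class Backend:
    """A session that buffers its counts until the transaction ends."""

    def __init__(self, stats):
        self.stats = stats
        self.pending = 0

    def modify_rows(self, n):
        # Inserts, updates, and deletes are only counted locally for now.
        self.pending += n

    def end_transaction(self):
        # Sent at commit or rollback alike, per the description above.
        self.stats.changes_since_analyze += self.pending
        self.pending = 0


stats = TableStats(reltuples=1000)       # threshold = 50 + 0.1 * 1000 = 150
backend = Backend(stats)
backend.modify_rows(200)
before_end = stats.needs_analyze()       # transaction still in progress
backend.end_transaction()
after_end = stats.needs_analyze()        # counts have now been reported
print(before_end, after_end)
```

In this model the 200 modified rows do not count toward the threshold
until the transaction ends, whether by commit or by rollback.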

2. Only live rows are included in the stats computed by ANALYZE.
(IIRC it uses SnapshotNow to decide whether rows are live.)

Although the stats collector does track an estimate of the number of
dead rows for the benefit of autovacuum, this isn't used by planning.
Table bloat is accounted for only in terms of growth of the physical
size of the table in blocks.
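
(Illustrative sketch under stated assumptions: one way to picture the last
point is that the planner scales the row density recorded by the last
VACUUM or ANALYZE by the table's current size in blocks. The function name
and the handling of an empty relpages value are simplifications introduced
here, not the planner's actual code.)

```python
# Rough model of how physical growth feeds into the planner's row estimate.
# reltuples and relpages stand for the values stored in pg_class by the last
# VACUUM/ANALYZE; current_pages is the table's size in blocks right now.
# The dead-row estimate kept for autovacuum is deliberately not an input.

def estimated_rows(reltuples, relpages, current_pages):
    if relpages <= 0:
        # Simplification: the real planner falls back to other heuristics
        # when it has no density information.
        return 0.0
    density = reltuples / relpages        # live rows per block at last ANALYZE
    return density * current_pages        # scaled to the current physical size


# A table analyzed at 10000 rows in 100 blocks that has since grown to 150
# blocks is estimated at 15000 rows, whether the growth is live data or bloat.
grown_estimate = estimated_rows(10000, 100, 150)
print(grown_estimate)
```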

            regards, tom lane
