Re: query plan question - Mailing list pgsql-performance

From David Parker
Subject Re: query plan question
Date
Msg-id 07FDEE0ED7455A48AC42AC2070EDFF7C26BAC3@corpsrv2.tazznetworks.com
Whole thread Raw
In response to query plan question  ("David Parker" <dparker@tazznetworks.com>)
List pgsql-performance
What I think is happening with the missing pg_statistic entries:

The install of our application involves a lot of data importing (via
JDBC) in one large transaction, which can take up to 30 minutes. (I
realize I left out this key piece of info in my original post...)

The pg_autovacuum logic is relying on data from pg_stat_all_tables to
make the decision about running analyze. As far as I can tell, the data
in this view gets updated outside of the transaction, because I saw the
numbers growing while I was importing. I saw pg_autovacuum log messages
for running analyze on several tables, but no statistics data showed up
for these, I assume because the actual data in the table wasn't yet
visible to pg_autovacuum because the import transaction had not finished
yet.

When the import finished, not all of the tables affected by the import
were re-visited because they had not bumped up over the threshold again,
even though the analyze run for those tables had not generated any stats
because of the still-open transaction.

Am I making the correct assumptions about the way the various pieces
work? Does this scenario make sense?

It's easy enough for us to kick off a vacuum/analyze at the end of a
long import - but this "mysterious" behavior was bugging me!

Thanks.

- DAP

>-----Original Message-----
>From: Matthew T. O'Connor [mailto:matthew@zeut.net]
>Sent: Wednesday, November 17, 2004 2:02 PM
>To: David Parker
>Cc: Tom Lane; Jeff; Russell Smith; pgsql-performance@postgresql.org
>Subject: Re: [PERFORM] query plan question
>
>Well based on the autovacuum log that you attached, all of
>those tables
>are insert only (at least during the time period included in
>the log.
>Is that correct?  If so, autovacuum will never do a vacuum
>(unless required by xid wraparound issues) on those tables.
>So this doesn't appear to be an autovacuum problem.  I'm not
>sure about the missing pg_statistic entries anyone else care
>to field that one?
>
>Matthew
>
>
>David Parker wrote:
>
>>Thanks. The tables I'm concerned with are named: 'schema', 'usage',
>>'usageparameter', and 'flow'. It looks like autovacuum is performing
>>analyzes:
>>
>>% grep "Performing: " logs/.db.tazz.vacuum.log
>>[2004-11-17 12:05:58 PM] Performing: ANALYZE
>>"public"."scriptlibrary_library"
>>[2004-11-17 12:15:59 PM] Performing: ANALYZE
>>"public"."scriptlibraryparm"
>>[2004-11-17 12:15:59 PM] Performing: ANALYZE "public"."usageparameter"
>>[2004-11-17 12:21:00 PM] Performing: ANALYZE "public"."usageproperty"
>>[2004-11-17 12:21:00 PM] Performing: ANALYZE "public"."route"
>>[2004-11-17 12:21:00 PM] Performing: ANALYZE "public"."usageparameter"
>>[2004-11-17 12:21:00 PM] Performing: ANALYZE
>>"public"."scriptlibrary_library"
>>[2004-11-17 12:26:01 PM] Performing: ANALYZE "public"."usage"
>>[2004-11-17 12:26:01 PM] Performing: ANALYZE "public"."usageparameter"
>>[2004-11-17 12:31:04 PM] Performing: ANALYZE "public"."usageproperty"
>>[2004-11-17 12:36:04 PM] Performing: ANALYZE "public"."route"
>>[2004-11-17 12:36:04 PM] Performing: ANALYZE "public"."service_usage"
>>[2004-11-17 12:36:04 PM] Performing: ANALYZE "public"."usageparameter"
>>
>>But when I run the following:
>>
>>select * from pg_statistic where starelid in (select oid from
>pg_class
>>where relname in
>>('schema','usageparameter','flow','usage'))
>>
>>it returns no records. Shouldn't it? It doesn't appear to be doing a
>>vacuum anywhere, which makes sense because none of these tables have
>>over the default threshold of 1000. Are there statistics
>which only get
>>generated by vacuum?
>>
>>I've attached a gzip of the pg_autovacuum log file, with -d 3.
>>
>>Thanks again.
>>
>>- DAP
>>
>>
>>
>>
>>>-----Original Message-----
>>>From: Matthew T. O'Connor [mailto:matthew@zeut.net]
>>>Sent: Wednesday, November 17, 2004 11:41 AM
>>>To: David Parker
>>>Cc: Tom Lane; Jeff; Russell Smith; pgsql-performance@postgresql.org
>>>Subject: Re: [PERFORM] query plan question
>>>
>>>David Parker wrote:
>>>
>>>
>>>
>>>>We're using postgresql 7.4.5. I've only recently put
>pg_autovacuum in
>>>>place as part of our installation, and I'm basically taking the
>>>>defaults. I doubt it's a problem with autovacuum itself, but rather
>>>>with my configuration of it. I have some reading to do, so
>>>>
>>>>
>>>any pointers
>>>
>>>
>>>>to existing autovacuum threads would be greatly appreciated!
>>>>
>>>>
>>>>
>>>Well the first thing to do is increase the verbosity of the
>>>pg_autovacuum logging output.  If you use -d2 or higher,
>pg_autovacuum
>>>will print out a lot of detail on what it thinks the thresholds are
>>>and
>>>why it is or isn't performing vacuums and analyzes.   Attach
>>>some of the
>>>log and I'll take a look at it.
>>>
>>>
>>>
>
>

pgsql-performance by date:

Previous
From: "Andrew Janian"
Date:
Subject: Re: Query Performance and IOWait
Next
From: "Joshua D. Drake"
Date:
Subject: Re: Query Performance and IOWait