Thread: query plan question
I have a query for which postgres is generating a different plan on
different machines. The database schema is the same, the dataset is the
same, the configuration is the same (e.g., pg_autovacuum running in both
cases), and both systems are Solaris 9. The main difference between the
two systems is that one is sparc and the other is intel.

The query runs in about 40 ms on the intel box, but takes about 18 seconds
on the sparc box. Now, the intel boxes we have are certainly faster, but
I'm curious why the query plan might be different.

For the intel:

QUERY PLAN
Unique (cost=11.50..11.52 rows=2 width=131)
  ->  Sort (cost=11.50..11.50 rows=2 width=131)
        Sort Key: up.prefix, s.name, s.tuid, s.foundryversion
        ->  Hash Join (cost=10.42..11.49 rows=2 width=131)
              Hash Cond: ("outer".dbid = "inner"."schema")
              ->  Seq Scan on "schema" s (cost=0.00..1.02 rows=2 width=128)
              ->  Hash (cost=10.41..10.41 rows=4 width=11)
                    ->  Nested Loop (cost=0.00..10.41 rows=4 width=11)
                          ->  Nested Loop (cost=0.00..2.14 rows=4 width=4)
                                ->  Seq Scan on flow fl (cost=0.00..0.00 rows=1 width=4)
                                      Filter: (servicetype = 646)
                                ->  Index Scan using usage_flow_i on "usage" u (cost=0.00..2.06 rows=6 width=8)
                                      Index Cond: (u.flow = "outer".dbid)
                          ->  Index Scan using usageparameter_usage_i on usageparameter up (cost=0.00..2.06 rows=1 width=15)
                                Index Cond: (up."usage" = "outer".dbid)
                                Filter: ((prefix)::text <> 'xsd'::text)

For the sparc:

QUERY PLAN
Unique (cost=10.81..10.83 rows=1 width=167)
  ->  Sort (cost=10.81..10.82 rows=1 width=167)
        Sort Key: up.prefix, s.name, s.tuid, s.foundryversion
        ->  Nested Loop (cost=9.75..10.80 rows=1 width=167)
              Join Filter: ("outer".flow = "inner".dbid)
              ->  Hash Join (cost=9.75..10.79 rows=1 width=171)
                    Hash Cond: ("outer".dbid = "inner"."schema")
                    ->  Seq Scan on "schema" s (cost=0.00..1.02 rows=2 width=128)
                    ->  Hash (cost=9.75..9.75 rows=1 width=51)
                          ->  Nested Loop (cost=0.00..9.75 rows=1 width=51)
                                Join Filter: ("inner"."usage" = "outer".dbid)
                                ->  Index Scan using usage_flow_i on "usage" u (cost=0.00..4.78 rows=1 width=8)
                                ->  Index Scan using usageparameter_schema_i on usageparameter up (cost=0.00..4.96 rows=1 width=51)
                                      Filter: ((prefix)::text <> 'xsd'::text)
              ->  Seq Scan on flow fl (cost=0.00..0.00 rows=1 width=4)
                    Filter: (servicetype = 646)

I assume the problem with the second plan starts with doing a Nested Loop
rather than a Hash Join at the 4th line of the plan, but I don't know why
it would be different for the same schema, same dataset.

What factors go into the planner's decision to choose a nested loop over a
hash join? Should I be looking at adjusting my runtime configuration on
the sparc box somehow?

Thanks.

- DAP
----------------------------------------------------------------------------------
David Parker    Tazz Networks    (401) 709-5130
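(The join-method decision is driven by the planner's cost parameters and
its row estimates. A sketch of how one might compare the relevant knobs on
the two boxes, and force the alternate plan for testing; nothing here is
specific to this schema, and the query text itself is not shown in the
thread:)

    -- Planner settings that feed join-method costing; compare on both boxes.
    SHOW random_page_cost;        -- relative cost of a nonsequential page fetch
    SHOW effective_cache_size;    -- planner's assumption about available cache
    -- To test whether the nested loop itself is the problem, disable it for
    -- this session only and re-run EXPLAIN on the query:
    SET enable_nestloop = off;
    -- EXPLAIN <the query>;       -- placeholder: the real query isn't shown
    SET enable_nestloop = on;     -- restore the default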
On Wed, 17 Nov 2004 02:54 pm, you wrote:
> I have a query for which postgres is generating a different plan on
> different machines. The database schema is the same, the dataset is the
> same, the configuration is the same (e.g., pg_autovacuum running in both
> cases), and both systems are Solaris 9. The main difference between the
> two systems is that one is sparc and the other is intel.
>
> The query runs in about 40 ms on the intel box, but takes about 18
> seconds on the sparc box. Now, the intel boxes we have are certainly
> faster, but I'm curious why the query plan might be different.
>
> [intel and sparc query plans snipped; see the original post above]

Unique (cost=11.50..11.52 rows=2 width=131)
Unique (cost=10.81..10.83 rows=1 width=167)

The cost estimates are basically the same: around 10 cost units to get the
first row. Can you supply EXPLAIN ANALYZE output to see what it's actually
doing?

Russell Smith
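(EXPLAIN ANALYZE actually executes the statement and annotates each plan
node with real timings and row counts, so the estimates can be checked
against reality. A sketch only; the real query text isn't shown in the
thread, so a placeholder stands in for it:)

    -- Note: this runs the query for real, so expect it to take the full
    -- ~18 seconds on the sparc box. Look for nodes where "rows=" (the
    -- estimate) diverges badly from "actual ... rows=".
    EXPLAIN ANALYZE
    SELECT ...;   -- placeholder for the actual query over schema/usage/usageparameter/flow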
David Parker wrote:
> I have a query for which postgres is generating a different plan on
> different machines. The database schema is the same, the dataset is the
> same, the configuration is the same (e.g., pg_autovacuum running in both
> cases), and both systems are Solaris 9. The main difference between the
> two systems is that one is sparc and the other is intel.
>
> The query runs in about 40 ms on the intel box, but takes about 18
> seconds on the sparc box. Now, the intel boxes we have are certainly
> faster, but I'm curious why the query plan might be different.
>
> [intel and sparc query plans snipped; see the original post above]

If the schema, data, and PostgreSQL builds are the same, are the intel
machines Xeons?

Sincerely,

Joshua D. Drake

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
Oh, I didn't realize that analyze gave that much more info. I've got a lot
to learn about this tuning stuff ;-)

I've attached the output. I see from the new output where the slow query
is taking its time (the nested loop at line 10), but I still have no idea
why this plan is getting chosen....

Thanks!

- DAP

>-----Original Message-----
>From: pgsql-performance-owner@postgresql.org
>[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Russell Smith
>Sent: Tuesday, November 16, 2004 11:36 PM
>To: pgsql-performance@postgresql.org
>Subject: Re: [PERFORM] query plan question
>
>[quoted original post and query plans snipped]
>
>The cost estimates are basically the same: around 10 cost units to get
>the first row. Can you supply EXPLAIN ANALYZE output to see what it's
>actually doing?
>
>Russell Smith
>If the schema, data, and PostgreSQL builds are the same, are the intel
>machines Xeons?

Yup, dual 3.06-GHz Intel Xeon processors. I'm not sure off the top of my
head what the sparcs are exactly. We're in the process of moving
completely to intel, but we still have to support our app on sparc, and we
are seeing these weird differences...

- DAP
On Nov 17, 2004, at 7:32 AM, David Parker wrote:
> Oh, I didn't realize that analyze gave that much more info. I've got a
> lot to learn about this tuning stuff ;-)
>
> I've attached the output. I see from the new output where the slow query
> is taking its time (the nested loop at line 10), but I still have no
> idea why this plan is getting chosen....

It looks like your stats are incorrect on the sparc. Did you forget to run
VACUUM ANALYZE on it?

Also, do both db's have the same data loaded? There are some very
different numbers in terms of actual rows floating around there...

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
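(For reference, a minimal sketch of rebuilding the optimizer statistics by
hand; the table names are taken from the plans above, and this needs to be
run as the table owner or a superuser:)

    -- VACUUM ANALYZE reclaims dead space and recomputes the statistics the
    -- planner uses for its row estimates. "schema" and "usage" need quoting
    -- because they are keywords, as the plans above show.
    VACUUM ANALYZE "schema";
    VACUUM ANALYZE "usage";
    VACUUM ANALYZE usageparameter;
    VACUUM ANALYZE flow;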
I've got pg_autovacuum running on both platforms. I've verified that the
tables involved in the query have the same number of rows on both
databases.

I'm not sure where to look to see how the stats might be different. The
"good" database's pg_statistic table has 24 more rows than that in the
"bad" database, so there's definitely a difference. The good database's
pg_statistic has rows for 2 extra tables, but they are not tables involved
in the query in question...

So something must be up with stats, but can you tell me what the most
significant columns in the pg_statistic table are for the planner making
its decision? I'm sure this has been discussed before, so if there's a
thread you can point me to, that would be great - I realize it's a big
general question.

Thanks for your time.

- DAP

>-----Original Message-----
>From: Jeff [mailto:threshar@torgo.978.org]
>Sent: Wednesday, November 17, 2004 9:01 AM
>To: David Parker
>Cc: Russell Smith; pgsql-performance@postgresql.org
>Subject: Re: [PERFORM] query plan question
>
>It looks like your stats are incorrect on the sparc. Did you forget to
>run VACUUM ANALYZE on it?
>
>Also, do both db's have the same data loaded? There are some very
>different numbers in terms of actual rows floating around there...
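(The raw pg_statistic rows are hard to read directly; the pg_stats system
view presents the same data per column in a human-readable form. A sketch
of a comparison query, with the table names taken from the plans above:)

    -- Shows, per column, the fraction of NULLs, the estimated number of
    -- distinct values, and the most common values: the main inputs to the
    -- planner's selectivity estimates. Run on both boxes and diff the output.
    SELECT tablename, attname, null_frac, n_distinct, most_common_vals
    FROM pg_stats
    WHERE tablename IN ('schema', 'usage', 'usageparameter', 'flow')
    ORDER BY tablename, attname;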
Hmm, I'm really a beginner at this... It turns out that the pg_statistic
table in my good database has records in it for the tables in the query,
while the pg_statistic table in my bad database has no records for those
tables at all! So I guess I need to figure out why pg_autovacuum isn't
analyzing those tables.

- DAP

>-----Original Message-----
>From: David Parker
>Sent: Wednesday, November 17, 2004 9:44 AM
>To: 'Jeff'
>Cc: Russell Smith; pgsql-performance@postgresql.org
>Subject: RE: [PERFORM] query plan question
>
>[earlier discussion of the pg_statistic differences snipped]
"David Parker" <dparker@tazznetworks.com> writes: > So I guess I need to figure out why pg_autovacuum isn't analyzing those > tables. Which autovacuum version are you using? The early releases had some nasty bugs that would allow it to skip tables sometimes. I think all the known problems are fixed as of recent 7.4.x updates. regards, tom lane
We're using postgresql 7.4.5. I've only recently put pg_autovacuum in
place as part of our installation, and I'm basically taking the defaults.
I doubt it's a problem with autovacuum itself, but rather with my
configuration of it. I have some reading to do, so any pointers to
existing autovacuum threads would be greatly appreciated!

Thanks.

- DAP

>-----Original Message-----
>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Sent: Wednesday, November 17, 2004 10:46 AM
>To: David Parker
>Cc: Jeff; Russell Smith; pgsql-performance@postgresql.org
>Subject: Re: [PERFORM] query plan question
>
>Which autovacuum version are you using? The early releases had some
>nasty bugs that would allow it to skip tables sometimes. I think all
>the known problems are fixed as of recent 7.4.x updates.
>
> regards, tom lane
David Parker wrote:
> We're using postgresql 7.4.5. I've only recently put pg_autovacuum in
> place as part of our installation, and I'm basically taking the
> defaults. I doubt it's a problem with autovacuum itself, but rather with
> my configuration of it. I have some reading to do, so any pointers to
> existing autovacuum threads would be greatly appreciated!

Well, the first thing to do is increase the verbosity of the pg_autovacuum
logging output. If you use -d2 or higher, pg_autovacuum will print out a
lot of detail on what it thinks the thresholds are and why it is or isn't
performing vacuums and analyzes. Attach some of the log and I'll take a
look at it.
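(A sketch of how that invocation might look when starting the daemon; the
-d and -L flags are from the 7.4 contrib pg_autovacuum, and the paths here
are assumptions for the example:)

    # -d2 raises the debug level so thresholds and decisions are logged;
    # -L sends output to a log file; -D detaches and runs as a daemon.
    pg_autovacuum -D -d2 -L /var/log/pg_autovacuum.log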
Well, based on the autovacuum log that you attached, all of those tables
are insert-only (at least during the time period included in the log). Is
that correct? If so, autovacuum will never do a vacuum on those tables
(unless required by xid wraparound issues). So this doesn't appear to be
an autovacuum problem. I'm not sure about the missing pg_statistic
entries; anyone else care to field that one?

Matthew

David Parker wrote:
> Thanks. The tables I'm concerned with are named: 'schema', 'usage',
> 'usageparameter', and 'flow'. It looks like autovacuum is performing
> analyzes:
>
> % grep "Performing: " logs/.db.tazz.vacuum.log
> [2004-11-17 12:05:58 PM] Performing: ANALYZE "public"."scriptlibrary_library"
> [2004-11-17 12:15:59 PM] Performing: ANALYZE "public"."scriptlibraryparm"
> [2004-11-17 12:15:59 PM] Performing: ANALYZE "public"."usageparameter"
> [2004-11-17 12:21:00 PM] Performing: ANALYZE "public"."usageproperty"
> [2004-11-17 12:21:00 PM] Performing: ANALYZE "public"."route"
> [2004-11-17 12:21:00 PM] Performing: ANALYZE "public"."usageparameter"
> [2004-11-17 12:21:00 PM] Performing: ANALYZE "public"."scriptlibrary_library"
> [2004-11-17 12:26:01 PM] Performing: ANALYZE "public"."usage"
> [2004-11-17 12:26:01 PM] Performing: ANALYZE "public"."usageparameter"
> [2004-11-17 12:31:04 PM] Performing: ANALYZE "public"."usageproperty"
> [2004-11-17 12:36:04 PM] Performing: ANALYZE "public"."route"
> [2004-11-17 12:36:04 PM] Performing: ANALYZE "public"."service_usage"
> [2004-11-17 12:36:04 PM] Performing: ANALYZE "public"."usageparameter"
>
> But when I run the following:
>
> select * from pg_statistic where starelid in
>   (select oid from pg_class where relname in
>    ('schema','usageparameter','flow','usage'))
>
> it returns no records. Shouldn't it? It doesn't appear to be doing a
> vacuum anywhere, which makes sense because none of these tables have
> over the default threshold of 1000. Are there statistics which only get
> generated by vacuum?
>
> I've attached a gzip of the pg_autovacuum log file, with -d 3.
>
> Thanks again.
>
> - DAP
>
> [earlier quoted messages snipped]
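(pg_autovacuum makes its decisions from the row-activity counters kept by
the statistics collector. A sketch of how to see roughly what it sees, for
the tables named above:)

    -- n_tup_ins/n_tup_upd/n_tup_del are the counters pg_autovacuum compares
    -- against its thresholds. Insert-only tables can trip the ANALYZE
    -- threshold but never the VACUUM one, since vacuuming is only needed
    -- once rows have been updated or deleted.
    SELECT relname, n_tup_ins, n_tup_upd, n_tup_del
    FROM pg_stat_all_tables
    WHERE relname IN ('schema', 'usage', 'usageparameter', 'flow');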
What I think is happening with the missing pg_statistic entries:

The install of our application involves a lot of data importing (via JDBC)
in one large transaction, which can take up to 30 minutes. (I realize I
left out this key piece of info in my original post...)

The pg_autovacuum logic relies on data from pg_stat_all_tables to make the
decision about running analyze. As far as I can tell, the data in this
view gets updated outside of the transaction, because I saw the numbers
growing while I was importing.

I saw pg_autovacuum log messages for running analyze on several tables,
but no statistics data showed up for these. I assume that's because the
actual data in the tables wasn't yet visible to pg_autovacuum, since the
import transaction had not finished yet.

When the import finished, not all of the tables affected by the import
were re-visited, because they had not bumped up over the threshold again,
even though the analyze run for those tables had not generated any stats
because of the still-open transaction.

Am I making the correct assumptions about the way the various pieces work?
Does this scenario make sense?

It's easy enough for us to kick off a vacuum/analyze at the end of a long
import - but this "mysterious" behavior was bugging me!

Thanks.

- DAP

>-----Original Message-----
>From: Matthew T. O'Connor [mailto:matthew@zeut.net]
>Sent: Wednesday, November 17, 2004 2:02 PM
>To: David Parker
>Cc: Tom Lane; Jeff; Russell Smith; pgsql-performance@postgresql.org
>Subject: Re: [PERFORM] query plan question
>
>Well, based on the autovacuum log that you attached, all of those tables
>are insert-only (at least during the time period included in the log).
>Is that correct? If so, autovacuum will never do a vacuum on those
>tables (unless required by xid wraparound issues). So this doesn't
>appear to be an autovacuum problem. I'm not sure about the missing
>pg_statistic entries; anyone else care to field that one?
>
>Matthew
>
>[earlier quoted messages snipped]
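(Given that scenario, an explicit analyze issued after the import
transaction commits sidesteps the visibility problem entirely. A minimal
sketch, with the table list taken from the query above:)

    -- Run immediately after COMMIT of the bulk import, from the same JDBC
    -- connection or any other. ANALYZE alone is cheaper than VACUUM ANALYZE
    -- and is all that's needed for fresh planner statistics on tables that
    -- have only received inserts.
    ANALYZE "schema";
    ANALYZE "usage";
    ANALYZE usageparameter;
    ANALYZE flow;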