Re: Postgres query completion status? - Mailing list pgsql-performance

From Greg Williamson
Subject Re: Postgres query completion status?
Date
Msg-id 15626.50723.qm@web46111.mail.sp1.yahoo.com
Whole thread Raw
In response to Re: Postgres query completion status?  (Thom Brown <thombrown@gmail.com>)
List pgsql-performance
Sorry for top-posting -- challenged mail client.

Thom's suggestion that the estimates are off seems like a useful line of inquiry, but ANALYZE is what builds statistics. If it is not run often enough the planner will base its idea of what a good plan is on bad data. So ANALYZE <table name>; is your friend. You may need to change the statistics for the tables in question if there are odd distributions of data -- as Thom asked -- which version of PostgreSQL ?

Stay away from VACUUM FULL ! It will block other activity and will be horribly slow on large tables. It will get rid of bloat but there may be better ways of doing that depending on what version you are using and what you maintenance window looks like.

HTH,

Greg W.


From: Thom Brown <thombrown@gmail.com>
To: Richard Neill <rn214@cam.ac.uk>
Cc: Greg Williamson <gwilliamson39@yahoo.com>; pgsql-performance@postgresql.org
Sent: Fri, November 20, 2009 4:13:03 AM
Subject: Re: [PERFORM] Postgres query completion status?

2009/11/20 Richard Neill <rn214@cam.ac.uk>

Greg Williamson wrote:
Richard --

 You might post the results of "EXPLAIN ANALYZE <your SQL here>;" ... be sure to run it in a transaction if you want to be able roll it back. Perhaps try "EXPLAIN <your SQL>;" first as it is faster, but EXPLAIN ANALYZE shows what the planner is doing.


Here's something very very odd.
Explain Analyze has now run, in about 4 minutes.  (result below)

However, I'd be willing to swear that the last time I ran explain on this query about half an hour ago, the final 2 lines were sequential scans.

So, I've just terminated the real job (which uses this select for an update) after 77 minutes of fruitless cpu-hogging, and re-started it....

...This time, the same job ran through in 24 minutes.
[This is running exactly the same transaction on exactly the same data!]


Richard



It looks like your statistics are way out of sync with the real data.

> Nested Loop  (cost=885367.03..1123996.87 rows=8686 width=12) (actual time=248577.879..253168.466 rows=347308 loops=1)

This shows that it thinks there will be 8,686 rows, but actually traverses 347,308.

Have you manually run a VACUUM on these tables?  Preferrably a full one if you can.  I notice that you appear ot have multiple sorts going on.  Are all of those actually necessary for your output?  Also consider using partial or multicolumn indexes where useful.

And which version of PostgreSQL are you using?

Thom

pgsql-performance by date:

Previous
From: Thom Brown
Date:
Subject: Re: Postgres query completion status?
Next
From: Tom Lane
Date:
Subject: Re: [GENERAL] Strange performance degradation