Thread: Postgres query completion status?
Dear All,

I've just joined this list, so let me first thank you in advance for your hospitality.

I'm having lots of trouble with variously slow-running queries on a production system. I've tried all the "obvious" fixes: changing the query planner settings, checking for indexing, autovacuum, making sure the thing has heaps of memory (20GB), running on solid-state disks, etc.

1. Is there any way to debug or trace a running query? I think I have all the logging options turned on, but I'd like to see something like "Currently reading 3452 rows from table_x, at 0.2 us per row" or whatever, being really, really verbose in the logfiles. Likewise, is there any way to check whether, for example, Postgres is running out of work memory?

2. Is there any way, whatsoever, to get any kind of "progress bar" for a running query? I know these things have terrible faults, but anything monotonic would be better than running completely blind. [There's a wonderful paper here: http://pages.cs.wisc.edu/~naughton/includes/papers/multiquery.pdf which seems to have got 90% of the way there, but appears to have been abandoned because it didn't get all 100% of the way.]

The operations people in the warehouse are currently going crazy because we can't ever answer the question "when will this query complete?". I know it's hard to do accurately, but knowing the difference between "5 seconds to go" and "5 hours to go" would be so fantastically useful.

Thanks,

Richard

P.S. Sometimes, some queries seem to benefit from being cancelled and then immediately re-started. As they are being run in a transaction, I can't see how this could make a difference. Am I missing anything obvious? Occasionally, a re-start of postgresql-8.4 itself seems to help.
Richard --

You might post the results of "EXPLAIN ANALYZE <your SQL here>;" ... be sure to run it in a transaction if you want to be able to roll it back. Perhaps try "EXPLAIN <your SQL>;" first, as it is faster, but EXPLAIN ANALYZE shows what the planner is actually doing.

You wrote:
> P.S. Sometimes, some queries seem to benefit from being cancelled and then immediately
> re-started. As they are being run in a transaction, I can't see how this could make a difference.
> Am I missing anything obvious? Occasionally, a re-start of postgresql-8.4 itself seems to help.

This may be the result of caching of the desired rows, either by PostgreSQL or by your operating system. The rollback wouldn't affect this -- the rows are already in memory and not on disk waiting to be grabbed -- much faster on all subsequent queries.

HTH,

Greg Williamson

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Thanks for your help. This issue splits into 2 bits:

1. Fixing specific queries.
2. Finding out when a specific running query is going to complete. (At the moment, this is the bit I really need to know.)

Greg Williamson wrote:
> Richard --
>
> You might post the results of "EXPLAIN ANALYZE <your SQL here>;" ... be sure to run it in a transaction if you want to be able to roll it back. Perhaps try "EXPLAIN <your SQL>;" first as it is faster, but EXPLAIN ANALYZE shows what the planner is doing.

The offending query (simplified to just do a select, which is the slow bit) is:

-------------
SELECT ( core.demand.qty - viwcs.wave_end_demand.qty_remaining )
FROM core.demand,
     viwcs.previous_wave
     LEFT OUTER JOIN viwcs.wave_end_demand USING ( wid )
WHERE core.demand.id = viwcs.wave_end_demand.demand_id;
------------

Over the last few weeks, this has gradually slowed down from 6 minutes to about 6.5, then last night it took 25, and today it's taken an hour already and still not completed. The system hasn't been doing anything special in the last 2 days.

Here's EXPLAIN (EXPLAIN ANALYZE will take too long!):
--------------------------------------------------
 Nested Loop  (cost=885367.03..1123996.87 rows=8686 width=12)
   ->  Merge Join  (cost=885367.03..1115452.17 rows=8688 width=16)
         Merge Cond: ((core.demand.target_id = wave_genreorders_map.target_id) AND (core.demand.material_id = core.material.id))
         ->  Index Scan using demand_target_id_key on demand  (cost=0.00..186520.46 rows=3800715 width=24)
         ->  Sort  (cost=885364.61..893425.30 rows=3224275 width=24)
               Sort Key: wave_genreorders_map.target_id, core.material.id
               ->  Hash Join  (cost=511934.12..536811.73 rows=3224275 width=24)
                     Hash Cond: (core.material.tag = (product_info_sku.sid)::text)
                     ->  Append  (cost=0.00..10723.27 rows=689377 width=28)
                           ->  Seq Scan on material  (cost=0.00..5474.75 rows=397675 width=21)
                           ->  Seq Scan on container material  (cost=0.00..5248.52 rows=291702 width=37)
                     ->  Hash  (cost=506657.25..506657.25 rows=422149 width=42)
                           ->  Hash Join  (cost=474610.85..506657.25 rows=422149 width=42)
                                 Hash Cond: ((wave_gol.sid)::text = (product_info_sku.sid)::text)
                                 ->  Merge Left Join  (cost=463919.35..487522.78 rows=422149 width=29)
                                       Merge Cond: (((wave_gol.wid)::text = (du_report_sku.wid)::text) AND ((wave_gol.storeorderid)::text = (du_report_sku.storeorderid)::text) AND ((wave_gol.genreorderid)::text = (du_report_sku.genreorderid)::text))
                                       Join Filter: ((wave_gol.sid)::text = (du_report_sku.sid)::text)
                                       ->  Merge Join  (cost=183717.70..197229.24 rows=422149 width=44)
                                             Merge Cond: (((wave_genreorders_map.wid)::text = (wave_gol.wid)::text) AND ((wave_genreorders_map.storeorderid)::text = (wave_gol.storeorderid)::text) AND ((wave_genreorders_map.genreorderid)::text = (wave_gol.genreorderid)::text))
                                             ->  Index Scan using "wave_genreorders_map_ERR_GENREORDERID_EXISTS" on wave_genreorders_map  (cost=0.00..4015.36 rows=116099 width=27)
                                             ->  Sort  (cost=183717.70..184818.90 rows=440483 width=47)
                                                   Sort Key: wave_gol.wid, wave_gol.storeorderid, wave_gol.genreorderid
                                                   ->  Nested Loop  (cost=9769.36..142425.22 rows=440483 width=47)
                                                         ->  Index Scan using "wave_rxw_ERR_WID_EXISTS" on wave_rxw  (cost=0.00..7.08 rows=1 width=11)
                                                               Filter: is_previous
                                                         ->  Bitmap Heap Scan on wave_gol  (cost=9769.36..136912.11 rows=440483 width=36)
                                                               Recheck Cond: ((wave_gol.wid)::text = (wave_rxw.wid)::text)
                                                               ->  Bitmap Index Scan on "wave_gol_ERR_SID_EXISTS"  (cost=0.00..9659.24 rows=440483 width=0)
                                                                     Index Cond: ((wave_gol.wid)::text = (wave_rxw.wid)::text)
                                       ->  Sort  (cost=280201.66..281923.16 rows=688602 width=300)
                                             Sort Key: du_report_sku.wid, du_report_sku.storeorderid, du_report_sku.genreorderid
                                             ->  HashAggregate  (cost=197936.75..206544.27 rows=688602 width=36)
                                                   ->  Seq Scan on du_report_sku  (cost=0.00..111861.61 rows=6886011 width=36)
                                 ->  Hash  (cost=5681.22..5681.22 rows=400822 width=13)
                                       ->  Seq Scan on product_info_sku  (cost=0.00..5681.22 rows=400822 width=13)
   ->  Index Scan using demand_pkey on demand  (cost=0.00..0.97 rows=1 width=12)
         Index Cond: (core.demand.id = core.demand.id)
(37 rows)
--------------------------------------------------

> You wrote:
>
>> P.S. Sometimes, some queries seem to benefit from being cancelled and then immediately
>> re-started. As they are being run in a transaction, I can't see how this could make a difference.
>> Am I missing anything obvious? Occasionally, a re-start of postgresql-8.4 itself seems to help.
>
> This may be the result of caching of the desired rows, either by PostgreSQL or by your operating system. The rollback wouldn't affect this -- the rows are already in memory and not on disk waiting to be grabbed -- much faster on all subsequent queries.

Yes... but the data should already be in RAM. We've got 20 GB of it (Postgres is given 5GB, and effective_cache_size is 10GB); the dataset size for the relevant part should only be about 100 MB at most.
Also, we're using solid-state disks (Intel X25-E), and iotop shows that the disk access rate isn't the problem; the CPU is pegged at 100% though. It seems that the query planner is doing something radically different.

Richard
> Greg Williamson wrote:
>> Richard --
>>
>> You might post the results of "EXPLAIN ANALYZE <your SQL here>;" ...
>> be sure to run it in a transaction if you want to be able to roll it
>> back. Perhaps try "EXPLAIN <your SQL>;" first as it is faster, but
>> EXPLAIN ANALYZE shows what the planner is doing.

Is there any way I can gather some information by tracing the query that's currently actually running? strace doesn't help much, but is there anything else I can do? As far as I know, the only tools that exist are pg_stat_activity, top, and iotop. Have I missed one?

Thanks,

Richard
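For reference, pg_stat_activity can at least show how long each query has been running, even though it says nothing about progress within the query. A minimal sketch (column names as in 8.4, where the query text lives in current_query; later versions renamed these):

```sql
-- Show non-idle backends, longest-running first (PostgreSQL 8.4 column names).
SELECT procpid,
       now() - query_start AS elapsed,
       current_query
  FROM pg_stat_activity
 WHERE current_query <> '<IDLE>'
 ORDER BY elapsed DESC;
```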
> Greg Williamson wrote:
>> Richard --
>>
>> You might post the results of "EXPLAIN ANALYZE <your SQL here>;" ...
>> be sure to run it in a transaction if you want to be able to roll it
>> back. Perhaps try "EXPLAIN <your SQL>;" first as it is faster, but
>> EXPLAIN ANALYZE shows what the planner is doing.

Here's something very, very odd. EXPLAIN ANALYZE has now run, in about 4 minutes (result below). However, I'd be willing to swear that the last time I ran EXPLAIN on this query, about half an hour ago, the final 2 lines were sequential scans.

So, I've just terminated the real job (which uses this select for an update) after 77 minutes of fruitless CPU-hogging, and re-started it... This time, the same job ran through in 24 minutes. [This is running exactly the same transaction on exactly the same data!]

Richard

--------------------------------------------------
 Nested Loop  (cost=885367.03..1123996.87 rows=8686 width=12) (actual time=248577.879..253168.466 rows=347308 loops=1)
   ->  Merge Join  (cost=885367.03..1115452.17 rows=8688 width=16) (actual time=248577.834..252092.536 rows=347308 loops=1)
         Merge Cond: ((core.demand.target_id = wave_genreorders_map.target_id) AND (core.demand.material_id = core.material.id))
         ->  Index Scan using demand_target_id_key on demand  (cost=0.00..186520.46 rows=3800715 width=24) (actual time=0.031..2692.661 rows=3800715 loops=1)
         ->  Sort  (cost=885364.61..893425.30 rows=3224275 width=24) (actual time=248577.789..248659.751 rows=347308 loops=1)
               Sort Key: wave_genreorders_map.target_id, core.material.id
               Sort Method:  quicksort  Memory: 39422kB
               ->  Hash Join  (cost=511934.12..536811.73 rows=3224275 width=24) (actual time=247444.988..248263.151 rows=347308 loops=1)
                     Hash Cond: (core.material.tag = (product_info_sku.sid)::text)
                     ->  Append  (cost=0.00..10723.27 rows=689377 width=28) (actual time=0.008..177.076 rows=690647 loops=1)
                           ->  Seq Scan on material  (cost=0.00..5474.75 rows=397675 width=21) (actual time=0.008..59.234 rows=395551 loops=1)
                           ->  Seq Scan on container material  (cost=0.00..5248.52 rows=291702 width=37) (actual time=0.008..52.844 rows=295096 loops=1)
                     ->  Hash  (cost=506657.25..506657.25 rows=422149 width=42) (actual time=247444.555..247444.555 rows=347308 loops=1)
                           ->  Hash Join  (cost=474610.85..506657.25 rows=422149 width=42) (actual time=182224.904..247282.711 rows=347308 loops=1)
                                 Hash Cond: ((wave_gol.sid)::text = (product_info_sku.sid)::text)
                                 ->  Merge Left Join  (cost=463919.35..487522.78 rows=422149 width=29) (actual time=182025.065..246638.762 rows=347308 loops=1)
                                       Merge Cond: (((wave_gol.wid)::text = (du_report_sku.wid)::text) AND ((wave_gol.storeorderid)::text = (du_report_sku.storeorderid)::text) AND ((wave_gol.genreorderid)::text = (du_report_sku.genreorderid)::text))
                                       Join Filter: ((wave_gol.sid)::text = (du_report_sku.sid)::text)
                                       ->  Merge Join  (cost=183717.70..197229.24 rows=422149 width=44) (actual time=859.551..1506.515 rows=347308 loops=1)
                                             Merge Cond: (((wave_genreorders_map.wid)::text = (wave_gol.wid)::text) AND ((wave_genreorders_map.storeorderid)::text = (wave_gol.storeorderid)::text) AND ((wave_genreorders_map.genreorderid)::text = (wave_gol.genreorderid)::text))
                                             ->  Index Scan using "wave_genreorders_map_ERR_GENREORDERID_EXISTS" on wave_genreorders_map  (cost=0.00..4015.36 rows=116099 width=27) (actual time=0.018..23.599 rows=116099 loops=1)
                                             ->  Sort  (cost=183717.70..184818.90 rows=440483 width=47) (actual time=782.102..813.753 rows=347308 loops=1)
                                                   Sort Key: wave_gol.wid, wave_gol.storeorderid, wave_gol.genreorderid
                                                   Sort Method:  quicksort  Memory: 39422kB
                                                   ->  Nested Loop  (cost=9769.36..142425.22 rows=440483 width=47) (actual time=33.668..138.668 rows=347308 loops=1)
                                                         ->  Index Scan using "wave_rxw_ERR_WID_EXISTS" on wave_rxw  (cost=0.00..7.08 rows=1 width=11) (actual time=0.021..0.031 rows=1 loops=1)
                                                               Filter: is_previous
                                                         ->  Bitmap Heap Scan on wave_gol  (cost=9769.36..136912.11 rows=440483 width=36) (actual time=33.628..75.091 rows=347308 loops=1)
                                                               Recheck Cond: ((wave_gol.wid)::text = (wave_rxw.wid)::text)
                                                               ->  Bitmap Index Scan on "wave_gol_ERR_SID_EXISTS"  (cost=0.00..9659.24 rows=440483 width=0) (actual time=33.104..33.104 rows=347308 loops=1)
                                                                     Index Cond: ((wave_gol.wid)::text = (wave_rxw.wid)::text)
                                       ->  Sort  (cost=280201.66..281923.16 rows=688602 width=300) (actual time=177511.806..183486.593 rows=41317448 loops=1)
                                             Sort Key: du_report_sku.wid, du_report_sku.storeorderid, du_report_sku.genreorderid
                                             Sort Method:  external sort  Disk: 380768kB
                                             ->  HashAggregate  (cost=197936.75..206544.27 rows=688602 width=36) (actual time=7396.426..11224.839 rows=6282564 loops=1)
                                                   ->  Seq Scan on du_report_sku  (cost=0.00..111861.61 rows=6886011 width=36) (actual time=0.006..573.419 rows=6897682 loops=1)
                                 ->  Hash  (cost=5681.22..5681.22 rows=400822 width=13) (actual time=199.422..199.422 rows=400737 loops=1)
                                       ->  Seq Scan on product_info_sku  (cost=0.00..5681.22 rows=400822 width=13) (actual time=0.004..78.357 rows=400737 loops=1)
   ->  Index Scan using demand_pkey on demand  (cost=0.00..0.97 rows=1 width=12) (actual time=0.002..0.003 rows=1 loops=347308)
         Index Cond: (core.demand.id = core.demand.id)
 Total runtime: 253455.603 ms
(41 rows)
Richard Neill wrote:
> As far as I know, the only tools that exist are
> pg_stat_activity, top, and iotop
> Have I missed one?

The UI for pgTop might be easier for what you're trying to do:
http://pgfoundry.org/projects/pgtop/

--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com
2009/11/20 Richard Neill <rn214@cam.ac.uk>
It looks like your statistics are way out of sync with the real data.
> Nested Loop (cost=885367.03..1123996.87 rows=8686 width=12) (actual time=248577.879..253168.466 rows=347308 loops=1)
This shows that it thinks there will be 8,686 rows, but actually traverses 347,308.
Have you manually run a VACUUM on these tables? Preferably a full one, if you can. I notice that you appear to have multiple sorts going on. Are all of those actually necessary for your output? Also consider using partial or multicolumn indexes where useful.
And which version of PostgreSQL are you using?
Thom
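As a concrete illustration of refreshing the statistics Thom is referring to (a sketch only -- the table and column names are borrowed from the plan above, and the statistics target shown is illustrative, not a recommendation from the thread):

```sql
-- Refresh planner statistics on the tables involved in the slow plan.
VACUUM ANALYZE viwcs.wave_gol;
VACUUM ANALYZE core.demand;

-- If a join column has a skewed distribution, a larger per-column
-- statistics target can improve the row estimates; re-analyze afterwards.
ALTER TABLE viwcs.wave_gol ALTER COLUMN wid SET STATISTICS 1000;
ANALYZE viwcs.wave_gol;
```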
Sorry for top-posting -- challenged mail client.
Thom's suggestion that the estimates are off seems like a useful line of inquiry, but ANALYZE is what builds statistics. If it is not run often enough, the planner will base its idea of what a good plan is on bad data. So ANALYZE <table name>; is your friend. You may need to change the statistics targets for the tables in question if there are odd distributions of data. And, as Thom asked: which version of PostgreSQL?
Stay away from VACUUM FULL! It will block other activity and will be horribly slow on large tables. It will get rid of bloat, but there may be better ways of doing that, depending on what version you are using and what your maintenance window looks like.
HTH,
Greg W.
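One of the "better ways" commonly suggested in the 8.4 era is CLUSTER, which rewrites the table in index order and returns free space to the filesystem. A hedged sketch (the table and index names are borrowed from earlier in the thread; note CLUSTER still takes an exclusive lock, so it belongs in a maintenance window):

```sql
-- Rewrite the table in primary-key order, removing bloat without
-- VACUUM FULL's slowness; takes an ACCESS EXCLUSIVE lock while it runs.
CLUSTER core.demand USING demand_pkey;

-- Rebuild statistics after the rewrite.
ANALYZE core.demand;
```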
From: Thom Brown <thombrown@gmail.com>
To: Richard Neill <rn214@cam.ac.uk>
Cc: Greg Williamson <gwilliamson39@yahoo.com>; pgsql-performance@postgresql.org
Sent: Fri, November 20, 2009 4:13:03 AM
Subject: Re: [PERFORM] Postgres query completion status?
Richard Neill wrote:
> SELECT ( core.demand.qty - viwcs.wave_end_demand.qty_remaining )
> FROM
>   core.demand,
>   viwcs.previous_wave
>   LEFT OUTER JOIN viwcs.wave_end_demand USING ( wid )
> WHERE core.demand.id = viwcs.wave_end_demand.demand_id;

For comparison, how does this do?:

SELECT (core.demand.qty - viwcs.wave_end_demand.qty_remaining)
  FROM core.demand
  JOIN viwcs.previous_wave
    ON (core.demand.id = viwcs.wave_end_demand.demand_id)
  LEFT OUTER JOIN viwcs.wave_end_demand USING (wid);

-Kevin
Kevin Grittner wrote:
> Richard Neill wrote:
>
>> SELECT ( core.demand.qty - viwcs.wave_end_demand.qty_remaining )
>> FROM
>>   core.demand,
>>   viwcs.previous_wave
>>   LEFT OUTER JOIN viwcs.wave_end_demand USING ( wid )
>> WHERE core.demand.id = viwcs.wave_end_demand.demand_id;
>
> For comparison, how does this do?:
>
> SELECT (core.demand.qty - viwcs.wave_end_demand.qty_remaining)
>   FROM core.demand
>   JOIN viwcs.previous_wave
>     ON (core.demand.id = viwcs.wave_end_demand.demand_id)
>   LEFT OUTER JOIN viwcs.wave_end_demand USING (wid);

Thanks for your help. Unfortunately, it just complains:

  ERROR: missing FROM-clause entry for table "wave_end_demand"
  LINE 4: ON (core.demand.id = viwcs.wave_end_demand.demand_id)

Incidentally, I don't think that this particular re-ordering will make much difference: viwcs.previous_wave is a table with a single row, and 3 columns in it.

Here are the bits of schema, if they're helpful.

           View "viwcs.wave_end_demand"
    Column     |         Type          | Modifiers
---------------+-----------------------+-----------
 wid           | character varying(10) |
 storeorderid  | character varying(30) |
 genreorderid  | character varying(30) |
 target_id     | bigint                |
 sid           | character varying(30) |
 material_id   | bigint                |
 demand_id     | bigint                |
 eqa           | integer               |
 aqu           | bigint                |
 qty_remaining | bigint                |
View definition:
 SELECT wave_gol.wid, wave_gol.storeorderid, wave_gol.genreorderid,
        wave_genreorders_map.target_id, wave_gol.sid,
        product_info_sku_map.material_id, demand.id AS demand_id,
        wave_gol.eqa,
        COALESCE(du_report_sku_sum.aqu, 0::bigint) AS aqu,
        wave_gol.eqa - COALESCE(du_report_sku_sum.aqu, 0::bigint) AS qty_remaining
   FROM viwcs.wave_gol
   LEFT JOIN viwcs.wave_genreorders_map USING (wid, storeorderid, genreorderid)
   LEFT JOIN viwcs.product_info_sku_map USING (sid)
   LEFT JOIN core.demand USING (target_id, material_id)
   LEFT JOIN ( SELECT du_report_sku.wid, du_report_sku.storeorderid,
                      du_report_sku.genreorderid, du_report_sku.sid,
                      sum(du_report_sku.aqu) AS aqu
                 FROM viwcs.du_report_sku
                GROUP BY du_report_sku.wid, du_report_sku.storeorderid,
                      du_report_sku.genreorderid, du_report_sku.sid
             ) du_report_sku_sum USING (wid, storeorderid, genreorderid, sid);

 View "viwcs.previous_wave"
 Column |         Type          | Modifiers
--------+-----------------------+-----------
 wid    | character varying(10) |
View definition:
 SELECT wave_rxw.wid
   FROM viwcs.wave_rxw
  WHERE wave_rxw.is_previous;

              Table "core.demand"
   Column    |  Type   |           Modifiers
-------------+---------+--------------------------------
 id          | bigint  | not null default core.new_id()
 target_id   | bigint  | not null
 material_id | bigint  | not null
 qty         | integer | not null
 benefit     | integer | not null default 0
Indexes:
    "demand_pkey" PRIMARY KEY, btree (id)
    "demand_target_id_key" UNIQUE, btree (target_id, material_id)
    "demand_material_id" btree (material_id)
    "demand_target_id" btree (target_id)
Foreign-key constraints:
    "demand_material_id_fkey" FOREIGN KEY (material_id) REFERENCES core.__material_id(id)
    "demand_target_id_fkey" FOREIGN KEY (target_id) REFERENCES core.waypoint(id)
Referenced by:
    TABLE "core.inventory" CONSTRAINT "inventory_demand_id_fkey" FOREIGN KEY (demand_id) REFERENCES core.demand(id)

Thanks,

Richard
Thom Brown wrote:
> It looks like your statistics are way out of sync with the real data.
>
> > Nested Loop (cost=885367.03..1123996.87 rows=8686 width=12) (actual
> > time=248577.879..253168.466 rows=347308 loops=1)
>
> This shows that it thinks there will be 8,686 rows, but actually
> traverses 347,308.

Yes, I see what you mean.

> Have you manually run a VACUUM on these tables? Preferably a full one
> if you can.

Every night, it runs VACUUM VERBOSE ANALYZE on the entire database. We also have the autovacuum daemon enabled (in the default config). About 2 weeks ago, I ran CLUSTER followed by VACUUM FULL, which seemed to help more than I'd expect.

[As I understand it, the statistics shouldn't change very much from day to day, as long as the database workload remains roughly constant. What we're actually doing is running a warehouse sorting books - so from one day to the next the particular book changes, but the overall statistics basically don't.]

> I notice that you appear to have multiple sorts going on.
> Are all of those actually necessary for your output?

I think so. I didn't actually write all of this, so I can't be certain.

> Also consider using partial or multicolumn indexes where useful.

Already done that. The query was originally pretty quick, with a few weeks' worth of data, but not now (after a few months). The times don't rise gradually, but have a very sudden knee.

> And which version of PostgreSQL are you using?

8.4.1, including this patch:
http://archives.postgresql.org/pgsql-bugs/2009-10/msg00118.php

Richard
Okay, have you tried monitoring the connections to your database?
Try: select * from pg_stat_activity;
And this to see current backend connections:
SELECT pg_stat_get_backend_pid(s.backendid) AS procpid,
pg_stat_get_backend_activity(s.backendid) AS current_query
FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;
It might also help if you posted your postgresql.conf too.
Thom
Thom Brown wrote:
> Okay, have you tried monitoring the connections to your database?
>
> Try: select * from pg_stat_activity;

Tried that - it's very useful as far as it goes. I can see that in most cases, the DB is running just the one query. What I really want to know is: how far through that query has it got? (For example, if the query is an update, then surely it knows how many rows have been updated, and how many are yet to go.)

> And this to see current backend connections:
>
> SELECT pg_stat_get_backend_pid(s.backendid) AS procpid,
>        pg_stat_get_backend_activity(s.backendid) AS current_query
> FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;

This looks identical to just some of the columns from pg_stat_activity.

> It might also help if you posted your postgresql.conf too.

Below (have removed the really non-interesting bits).

Thanks,

Richard

#------------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#------------------------------------------------------------------------------

max_connections = 500			# (change requires restart)

#------------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#------------------------------------------------------------------------------

# - Memory -

shared_buffers = 4500MB			# min 128kB
					# (change requires restart)
temp_buffers = 64MB			# min 800kB
#max_prepared_transactions = 0		# zero disables the feature
					# (change requires restart)
# Note: Increasing max_prepared_transactions costs ~600 bytes of shared
# memory per transaction slot, plus lock space (see max_locks_per_transaction).
# It is not advisable to set max_prepared_transactions nonzero unless you
# actively intend to use prepared transactions.
work_mem = 256MB			# min 64kB
maintenance_work_mem = 256MB		# min 1MB
max_stack_depth = 4MB			# min 100kB

# - Kernel Resource Usage -

#max_files_per_process = 1000		# min 25
					# (change requires restart)
#shared_preload_libraries = ''		# (change requires restart)

# - Cost-Based Vacuum Delay -

#vacuum_cost_delay = 0ms		# 0-100 milliseconds
#vacuum_cost_page_hit = 1		# 0-10000 credits
#vacuum_cost_page_miss = 10		# 0-10000 credits
#vacuum_cost_page_dirty = 20		# 0-10000 credits
#vacuum_cost_limit = 200		# 1-10000 credits

# - Background Writer -

#bgwriter_delay = 200ms			# 10-10000ms between rounds
#bgwriter_lru_maxpages = 100		# 0-1000 max buffers written/round
#bgwriter_lru_multiplier = 2.0		# 0-10.0 multipler on buffers scanned/round

# - Asynchronous Behavior -

#effective_io_concurrency = 1		# 1-1000. 0 disables prefetching

#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------

# - Settings -

#fsync = on				# turns forced synchronization on or off
#synchronous_commit = on		# immediate fsync at commit
#wal_sync_method = fsync		# the default is the first option
					# supported by the operating system:
					#   open_datasync
					#   fdatasync
					#   fsync
					#   fsync_writethrough
					#   open_sync
#full_page_writes = on			# recover from partial page writes
wal_buffers = 2MB			# min 32kB
					# (change requires restart)
#wal_writer_delay = 200ms		# 1-10000 milliseconds

commit_delay = 50000			# range 0-100000, in microseconds
commit_siblings = 5			# range 1-1000

# - Checkpoints -

checkpoint_segments = 64		# in logfile segments, min 1, 16MB each (was safe value of 4)
#checkpoint_timeout = 5min		# range 30s-1h
#checkpoint_completion_target = 0.5	# checkpoint target duration, 0.0 - 1.0
#checkpoint_warning = 30s		# 0 disables

# - Archiving -

#archive_mode = off			# allows archiving to be done
					# (change requires restart)
#archive_command = ''			# command to use to archive a logfile segment
#archive_timeout = 0			# force a logfile segment switch after this
					# number of seconds; 0 disables

#------------------------------------------------------------------------------
# QUERY TUNING
#------------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

#seq_page_cost = 1			# measured on an arbitrary scale
#random_page_cost = 4			# same scale as above
#seq_page_cost = 0.25			# use 0.25, 0.75 for normal
#random_page_cost = 0.75		# but 1 and 4 for wave-deactivate.
seq_page_cost = 0.5			# It looks as though 0.5 and 2 (exactly)
random_page_cost = 2			# will work for both problems. (very brittle fix!)
#cpu_tuple_cost = 0.01			# same scale as above
#cpu_index_tuple_cost = 0.005		# same scale as above
#cpu_operator_cost = 0.0025		# same scale as above
effective_cache_size = 10000MB

# - Genetic Query Optimizer -

#geqo = on
geqo_threshold = 12
geqo_effort = 10			# range 1-10
#geqo_pool_size = 0			# selects default based on effort
#geqo_generations = 0			# selects default based on effort
#geqo_selection_bias = 2.0		# range 1.5-2.0

# - Other Planner Options -

default_statistics_target = 1000	# range 1-10000
#constraint_exclusion = partition	# on, off, or partition
#cursor_tuple_fraction = 0.1		# range 0.0-1.0
#from_collapse_limit = 8
#join_collapse_limit = 8		# 1 disables collapsing of explicit
					# JOIN clauses

#------------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#------------------------------------------------------------------------------

# - When to Log -

#client_min_messages = notice
#log_error_verbosity = default		# terse, default, or verbose messages
#log_min_error_statement = error
log_min_duration_statement = 80

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = on
#log_checkpoints = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_hostname = off
log_line_prefix = '%t '
#log_lock_waits = off			# log lock waits >= deadlock_timeout
#log_statement = 'none'			# none, ddl, mod, all
#log_temp_files = -1			# log temporary files equal or larger
					# than the specified size in kilobytes;
					# -1 disables, 0 logs all temp files
#log_timezone = unknown			# actually, defaults to TZ environment setting

#------------------------------------------------------------------------------
# RUNTIME STATISTICS
#------------------------------------------------------------------------------

# - Query/Index Statistics Collector -

#track_activities = on
#track_counts = on
#track_functions = none			# none, pl, all
#track_activity_query_size = 1024
#update_process_title = on
#stats_temp_directory = 'pg_stat_tmp'

# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off

#------------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#------------------------------------------------------------------------------

#autovacuum = on			# Enable autovacuum subprocess?  'on'
autovacuum = on				# requires track_counts to also be on.
log_autovacuum_min_duration = 1000	# -1 disables, 0 logs all actions and
					# their durations, > 0 logs only
					# actions running at least this number
					# of milliseconds.
#autovacuum_max_workers = 3		# max number of autovacuum subprocesses
#autovacuum_naptime = 1min		# time between autovacuum runs
#autovacuum_vacuum_threshold = 50	# min number of row updates before vacuum
#autovacuum_analyze_threshold = 50	# min number of row updates before analyze
#autovacuum_vacuum_scale_factor = 0.2	# fraction of table size before vacuum
#autovacuum_analyze_scale_factor = 0.1	# fraction of table size before analyze
#autovacuum_freeze_max_age = 200000000	# maximum XID age before forced vacuum
					# (change requires restart)
#autovacuum_vacuum_cost_delay = 20ms	# default vacuum cost delay for
					# autovacuum, in milliseconds;
					# -1 means use vacuum_cost_delay
#autovacuum_vacuum_cost_limit = -1	# default vacuum cost limit for
					# autovacuum, -1 means use
					# vacuum_cost_limit
2009/11/20 Richard Neill <rn214@cam.ac.uk>
>> It might also help if you posted your postgresql.conf too.
>
> Below (have removed the really non-interesting bits).
>
> Thanks,
>
> Richard
I can't actually see anything in your config that would cause this problem. :/
As for seeing the progress of an update, I would have thought that, due to the atomic nature of updates, only the transaction in which the update is running would have visibility of the as-yet-uncommitted changes.
Thom
Thom Brown wrote:
> 2009/11/20 Richard Neill <rn214@cam.ac.uk <mailto:rn214@cam.ac.uk>>
>
>>> It might also help if you posted your postgresql.conf too.
>>
>> Below (have removed the really non-interesting bits).
>>
>> Thanks,
>>
>> Richard
>
> I can't actually see anything in your config that would cause this
> problem. :/
>
> As for seeing the progress of an update, I would have thought due to the
> atomic nature of updates, only the transaction in which the update is
> running would have visibility of the as-yet uncommitted updates.

Yes, but surely the postmaster itself (and any administrative user)
should be able to find this out.

What I need for slow queries is some kind of progress bar. Any estimate
(no matter how poor, or non-linear) of the query progress, or of the time
remaining, would be invaluable.

Richard
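[For reference: 8.4 can't show progress within a query, but it does expose elapsed time per backend. A minimal sketch against pg_stat_activity (on 8.4 the columns are procpid and current_query; later releases rename them to pid and query):

```sql
-- Show non-idle queries and how long each has been running.
SELECT procpid,
       now() - query_start AS runtime,
       waiting,
       current_query
FROM pg_stat_activity
WHERE current_query <> '<IDLE>'
ORDER BY runtime DESC;
```

That at least distinguishes "started 5 seconds ago" from "started 5 hours ago", even though it says nothing about how far along the plan is.]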
> -----Original Message-----
> From: Richard Neill
>
> max_connections = 500 # (change requires restart)
> work_mem = 256MB # min 64kB

Not that it has to do with your current problem, but this combination
could bog down your server if enough clients run sorted queries
simultaneously. You should probably back off work_mem by at least an
order of magnitude.
Fernando Hevia wrote:
>> -----Original Message-----
>> From: Richard Neill
>>
>> max_connections = 500 # (change requires restart)
>> work_mem = 256MB # min 64kB
>
> Not that it has to do with your current problem, but this combination
> could bog down your server if enough clients run sorted queries
> simultaneously. You should probably back off work_mem by at least an
> order of magnitude.

What's the correct way to configure this?

* We have one client which needs to run really big transactions
  (therefore needs the work memory).

* We also have about 200 clients which always run very small, short
  queries.

Richard
> -----Original Message-----
> From: Richard Neill
>
> Fernando Hevia wrote:
>>> -----Original Message-----
>>> From: Richard Neill
>>>
>>> max_connections = 500 # (change requires restart)
>>> work_mem = 256MB # min 64kB
>>
>> Not that it has to do with your current problem, but this combination
>> could bog down your server if enough clients run sorted queries
>> simultaneously. You should probably back off work_mem by at least an
>> order of magnitude.
>
> What's the correct way to configure this?
>
> * We have one client which needs to run really big transactions
>   (therefore needs the work memory).
>
> * We also have about 200 clients which always run very small, short
>   queries.
>
> Richard

Set the default value in postgresql.conf much lower, probably 4MB, and
just before running any big transaction raise it for the current session
only, by issuing:

set work_mem = '256MB';

Regards,
Fernando.
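[A sketch of that approach. SET LOCAL is the transaction-scoped variant, so the raised value reverts automatically at COMMIT or ROLLBACK rather than lingering for the session:

```sql
-- postgresql.conf keeps a modest default for the ~200 small clients:
--   work_mem = 4MB
-- The one big reporting client raises it only inside its own transaction:
BEGIN;
SET LOCAL work_mem = '256MB';
-- ... run the large sort/aggregate here ...
COMMIT;  -- work_mem reverts to the 4MB default
```

This way a crashed or misbehaving client can never leave a 256MB setting active for the pool of small clients.]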
>>> max_connections = 500 # (change requires restart)
>>> work_mem = 256MB # min 64kB
>>
>> Not that it has to do with your current problem, but this combination
>> could bog down your server if enough clients run sorted queries
>> simultaneously. You should probably back off work_mem by at least an
>> order of magnitude.
>
> What's the correct way to configure this?
>
> * We have one client which needs to run really big transactions
>   (therefore needs the work memory).

You can set the work_mem for the specific user (like "set work_mem to x")
at the beginning of the session.

Here are some things I noticed (it is more like shooting in the dark,
but still...)

The expensive part is this:

->  Sort  (cost=280201.66..281923.16 rows=688602 width=300) (actual time=177511.806..183486.593 rows=41317448 loops=1)
      Sort Key: du_report_sku.wid, du_report_sku.storeorderid, du_report_sku.genreorderid
      Sort Method:  external sort  Disk: 380768kB
      ->  HashAggregate  (cost=197936.75..206544.27 rows=688602 width=36) (actual time=7396.426..11224.839 rows=6282564 loops=1)
            ->  Seq Scan on du_report_sku  (cost=0.00..111861.61 rows=6886011 width=36) (actual time=0.006..573.419 rows=6897682 loops=1)

(It is pretty confusing that the HashAggregate reports ~6M rows, but the
sort does 41M rows; maybe I cannot read this.) Anyway, I think that if
You up the work_mem for this query to 512M, the sort will be in memory,
and thus plenty faster.

Also, You say You are experiencing unstable query plans, and this may
mean that geqo is kicking in (but Your query seems too simple for that,
even considering the views involved). A quick way to check that would be
to run explain <the query> a couple of tens of times, and check whether
the plans change. If they do, try upping geqo_threshold.

You have random_page_cost 4 times larger than seq_page_cost. You say You
are on SSD, so there is no random access penalty. Try setting them
equal.

Your plan is full of merge-joins; some indices may be in order. Merge
join is a kind of "last-chance" plan.

The query is:

SELECT ( core.demand.qty - viwcs.wave_end_demand.qty_remaining )
FROM core.demand,
     viwcs.previous_wave
     LEFT OUTER JOIN viwcs.wave_end_demand USING ( wid )
WHERE core.demand.id = viwcs.wave_end_demand.demand_id;

Isn't the left join equivalent to an inner join, since in the WHERE
clause You are comparing values from the outer side of the join? If they
come out NULL, they will get discarded anyway...

I hope You find some of this useful.

Greetings
Marcin
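[If that reading is right, the query can be written with plain inner joins, which also gives the planner more freedom to reorder them. A sketch only, since the definitions of the views involved aren't shown in the thread:

```sql
SELECT core.demand.qty - viwcs.wave_end_demand.qty_remaining
FROM core.demand
JOIN viwcs.wave_end_demand
  ON core.demand.id = viwcs.wave_end_demand.demand_id
JOIN viwcs.previous_wave
  USING (wid);
```

The WHERE clause in the original discards any row where wave_end_demand.demand_id is NULL, which is exactly the rows the LEFT OUTER JOIN was preserving.]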
Justin Pitts wrote:
> Set work_mem in postgresql.conf down to what the 200 clients need,
> which sounds to me like the default setting.
>
> In the session which needs more work_mem, execute:
> SET SESSION work_mem TO '256MB'

Isn't that terribly ugly? It seems to me less hackish to rely on the
many clients not to abuse work_mem (as we know exactly what query they
will run, we can be sure it won't happen).

It's a shame that the work_mem parameter is a per-client one, rather
than a single big pool.

Richard
Richard Neill wrote:
> Am I missing something though, or is this project dormant, without
> having released any files?

My bad--gave you the wrong URL.
http://git.postgresql.org/gitweb?p=pg_top.git;a=summary has the project
I meant to point you toward.

> What I really want to know is, how far through that query has it got?
> (For example, if the query is an update, then surely it knows how many
> rows have been updated, and how many are yet to go.)

I understand what you want. The reason you're not getting any
suggestions is that this just isn't exposed in PostgreSQL yet. Clients
ask for queries to be run, and eventually they get rows of results back,
but there's no notion of how many rows they're going to get in advance,
or of how far along the executor is in the query's execution plan.
There are a couple of academic projects that have started exposing more
of the query internals, but I'm not aware of anyone who's even started
moving in the direction of what you'd need to produce a progress bar.

--
Greg Smith  2ndQuadrant  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com
Richard Neill wrote:
> Likewise, is there any way to check whether, for example, postgres is
> running out of work memory?

It doesn't work like that; it's not an allocation. What happens is that
the optimizer estimates how much memory a sort is going to need, and
then uses work_mem to decide whether that is something it can do in RAM
or something that needs to be done via a more expensive disk-based
sorting method. You can tell whether it's set high enough by toggling on
log_temp_files and watching when those files get created--they appear
whenever a sort bigger than work_mem needs to be done.

> commit_delay = 50000 # range 0-100000, in microseconds
> commit_siblings = 5 # range 1-1000

Random note: that is way too high a value for commit_delay. It's
unlikely to be helping you, and might be hurting sometimes. The whole
commit_delay feature is quite difficult to tune correctly, and is really
only useful for situations where there's really heavy writing going on
and you want to carefully tweak the write chunking size. The useful
range for commit_delay is very small even in that situation; 50K is way
too high. I'd recommend changing this back to the default: if you're not
at the point where you're running your own benchmarks to prove the
parameter is useful to you, it's not something you should try to adjust.

--
Greg Smith  2ndQuadrant  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com  www.2ndQuadrant.com
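[Concretely, the two changes Greg describes would look like this in postgresql.conf (the work_mem spill then shows up as "temporary file" lines in the server log):

```
# Log every temporary file, so sorts spilling past work_mem become visible:
log_temp_files = 0       # size threshold in kB; 0 = log all, -1 disables

# Back commit_delay off to its default:
commit_delay = 0         # microseconds; 0 is the default (no delay)
```

Both can be changed with a reload; no restart is required.]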
Thanks very much for your help so far.

> (It is pretty confusing that the HashAggregate reports ~6M rows, but
> the sort does 41M rows; maybe I cannot read this.) Anyway, I think
> that if You up the work_mem for this query to 512M, the sort will be
> in memory, and thus plenty faster.

Tried this (with work_mem = 2GB). It seems to make a difference, but not
enough: the query time is about halved (from 220 sec to 120 sec).

> Also, You say You are experiencing unstable query plans, and this may
> mean that geqo is kicking in (but Your query seems too simple for
> that, even considering the views involved). A quick way to check that
> would be to run explain <the query> a couple of tens of times, and
> check whether the plans change. If they do, try upping geqo_threshold.

It's not unstable from one run to the next; it's unstable from one day
to the next (more on this later).

> You have random_page_cost 4 times larger than seq_page_cost. You say
> You are on SSD, so there is no random access penalty. Try setting
> them equal.

Again, experimentally, they seem to be non-equal. I didn't benchmark
this, but the random-access tests done by TomsHardware et al. suggest a
factor-of-2.5 penalty for random access vs sequential. This is very much
better than rotational disks, but still significant.

> Your plan is full of merge-joins; some indices may be in order. Merge
> join is a kind of "last-chance" plan.

I think the fix here is going to be to do more work at write-time and
less at read-time, i.e. rather than having really complex views, we'll
generate some extra tables, and keep them synchronized with triggers.

Richard
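[A minimal sketch of that write-time approach. The summary table, column names, and function here are all hypothetical (nothing like du_report_sku_summary exists in the schema shown), and it assumes plpgsql is installed; it only illustrates the trigger-maintained-table pattern:

```sql
-- Hypothetical summary table, maintained at write time instead of
-- being recomputed by a complex view at read time.
CREATE TABLE du_report_sku_summary (
    wid integer PRIMARY KEY,
    qty bigint NOT NULL
);

CREATE OR REPLACE FUNCTION du_report_sku_sync() RETURNS trigger AS $$
BEGIN
    -- Try to fold the new row into the existing summary row...
    UPDATE du_report_sku_summary
       SET qty = qty + NEW.qty
     WHERE wid = NEW.wid;
    -- ...and create the summary row if it doesn't exist yet.
    IF NOT FOUND THEN
        INSERT INTO du_report_sku_summary (wid, qty)
        VALUES (NEW.wid, NEW.qty);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER du_report_sku_sync_trg
    AFTER INSERT ON du_report_sku
    FOR EACH ROW EXECUTE PROCEDURE du_report_sku_sync();
```

A real version would also need UPDATE/DELETE triggers, and under concurrent writers the update-then-insert needs retry handling, but the read-time query then becomes a trivial index lookup.]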
Justin Pitts wrote:
> I don't know if I would call it "terribly" ugly. It's not especially
> pretty, but it affords the needed degree of twiddling to get the job
> done. Relying on the clients is fine - if you can. I suspect the vast
> majority of DBAs would find that notion unthinkable. The usual result
> of a memory overrun is a server crash.

It's probably OK in this context: the multiple clients are all instances
of the same perl script, running particular, pre-defined queries. So we
can trust them not to ask a really memory-intensive query.

Besides which, if you can't trust the clients to ask sensible queries,
why can you trust them to set their own work_mem values?

Richard
Greg Smith wrote:
> Richard Neill wrote:
>> Am I missing something though, or is this project dormant, without
>> having released any files?
>
> My bad--gave you the wrong URL.
> http://git.postgresql.org/gitweb?p=pg_top.git;a=summary has the
> project I meant to point you toward.

Will try that out...

>> What I really want to know is, how far through that query has it got?
>> (For example, if the query is an update, then surely it knows how
>> many rows have been updated, and how many are yet to go.)
>
> I understand what you want. The reason you're not getting any
> suggestions is because that just isn't exposed in PostgreSQL yet.
> Clients ask for queries to be run, eventually they get rows of results
> back, but there's no notion of how many they're going to get in
> advance or how far along they are in executing the query's execution
> plan. There's a couple of academic projects that have started exposing
> more of the query internals, but I'm not aware of anyone who's even
> started moving in the direction of what you'd need to produce a
> progress bar.

Is there any internal table (similar to pg_stat_activity) I can look at?

Richard
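[For the record: on 8.4 the built-in views are per-backend rather than per-plan-node, so the most one can see is what a backend is running and whether it is blocked. A sketch combining pg_stat_activity with pg_locks (8.4 column names procpid / current_query):

```sql
-- Which backends are stuck waiting on an ungranted lock, and what
-- statement each one is running.
SELECT a.procpid,
       now() - a.query_start AS runtime,
       l.mode,
       l.granted,
       a.current_query
FROM pg_stat_activity a
JOIN pg_locks l ON l.pid = a.procpid
WHERE NOT l.granted;
```

If a slow query shows up here, the problem is lock contention rather than execution speed; otherwise, the only progress signal 8.4 offers is elapsed time.]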